Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

Shaofei Huang; Zhenwei Shen; Zehao Huang; Zihan Ding; Jiao Dai; Jizhong Han; Naiyan Wang; Si Liu

Monocular 3D lane detection is challenging because depth information is missing. A popular approach first transforms front-view (FV) images or features into bird's-eye-view (BEV) space with inverse perspective mapping (IPM) and then detects lanes from the BEV features. However, IPM relies on a flat-ground assumption and loses context information, which makes recovering 3D information from BEV representations inaccurate. One attempt has been made to dispense with BEV and predict 3D lanes directly from FV representations, but it still underperforms BEV-based methods because it lacks a structured representation for 3D lanes. In this paper, we define 3D lane anchors in 3D space and propose a BEV-free method named Anchor3DLane to predict 3D lanes directly from FV representations. The 3D lane anchors are projected onto the FV features to extract anchor features, which contain both structural and context information, for accurate prediction. We further extend Anchor3DLane to the multi-frame setting to incorporate temporal information and improve performance. We also develop a global optimization method that exploits the equal-width property between lanes to reduce the lateral error of predictions. Extensive experiments on three popular 3D lane detection benchmarks show that Anchor3DLane outperforms previous BEV-based methods and achieves state-of-the-art performance.
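
The idea of projecting 3D lane anchors onto FV features and sampling features there can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the pinhole camera model, the camera-frame convention (x right, y down, z forward), and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np

def project_anchor(points_xyz, P):
    """Project N x 3 camera-frame points to pixel (u, v) with a 3 x 4 matrix P.

    Hypothetical helper; assumes every point lies in front of the camera (z > 0).
    """
    ones = np.ones((points_xyz.shape[0], 1))
    homo = np.hstack([points_xyz, ones])      # N x 4 homogeneous points
    uvw = homo @ P.T                          # N x 3
    return uvw[:, :2] / uvw[:, 2:3]           # perspective divide

def sample_features(feat, uv):
    """Bilinearly sample a C x H x W feature map at N x 2 (u, v) locations.

    Hypothetical helper; locations outside the map are clamped to its border.
    """
    _, H, W = feat.shape
    u = np.clip(uv[:, 0], 0, W - 1)
    v = np.clip(uv[:, 1], 0, H - 1)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    u1, v1 = np.minimum(u0 + 1, W - 1), np.minimum(v0 + 1, H - 1)
    du, dv = u - u0, v - v0
    top = feat[:, v0, u0] * (1 - du) + feat[:, v0, u1] * du
    bot = feat[:, v1, u0] * (1 - du) + feat[:, v1, u1] * du
    return (top * (1 - dv) + bot * dv).T      # N x C

# Assumed intrinsics for a 48 x 64 feature map; extrinsics are the identity.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

# A straight anchor on a ground plane 1.5 m below the camera, 10 m to 50 m ahead.
z = np.linspace(10.0, 50.0, 5)
anchor = np.stack([np.zeros_like(z), np.full_like(z, 1.5), z], axis=1)

feat = np.random.default_rng(0).standard_normal((8, 48, 64))
anchor_feats = sample_features(feat, project_anchor(anchor, P))  # 5 x 8
```

In this sketch each anchor point yields one feature vector, so an anchor with N sample points produces an N x C array that a regression head could consume; how the paper actually aggregates these features is not stated in the abstract.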

updated: Fri Jan 06 2023 04:35:31 GMT+0000 (UTC)

published: Fri Jan 06 2023 04:35:31 GMT+0000 (UTC)

arXiv
