InstaGraM: Instance-level Graph Modeling for Vectorized HD Map Learning

Juyeb Shin; Francois Rameau; Hyeonjun Jeong; Dongsuk Kum

The construction of lightweight high-definition (HD) maps containing geometric and semantic information is of foremost importance for the large-scale deployment of autonomous driving. To automatically generate this type of map from a set of images captured by a vehicle, most works formulate the mapping as a segmentation problem, which requires heavy post-processing to obtain the final vectorized representation. Alternative techniques can generate an HD map end to end but rely on computationally expensive auto-regressive models. To bring camera-based HD map construction to a practical level, we propose InstaGraM, a fast end-to-end network that generates a vectorized HD map via instance-level graph modeling of the map elements. Our strategy consists of three main stages: top-view feature extraction, detection of the vertices and edges of road elements, and conversion to a semantic vector representation. After top-down feature extraction, an encoder-decoder architecture predicts a set of vertex and edge maps of the road elements. Finally, these vertex and edge maps are associated through an attentional graph neural network that generates a semantic vectorized map. Instead of relying on a common segmentation approach, we propose to regress distance transform maps, as they provide strong spatial relations and directional information between vertices. Comprehensive experiments on the nuScenes dataset show that our proposed network outperforms HDMapNet by 13.7 mAP and achieves accuracy comparable to VectorMapNet with 5x faster inference speed.
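
To make the idea of a distance transform map concrete, the following is a minimal sketch of how a regression target of this kind could be built from a set of vertex positions on a top-view grid. It is an illustration only, not the authors' implementation: the function name `distance_transform`, the `(row, col)` input format, and the truncation threshold `max_dist` are all assumptions introduced here.

```python
import math


def distance_transform(vertices, height, width, max_dist=3.0):
    """Truncated Euclidean distance from every grid cell to its nearest vertex.

    vertices: list of (row, col) grid coordinates (hypothetical input format).
    max_dist: truncation threshold; an assumed value, not taken from the paper.
    """
    # Start every cell at the truncation value, then lower it per vertex.
    grid = [[max_dist] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for vr, vc in vertices:
                d = math.hypot(r - vr, c - vc)
                if d < grid[r][c]:
                    grid[r][c] = d
    return grid


# Two example vertices on a 5 x 6 top-view grid.
dt = distance_transform([(1, 1), (3, 4)], height=5, width=6)
for row in dt:
    print(" ".join(f"{v:.2f}" for v in row))
```

Unlike a binary segmentation mask, each cell in such a map carries a graded value that shrinks toward the nearest vertex, which is one way to read the abstract's claim about spatial relations between vertices. The brute-force loop is chosen for clarity; a practical pipeline would likely use an optimized distance transform routine.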

updated: Tue Jan 10 2023 08:15:35 GMT+0000 (UTC)

published: Tue Jan 10 2023 08:15:35 GMT+0000 (UTC)

arXiv
