Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph

Honghui Yang; Zili Liu; Xiaopei Wu; Wenxiao Wang; Wei Qian; Xiaofei He; Deng Cai

Two-stage detectors have gained much popularity in 3D object detection. Most two-stage 3D detectors use grid points, voxel grids, or sampled keypoints for RoI feature extraction in the second stage. Such methods, however, handle unevenly distributed, sparse outdoor points inefficiently. This paper addresses the problem in three ways. 1) Dynamic Point Aggregation. We propose patch search to quickly find the points in a local region for each 3D proposal. Dynamic farthest voxel sampling is then applied to sample the points evenly; in particular, the voxel size varies with distance to accommodate the uneven distribution of points. 2) RoI-graph Pooling. We build local graphs on the sampled points to better model contextual information and mine point relations through iterative message passing. 3) Visual Features Augmentation. We introduce a simple yet effective fusion strategy to compensate for sparse LiDAR points with limited semantic cues. Based on these modules, we construct Graph R-CNN as a second stage that can be applied to existing one-stage detectors to consistently improve detection performance. Extensive experiments show that Graph R-CNN outperforms state-of-the-art 3D detection models by a large margin on both KITTI and the Waymo Open Dataset. It also ranks first on the KITTI BEV car detection leaderboard. Code will be available at https://github.com/Nightmare-n/GraphRCNN.
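
To make the sampling idea in 1) more concrete, the following is a minimal NumPy sketch of distance-dependent voxelization followed by farthest point sampling. It is not the authors' implementation. The function names, the linear rule `base_size + growth * distance`, the use of the proposal center to measure distance, and the choice of the first point in each voxel as its representative are all assumptions made for illustration; the abstract states only that the voxel size varies with distance.

```python
import numpy as np


def dynamic_voxel_size(center, base_size=0.1, growth=0.02):
    """Hypothetical rule: the voxel edge grows linearly with the
    horizontal distance of the proposal center from the sensor origin."""
    distance = float(np.linalg.norm(np.asarray(center, dtype=float)[:2]))
    return base_size + growth * distance


def farthest_voxel_sampling(points, center, num_samples,
                            base_size=0.1, growth=0.02):
    """Return indices into `points` (an (N, 3) array) of sampled points.

    Step 1: voxelize with a distance-dependent size and keep one
            representative point per occupied voxel.
    Step 2: run plain farthest point sampling on the representatives.
    """
    points = np.asarray(points, dtype=float)
    size = dynamic_voxel_size(center, base_size, growth)

    # Integer voxel coordinates of every point.
    coords = np.floor(points / size).astype(np.int64)

    # First point that falls into each occupied voxel (assumed representative).
    _, keep = np.unique(coords, axis=0, return_index=True)
    keep = np.sort(keep)
    candidates = points[keep]

    n = len(candidates)
    k = min(num_samples, n)
    chosen = np.zeros(k, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to the chosen set
    current = 0                    # start from the first representative
    for i in range(k):
        chosen[i] = current
        d = np.sum((candidates - candidates[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        current = int(np.argmax(min_dist))
    return keep[chosen]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.uniform(-2.0, 2.0, size=(500, 3)) + np.array([30.0, 0.0, 0.0])
    idx = farthest_voxel_sampling(cloud, center=[30.0, 0.0, 0.0], num_samples=64)
    print(len(idx), "points sampled from", len(cloud))
```

Voxelizing before sampling reduces the number of candidates that the quadratic-cost farthest point sampling loop has to visit, and a larger voxel for distant proposals keeps sparse far-range regions from being split into many near-empty cells. How the actual method chooses voxel sizes and representatives should be checked against the paper and the released code.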

updated: Sun Aug 07 2022 02:56:56 GMT+0000 (UTC)

published: Sun Aug 07 2022 02:56:56 GMT+0000 (UTC)

arXiv

