Exploring Data Augmentation for Multi-Modality 3D Object Detection

Wenwei Zhang; Zhe Wang; Chen Change Loy

It is counter-intuitive that multi-modality methods based on point clouds and images perform only marginally better, and sometimes worse, than approaches that use point clouds alone. This paper investigates the reason behind this phenomenon. Because multi-modality data augmentation must maintain consistency between the point cloud and the images, recent methods in this field typically use relatively insufficient data augmentation, which leaves their performance below expectations. We therefore contribute a pipeline, named transformation flow, that bridges the gap between single-modality and multi-modality data augmentation through transformation reversing and replaying. In addition, because of occlusion, a point may be occupied by different objects in different modalities, which makes augmentations such as cut and paste non-trivial for multi-modality detection. We further present Multi-mOdality Cut and pAste (MoCa), which simultaneously considers occlusion and physical plausibility to maintain multi-modality consistency. Without using an ensemble of detectors, our multi-modality detector achieves new state-of-the-art performance on the nuScenes dataset and competitive performance on the KITTI 3D benchmark. Our method also won the best PKL award in the 3rd nuScenes detection challenge. Code and models will be released at https://github.com/open-mmlab/mmdetection3d.
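The reversing-and-replaying idea behind transformation flow can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of three transforms (flip, rotation about the z axis, uniform scaling), and the record format are all assumptions made for illustration. The sketch records each augmentation applied to the point cloud, replays the record to reproduce the augmented points, and reverses it to recover the original coordinates, which is presumably where a fixed camera calibration could be used to find the matching image pixels.

```python
import math

# Hypothetical helpers (not from mmdetection3d): each transform maps a list
# of (x, y, z) points to a new list and takes one recorded parameter.
def rotate_z(points, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def scale(points, factor):
    return [(x * factor, y * factor, z * factor) for x, y, z in points]

def flip_y(points, enabled):
    return [(x, -y, z) for x, y, z in points] if enabled else list(points)

FORWARD = {"rotate_z": rotate_z, "scale": scale, "flip_y": flip_y}
INVERSE = {
    "rotate_z": lambda pts, a: rotate_z(pts, -a),
    "scale": lambda pts, f: scale(pts, 1.0 / f),
    "flip_y": flip_y,  # a flip is its own inverse
}

def replay(points, flow):
    """Apply the recorded transforms in their original order."""
    for name, param in flow:
        points = FORWARD[name](points, param)
    return points

def reverse(points, flow):
    """Undo the recorded transforms by applying inverses in reverse order."""
    for name, param in reversed(flow):
        points = INVERSE[name](points, param)
    return points

# Assumed example record: the order and parameters are arbitrary.
flow = [("flip_y", True), ("rotate_z", math.pi / 6), ("scale", 1.05)]
original = [(10.0, 2.0, -1.0), (4.0, -3.0, 0.5)]

augmented = replay(original, flow)    # points as the detector would see them
recovered = reverse(augmented, flow)  # points back in the original frame
```

Storing the record as plain data rather than as a composed matrix keeps each step individually invertible, so non-linear or conditional augmentations could in principle be added to the same record, provided each one has a known inverse.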

updated: Wed Apr 21 2021 16:23:20 GMT+0000 (UTC)

published: Wed Dec 23 2020 15:23:16 GMT+0000 (UTC)

arXiv
