Depth- and Semantics-aware Multi-modal Domain Translation: Generating 3D Panoramic Color Images from LiDAR Point Clouds

Tiago Cortinhal; Eren Erdal Aksoy

This work presents a new depth- and semantics-aware conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors. The proposed model uses scene semantics as a mid-level representation and translates raw LiDAR point clouds to RGB-D camera images by relying solely on semantic scene segments. We claim that this is the first framework of its kind and that it has practical applications in autonomous vehicles, such as providing a fail-safe mechanism and augmenting the available data in the target image domain. The proposed model is evaluated on the large-scale and challenging Semantic-KITTI dataset, and experimental findings show that it considerably outperforms the original TITAN-Net and other strong baselines, by a margin of 23.7% in terms of IoU.
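The abstract reports its headline result in terms of IoU (intersection over union) but does not describe the evaluation protocol. As a rough illustration only, the sketch below computes the textbook mean IoU between two per-pixel label maps. The function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both maps are assumptions made for this example, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two flat sequences of class ids.

    Illustrative only: classes that appear in neither sequence are skipped,
    which is one common convention and an assumption here.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    inter = [0] * num_classes  # pixels where both maps agree on class c
    union = [0] * num_classes  # pixels where either map says class c

    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x2 label maps, flattened row by row (hypothetical data).
predicted = [0, 0, 1, 1]
reference = [0, 1, 1, 1]
score = mean_iou(predicted, reference, num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12. The abstract does not say whether the 23.7% margin is absolute or relative.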

updated: Thu Nov 16 2023 09:14:36 GMT+0000 (UTC)

published: Wed Feb 15 2023 13:48:10 GMT+0000 (UTC)

arXiv
