Transformers in 3D Point Clouds: A Survey

Dening Lu; Qian Xie; Mingqiang Wei; Linlin Xu; Jonathan Li

In recent years, Transformer models have shown a remarkable ability to model long-range dependencies. They have achieved satisfactory results in both Natural Language Processing (NLP) and image processing. This success has sparked great interest among researchers in 3D point cloud processing in applying Transformers to various 3D tasks. Owing to their inherent permutation invariance and strong global feature learning ability, 3D Transformers are well suited to point cloud processing and analysis. They have achieved competitive or even better performance compared with state-of-the-art non-Transformer algorithms. This survey aims to provide a comprehensive overview of 3D Transformers designed for various tasks (e.g., point cloud classification, segmentation, and object detection). We start by introducing the fundamental components of the general Transformer and briefly describing its applications in the 2D and 3D fields. Then, we present three taxonomies for method classification (i.e., a Transformer implementation-based taxonomy, a data representation-based taxonomy, and a task-based taxonomy), which allow us to analyze the methods involved from multiple perspectives. Furthermore, we investigate variants of the 3D self-attention mechanism designed for performance improvement. To demonstrate the superiority of 3D Transformers, we compare the performance of Transformer-based algorithms on point cloud classification, segmentation, and object detection. Finally, we point out three potential future research directions, in the hope of providing useful references for the development of 3D Transformers.
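
The permutation property mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the survey or from any of the methods it covers. It assumes a single attention head, identity query/key/value projections, no positional encoding, and max pooling as the symmetric aggregation; the function names `self_attention` and `global_feature` are hypothetical. Under these assumptions, self-attention reorders its outputs together with its inputs, and pooling over points then gives a descriptor that does not depend on point order.

```python
import math
import random


def self_attention(points):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the raw
    point coordinates (identity projections, no positional encoding).
    """
    d = len(points[0])
    out = []
    for q in points:
        # Scaled dot-product score of this point against every point.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in points]
        # Numerically stable softmax over the scores.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Attention-weighted sum of the values.
        out.append([sum(w * v[j] for w, v in zip(weights, points)) / total
                    for j in range(d)])
    return out


def global_feature(points):
    """Max-pool the attended features over all points (a symmetric function)."""
    attended = self_attention(points)
    return [max(column) for column in zip(*attended)]


random.seed(0)
cloud = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(8)]
shuffled = cloud[:]
random.shuffle(shuffled)

f1 = global_feature(cloud)
f2 = global_feature(shuffled)

# Compare with a tolerance, since summation order changes with point order.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(f1, f2))
print("order-independent global feature:", same)
```

Max pooling is only one possible symmetric aggregation; mean or sum pooling would preserve the same property in this sketch.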

updated: Mon May 16 2022 01:32:18 GMT+0000 (UTC)

published: Mon May 16 2022 01:32:18 GMT+0000 (UTC)

arXiv
