LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers

Zhuoxu Huang; Zhiyou Zhao; Banghuai Li; Jungong Han

With its underlying attention mechanism and its ability to capture long-range dependencies, the Transformer is a natural choice for unordered point cloud data. However, the general sampling architecture separates local regions, which corrupts the structural information of the instances, and the inherent relationships between adjacent local regions remain underexplored, even though local structural information is crucial in a transformer-based 3D point cloud model. Therefore, in this paper, we propose a novel module named Local Context Propagation (LCP) that exploits message passing between neighboring local regions to make their representations more informative and discriminative. More specifically, we use the overlapping points of adjacent local regions (which are statistically shown to be prevalent) as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers. Inserting the LCP module between two transformer layers significantly improves network expressiveness. Finally, we design a flexible LCPFormer architecture equipped with the LCP module. The proposed method is applicable to different tasks and outperforms various transformer-based methods on benchmarks including 3D shape classification and dense prediction tasks such as 3D object detection and semantic segmentation. Code will be released for reproduction.
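
The idea of using shared points as intermediaries can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `propagate_local_context`, the array layout, and the choice of a softmax over feature norms as the re-weighting rule are all assumptions made for illustration, since the abstract does not specify how the weights are computed.

```python
import numpy as np

def propagate_local_context(group_idx, group_feats):
    """Fuse the per-region features of points shared by several local regions.

    group_idx:   (G, K) int array, global point index of each slot in each region.
    group_feats: (G, K, C) float array, features of those points inside each region.
    Returns a (G, K, C) array in which every copy of a shared point carries the
    same weighted mix of its per-region features.
    """
    G, K, C = group_feats.shape
    flat_idx = group_idx.reshape(-1)          # (G*K,)
    flat_feats = group_feats.reshape(-1, C)   # (G*K, C)
    n_points = int(flat_idx.max()) + 1

    # Assumed scoring rule (hypothetical): the L2 norm of each feature copy.
    scores = np.linalg.norm(flat_feats, axis=1)

    # Softmax over the copies of each point, stabilised with a per-point max.
    point_max = np.full(n_points, -np.inf)
    np.maximum.at(point_max, flat_idx, scores)
    weights = np.exp(scores - point_max[flat_idx])

    # Scatter-add the weights and the weighted features per global point.
    weight_sum = np.zeros(n_points)
    np.add.at(weight_sum, flat_idx, weights)
    feat_sum = np.zeros((n_points, C))
    np.add.at(feat_sum, flat_idx, weights[:, None] * flat_feats)

    # Points that belong to no region keep a zero vector.
    fused = feat_sum / np.maximum(weight_sum, 1e-12)[:, None]

    # Gather the fused feature back into every region slot.
    return fused[flat_idx].reshape(G, K, C)

# Two regions of three points each; point 2 lies in both regions.
group_idx = np.array([[0, 1, 2], [2, 3, 4]])
group_feats = np.random.default_rng(0).normal(size=(2, 3, 4))
out = propagate_local_context(group_idx, group_feats)
```

In this sketch a point that appears in only one region keeps its feature unchanged, while both copies of the shared point receive the same fused feature, so information from one region reaches the other before the next layer.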

updated: Sun Feb 19 2023 14:44:11 GMT+0000 (UTC)

published: Sun Oct 23 2022 15:43:01 GMT+0000 (UTC)

arXiv
