ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers

Ioannis Romanelis; Vlassis Fotis; Konstantinos Moustakas; Adrian Munteanu

In this paper, we examine the properties that transformers acquire through self-supervision in the point cloud domain. Specifically, we evaluate the effectiveness of Masked Autoencoding as a pretraining scheme and explore Momentum Contrast as an alternative. We investigate the impact of data quantity on the learned features and uncover similarities in the transformer's behavior across domains. Through comprehensive visualizations, we observe that the transformer learns to attend to semantically meaningful regions, indicating that pretraining leads to a better understanding of the underlying geometry. Moreover, we examine the finetuning process and its effect on the learned representations. Based on these findings, we devise an unfreezing strategy that consistently outperforms our baseline without introducing any other modifications to the model or the training pipeline, and we achieve state-of-the-art results on the classification task among transformer models.
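As a rough illustration of the two pretraining schemes named in the abstract, the sketch below shows the generic ideas behind them: randomly masking a fraction of point patches (Masked Autoencoding) and updating a key encoder as an exponential moving average of the query encoder (Momentum Contrast). It is not the authors' implementation. The function names, the mask ratio of 0.5, and the momentum value of 0.999 are assumptions chosen for illustration, and parameters are plain Python floats rather than real network weights.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=0):
    """Return a list of booleans, True where a patch is masked.

    Hypothetical helper: masks a fixed fraction of patches uniformly at random,
    which is the generic masked-autoencoding recipe, not the paper's exact one.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def momentum_update(query_params, key_params, m=0.999):
    """Return key-encoder parameters after one momentum (EMA) step.

    Implements k <- m * k + (1 - m) * q, the standard Momentum Contrast update.
    The default m is an assumed value, not taken from the source.
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


# Example usage with toy values (all numbers are illustrative assumptions).
mask = random_patch_mask(num_patches=64, mask_ratio=0.5)
visible_indices = [i for i, is_masked in enumerate(mask) if not is_masked]
key_params = momentum_update([1.0, 1.0], [0.0, 0.0], m=0.9)
```

In a masked autoencoder only the visible patches would be passed to the encoder, with a decoder reconstructing the masked ones. In Momentum Contrast the key encoder receives no gradients and changes only through the momentum step.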

updated: Wed Apr 10 2024 11:42:22 GMT+0000 (UTC)

published: Mon Jun 19 2023 09:38:21 GMT+0000 (UTC)

arXiv
