VN-Transformer: Rotation-Equivariant Attention for Vector Neurons

Serge Assaad; Carlton Downey; Rami Al-Rfou; Nigamaa Nayakanti; Ben Sapp

Rotation equivariance is a desirable property in many practical applications such as motion forecasting and 3D perception, where it can offer benefits like sample efficiency, better generalization, and robustness to input perturbations. Vector Neurons (VN) is a recently developed framework offering a simple yet effective approach for deriving rotation-equivariant analogs of standard machine learning operations by extending one-dimensional scalar neurons to three-dimensional "vector neurons." We introduce a novel "VN-Transformer" architecture to address several shortcomings of the current VN models. Our contributions are: (i) we derive a rotation-equivariant attention mechanism which eliminates the need for the heavy feature preprocessing required by the original Vector Neurons models; (ii) we extend the VN framework to support non-spatial attributes, expanding the applicability of these models to real-world datasets; (iii) we derive a rotation-equivariant mechanism for multi-scale reduction of point-cloud resolution, greatly speeding up inference and training; (iv) we show that small tradeoffs in equivariance (ϵ-approximate equivariance) can be used to obtain large improvements in numerical stability and training robustness on accelerated hardware, and we bound the propagation of equivariance violations in our models. Finally, we apply our VN-Transformer to 3D shape classification and motion forecasting with compelling results.
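The equivariance property described in the abstract can be checked numerically on a toy layer. The sketch below is illustrative only and is not taken from the paper: it assumes each token is a C x 3 matrix of "vector neurons", that a VN-style linear layer mixes channels by left-multiplication, and that attention logits are Frobenius inner products between query and key matrices, which are unchanged when both are rotated. All function names (`vn_linear`, `vn_attention`, `equivariance_error`) and the scaling factor are hypothetical choices made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rotation_z(theta):
    """3x3 rotation about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_x(theta):
    """3x3 rotation about the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def vn_linear(weights, features):
    """Mix channels of a (C x 3) feature: (C' x C) @ (C x 3). No bias term."""
    return matmul(weights, features)


def frobenius(a, b):
    """Frobenius inner product of two equally shaped matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def vn_attention(queries, keys, values):
    """Attention over tokens that are each a (C x 3) matrix.

    Logits are Frobenius inner products (assumed here, rotation-invariant),
    so the softmax weights do not change when every token is rotated.
    """
    channels = len(queries[0])
    scale = math.sqrt(3 * channels)  # illustrative scaling, an assumption
    outputs = []
    for q in queries:
        logits = [frobenius(q, k) / scale for k in keys]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of value matrices, entry by entry.
        out = [
            [sum(w * v[c][d] for w, v in zip(weights, values)) for d in range(3)]
            for c in range(channels)
        ]
        outputs.append(out)
    return outputs


def attention_layer(tokens, wq, wk, wv):
    """Project tokens to queries, keys, values, then attend."""
    q = [vn_linear(wq, t) for t in tokens]
    k = [vn_linear(wk, t) for t in tokens]
    v = [vn_linear(wv, t) for t in tokens]
    return vn_attention(q, k, v)


def equivariance_error(seed=0, num_tokens=5, channels=4):
    """Largest entrywise gap between f(X R) and f(X) R for a random input."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    tokens = [rand(channels, 3) for _ in range(num_tokens)]
    wq, wk, wv = rand(channels, channels), rand(channels, channels), rand(channels, channels)
    rot = matmul(rotation_z(0.7), rotation_x(-1.2))

    rotate_first = attention_layer([matmul(t, rot) for t in tokens], wq, wk, wv)
    rotate_last = [matmul(o, rot) for o in attention_layer(tokens, wq, wk, wv)]
    return max(
        abs(x - y)
        for a, b in zip(rotate_first, rotate_last)
        for ra, rb in zip(a, b)
        for x, y in zip(ra, rb)
    )


if __name__ == "__main__":
    print(f"max |f(XR) - f(X)R| = {equivariance_error():.2e}")
```

In this toy setting the gap should sit at floating-point rounding level, because rotating the input rotates queries, keys, and values alike while leaving the attention weights untouched. Adding a bias to `vn_linear` would break that exactly, which is the kind of small, controlled violation the abstract refers to as ϵ-approximate equivariance.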

updated: Wed Jun 08 2022 21:48:47 GMT+0000 (UTC)

published: Wed Jun 08 2022 21:48:47 GMT+0000 (UTC)

arXiv
