Learning Invariant Representations for Equivariant Neural Networks Using Orthogonal Moments

Jaspreet Singh; Chandan Singh

The convolutional layers of standard convolutional neural networks (CNNs) are equivariant to translation. However, the convolutional and fully-connected layers are neither equivariant nor invariant to other affine geometric transformations. Recently, a new class of CNNs has been proposed in which the conventional layers are replaced with equivariant convolution, pooling, and batch-normalization layers. The final classification layer in equivariant neural networks is invariant to affine geometric transformations such as rotation, reflection, and translation, and the scalar value is obtained either by eliminating the spatial dimensions of the filter responses through convolution and down-sampling throughout the network, or by averaging over the filter responses. In this work, we propose to integrate orthogonal moments, which give the high-order statistics of a function, as an effective means of encoding global invariance with respect to rotation, reflection, and translation in the fully-connected layers. As a result, the intermediate layers of the network are equivariant while the classification layer is invariant. The most widely used Zernike, pseudo-Zernike, and orthogonal Fourier-Mellin moments are considered for this purpose. The effectiveness of the proposed approach is evaluated by integrating the invariant transition and fully-connected layers into the architecture of group-equivariant CNNs (G-CNNs) on the rotated MNIST and CIFAR10 datasets.
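
To illustrate why orthogonal moments are a natural source of rotation-invariant features, the following is a minimal sketch in plain Python, not the layer described in the paper. It computes discrete Zernike moments of a small image mapped onto the unit disk and compares them before and after a 90-degree rotation. The function names (`radial_poly`, `zernike_moment`, `rot90`), the 16x16 test image, and the choice of moment orders are all assumptions made for illustration. A 90-degree rotation is used because it maps the pixel grid onto itself, so the magnitudes should agree up to floating-point rounding; for arbitrary angles, interpolation would introduce some approximation error.

```python
import cmath
import math


def radial_poly(n, m, r):
    """Zernike radial polynomial R_n^|m|(r); needs n - |m| even and >= 0."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * r ** (n - 2 * s)
    return total


def zernike_moment(image, n, m):
    """Discrete Zernike moment Z_nm of a square image mapped to the unit disk."""
    size = len(image)
    step = 2.0 / size  # pixel width once the image spans [-1, 1]
    acc = 0j
    for row in range(size):
        for col in range(size):
            # Pixel centres are placed symmetrically about the origin.
            x = (col + 0.5) * step - 1.0
            y = (row + 0.5) * step - 1.0
            r = math.hypot(x, y)
            if r > 1.0:
                continue  # ignore pixels outside the unit disk
            theta = math.atan2(y, x)
            acc += image[row][col] * radial_poly(n, m, r) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc * step * step


def rot90(image):
    """Rotate a square image by 90 degrees about its centre."""
    size = len(image)
    return [[image[col][size - 1 - row] for col in range(size)]
            for row in range(size)]


# Hypothetical test image: a small off-centre bright block.
SIZE = 16
image = [[0.0] * SIZE for _ in range(SIZE)]
for row in range(4, 7):
    for col in range(9, 12):
        image[row][col] = 1.0
rotated = rot90(image)

for n, m in [(2, 0), (2, 2), (3, 1), (4, 2)]:
    z = zernike_moment(image, n, m)
    z_rot = zernike_moment(rotated, n, m)
    # The complex values differ by a phase factor; the magnitudes should match.
    print(n, m, abs(z), abs(z_rot))
```

The complex moment itself changes under rotation (for m != 0 it picks up a phase factor), whereas its magnitude does not. This is the property that makes moment magnitudes candidates for invariant features ahead of a fully-connected layer.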

updated: Thu Sep 22 2022 11:48:39 GMT+0000 (UTC)

published: Thu Sep 22 2022 11:48:39 GMT+0000 (UTC)

arXiv
