OneDConv: Generalized Convolution For Transform-Invariant Representation

Tong Zhang; Haohan Weng; Ke Yi; C. L. Philip Chen

Convolutional Neural Networks (CNNs) have shown great power in a variety of vision tasks. However, their lack of transform invariance limits further application in complicated real-world scenarios. In this work, we propose a novel generalized one-dimensional convolutional operator (OneDConv), which dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner. The proposed operator extracts transform-invariant features naturally, improving the robustness and generalization of convolution without sacrificing performance on common images. Because OneDConv can substitute for vanilla convolution, it can be readily incorporated into current popular convolutional architectures and trained end-to-end. On several popular benchmarks, OneDConv outperforms the original convolution operation and other proposed models on both canonical and distorted images.

updated: Sat Jan 15 2022 07:44:44 GMT+0000 (UTC)

published: Sat Jan 15 2022 07:44:44 GMT+0000 (UTC)

arXiv
