Learning Convolutional Neural Networks in the Frequency Domain

Hengyue Pan; Yixin Chen; Xin Niu; Wenbo Zhou; Dongsheng Li

Convolutional neural networks (CNNs) have achieved impressive success in computer vision over the past few decades. The image convolution operation helps CNNs perform well on image-related tasks. However, image convolution has high computational complexity and is hard to implement. This paper proposes CEMNet, a network that can be trained in the frequency domain. The main motivation of this research is that, based on the Cross-Correlation Theorem, image convolution can be replaced in the frequency domain by a straightforward element-wise multiplication, which reduces the computational complexity. We further introduce a Weight Fixation mechanism to alleviate over-fitting, and analyze the behavior of Batch Normalization, Leaky ReLU, and Dropout in the frequency domain to design their counterparts for CEMNet. In addition, to deal with the complex-valued inputs produced by the Discrete Fourier Transform, we design a two-branch network structure for CEMNet. Experimental results indicate that CEMNet achieves good performance on the MNIST and CIFAR-10 datasets.
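The Cross-Correlation Theorem mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the CEMNet implementation: it assumes a single-channel real image, a kernel zero-padded to the image size, and circular (wrap-around) boundaries, and the function names are hypothetical. Under those assumptions, the DFT of the cross-correlation equals the complex conjugate of the kernel spectrum multiplied element-wise by the image spectrum.

```python
import numpy as np


def spatial_cross_correlation(image, kernel):
    """Direct circular cross-correlation in the spatial domain (slow reference)."""
    h, w = image.shape
    # Zero-pad the kernel to the image size (assumption: kernel fits in the image).
    padded = np.zeros((h, w), dtype=float)
    padded[: kernel.shape[0], : kernel.shape[1]] = kernel
    out = np.zeros((h, w), dtype=float)
    for dy in range(h):
        for dx in range(w):
            # shifted[y, x] == image[(y + dy) % h, (x + dx) % w]
            shifted = np.roll(image, shift=(-dy, -dx), axis=(0, 1))
            out[dy, dx] = np.sum(padded * shifted)
    return out


def frequency_cross_correlation(image, kernel):
    """Same result via element-wise multiplication of DFT spectra."""
    image_spectrum = np.fft.fft2(image)
    # The `s` argument zero-pads the kernel to the image size before the DFT.
    kernel_spectrum = np.fft.fft2(kernel, s=image.shape)
    # Cross-Correlation Theorem for a real kernel: conj(K) * F, element-wise.
    product = np.conj(kernel_spectrum) * image_spectrum
    return np.real(np.fft.ifft2(product))


rng = np.random.default_rng(0)
demo_image = rng.standard_normal((8, 8))
demo_kernel = rng.standard_normal((3, 3))

direct = spatial_cross_correlation(demo_image, demo_kernel)
via_fft = frequency_cross_correlation(demo_image, demo_kernel)
print("max abs difference:", np.max(np.abs(direct - via_fft)))
```

The two results agree up to floating-point error. The frequency-domain version replaces the nested spatial loops with one element-wise product between two FFT outputs, which is the source of the complexity reduction the abstract refers to. How CEMNet handles padding and boundaries is not stated in the abstract, so those choices are assumptions of this sketch.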

updated: Wed Apr 27 2022 02:16:16 GMT+0000 (UTC)

published: Thu Apr 14 2022 03:08:40 GMT+0000 (UTC)

arXiv
