DCT Perceptron Layer: A Transform Domain Approach for Convolution Layer

Hongyi Pan; Xin Zhu; Salih Atici; Ahmet Enis Cetin

In this paper, we propose a novel Discrete Cosine Transform (DCT)-based neural network layer, which we call the DCT-perceptron, to replace the 3×3 Conv2D layers in the Residual Neural Network (ResNet). Convolutional filtering operations are performed in the DCT domain using element-wise multiplications by taking advantage of the Fourier and DCT convolution theorems. A trainable soft-thresholding layer is used as the nonlinearity in the DCT-perceptron. Whereas ResNet's Conv2D layer is spatial-agnostic and channel-specific, the proposed layer is both location-specific and channel-specific. The DCT-perceptron layer significantly reduces the number of parameters and multiplications while maintaining accuracy comparable to that of regular ResNets on CIFAR-10 and ImageNet-1K. Moreover, the DCT-perceptron layer, together with a batch normalization layer, can be inserted before the global average pooling layer of a conventional ResNet as an additional layer to improve classification accuracy.
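The pipeline named in the abstract (transform, element-wise multiplication, soft-thresholding, inverse transform) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the per-coefficient threshold shape, and the use of fixed arrays in place of trainable parameters are all assumptions made for illustration, and any component of the layer not mentioned in the abstract is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # spatial index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # rescale the DC row for orthonormality
    return c

def soft_threshold(x, t):
    """Shrink each entry toward zero by t; entries with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dct_perceptron_sketch(x, weights, thresholds):
    """Hypothetical forward pass; all arguments have shape (channels, n, n).

    `weights` and `thresholds` would be trainable in a real layer; here they
    are plain arrays (an assumption of this sketch).
    """
    c = dct_matrix(x.shape[-1])
    coeffs = c @ x @ c.T                          # 2D DCT of each channel
    scaled = coeffs * weights                     # one weight per location and channel
    shrunk = soft_threshold(scaled, thresholds)   # nonlinearity in the DCT domain
    return c.T @ shrunk @ c                       # inverse 2D DCT

# With unit weights and zero thresholds the sketch reduces to the identity.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
y = dct_perceptron_sketch(x, np.ones_like(x), np.zeros_like(x))
assert np.allclose(x, y)
```

Because each DCT coefficient of each channel has its own weight, the sketch is location-specific and channel-specific in the sense the abstract describes, in contrast to a Conv2D kernel that is shared across all spatial positions.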

updated: Tue Nov 15 2022 23:44:56 GMT+0000 (UTC)

published: Tue Nov 15 2022 23:44:56 GMT+0000 (UTC)

arXiv
