Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies

Hannah Pinson; Joeri Lenaerts; Vincent Ginis

Our theoretical understanding of the inner workings of general convolutional neural networks (CNNs) is limited. Here we present a new stepping stone towards such understanding in the form of a theory of learning in linear CNNs. By analyzing the gradient descent equations, we discover that using convolutions leads to a mismatch between the dataset structure and the network structure. We show that linear CNNs discover the statistical structure of the dataset with non-linear, stage-like transitions, and that the speed of discovery changes depending on this structural mismatch. Moreover, we find that the mismatch lies at the heart of what we call the 'dominant frequency bias', where linear CNNs arrive at these discoveries using only the dominant frequencies of the different structural parts present in the dataset. Our findings can help explain several characteristics of general CNNs, such as their shortcut learning and their tendency to rely on texture instead of shape.

updated: Fri Mar 03 2023 15:52:06 GMT+0000 (UTC)

published: Fri Mar 03 2023 15:52:06 GMT+0000 (UTC)

arXiv
