CNNs are Myopic

Vamshi C. Madala; Shivkumar Chandrasekaran

We claim that Convolutional Neural Networks (CNNs) learn to classify images using only small, seemingly unrecognizable tiles. We show experimentally that CNNs trained using only such tiles can match or even surpass the performance of CNNs trained on full images. Conversely, CNNs trained on full images make similar predictions when given only small tiles. We also propose the first a priori theoretical model for convolutional data sets that seems to explain this behavior. This lends additional support to the long-standing suspicion that CNNs do not need to understand the global structure of images to achieve state-of-the-art accuracies. Surprisingly, it also suggests that over-fitting is not needed either.
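To make the idea of training on tiles concrete, the following is a minimal sketch of how small tiles might be cropped from an image and paired with the label of the full image. It is not the authors' procedure: the function name `sample_tiles`, the 8x8 tile size, the uniform random placement, and the stand-in image and label are all assumptions chosen for illustration.

```python
import random

def sample_tiles(image, tile_size, num_tiles, rng):
    """Crop num_tiles random square tiles from an image given as a list of pixel rows."""
    height, width = len(image), len(image[0])
    if tile_size > height or tile_size > width:
        raise ValueError("tile_size must not exceed the image dimensions")
    tiles = []
    for _ in range(num_tiles):
        # Draw the top-left corner so that the tile lies fully inside the image.
        # random.Random.randint includes both endpoints.
        top = rng.randint(0, height - tile_size)
        left = rng.randint(0, width - tile_size)
        tiles.append([row[left:left + tile_size] for row in image[top:top + tile_size]])
    return tiles

rng = random.Random(0)
# Stand-in 32x32 grayscale image (assumption: real data would come from a dataset).
image = [[rng.random() for _ in range(32)] for _ in range(32)]
label = 3  # hypothetical class label of the full image

tiles = sample_tiles(image, tile_size=8, num_tiles=4, rng=rng)
# Each tile inherits the label of the image it was cropped from.
tile_dataset = [(tile, label) for tile in tiles]
```

A classifier trained on `tile_dataset` would only ever see local 8x8 patches, which is the kind of setting the abstract contrasts with training on full images.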

updated: Wed Jun 01 2022 04:05:26 GMT+0000 (UTC)

published: Sun May 22 2022 06:22:27 GMT+0000 (UTC)

arXiv

