Can CNNs Be More Robust Than Transformers?

Zeyu Wang; Yutong Bai; Yuyin Zhou; Cihang Xie

The recent success of Vision Transformers is challenging the decade-long dominance of Convolutional Neural Networks (CNNs) in image recognition. In particular, regarding robustness on out-of-distribution samples, recent research finds that Transformers are inherently more robust than CNNs, regardless of the training setup. Moreover, this superiority is widely believed to stem largely from the self-attention-like architecture itself. In this paper, we question that belief by closely examining the design of Transformers. Our findings lead to three architecture designs that are highly effective at boosting robustness yet simple enough to implement in a few lines of code: a) patchifying input images, b) enlarging the kernel size, and c) reducing the number of activation layers and normalization layers. Combining these components, we build pure CNN architectures, without any attention-like operations, that are as robust as, or even more robust than, Transformers. We hope this work helps the community better understand the design of robust neural architectures. The code is publicly available at https://github.com/UCSC-VLAA/RobustCNN.
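To make the first two designs in the abstract concrete, here is a minimal, framework-free sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). The function names, the 224-pixel input, the patch size of 16, the kernel sizes 3 and 11, and the use of a depthwise convolution for the parameter count are all illustrative assumptions that the abstract does not specify.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def patchify(image, patch):
    """Split a 2-D image (a list of rows) into non-overlapping patch x patch tiles.

    This mirrors what a convolution with kernel size == stride == patch sees:
    each output position reads exactly one tile, with no overlap.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def depthwise_conv_params(channels, kernel):
    """Weight count of a depthwise convolution (one kernel x kernel filter per
    channel, no bias). Depthwise is an assumption made for this illustration."""
    return channels * kernel * kernel


# Hypothetical 4x4 single-channel "image" with values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = patchify(image, patch=2)
print(len(tiles))   # 4 tiles
print(tiles[0])     # [[0, 1], [4, 5]]

# A patchify stem (kernel 16, stride 16) downsamples 224 -> 14 in one step.
print(conv_output_size(224, kernel=16, stride=16))  # 14

# Enlarging an assumed depthwise kernel from 3x3 to 11x11 on 96 channels.
print(depthwise_conv_params(96, 3), depthwise_conv_params(96, 11))  # 864 11616
```

The third design, reducing activation and normalization layers, is a matter of removing layers from each block rather than of arithmetic, so it is not sketched here.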

updated: Mon Mar 06 2023 05:51:33 GMT+0000 (UTC)

published: Tue Jun 07 2022 17:17:07 GMT+0000 (UTC)

arXiv
