Reversible Column Networks

Yuxuan Cai; Yizhuang Zhou; Qi Han; Jianjian Sun; Xiangwen Kong; Jun Li; Xiangyu Zhang

We propose a new neural network design paradigm, the Reversible Column Network (RevCol). The main body of RevCol is composed of multiple copies of a subnetwork, each called a column, with multi-level reversible connections between them. This architectural scheme gives RevCol very different behavior from conventional networks: during forward propagation, features are gradually disentangled as they pass through each column, and their total information is preserved rather than compressed or discarded as in other networks. Our experiments suggest that CNN-style RevCol models achieve very competitive performance on multiple computer vision tasks such as image classification, object detection, and semantic segmentation, especially with large parameter budgets and large datasets. For example, after ImageNet-22K pre-training, RevCol-XL obtains 88.2% ImageNet-1K accuracy. Given more pre-training data, our largest model, RevCol-H, reaches 90.0% on ImageNet-1K, 63.8% APbox on the COCO detection minival set, and 61.0% mIoU on ADE20k segmentation. To our knowledge, these are the best COCO detection and ADE20k segmentation results among pure (static) CNN models. Moreover, as a general macro-architecture design, RevCol can also be introduced into transformers or other neural networks, where it is shown to improve performance on both computer vision and NLP tasks. We release code and models at https://github.com/megvii-research/RevCol.
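The abstract does not give the exact form of the reversible connections, so the following is only a minimal sketch of the general idea that a reversible mapping keeps total information: the input can be recovered exactly from the output. It uses a generic additive coupling in the style of earlier reversible networks, not RevCol's own multi-level formulation. The functions `f` and `g`, the two-way feature split, and the sample values are all hypothetical stand-ins for learned blocks and real features.

```python
import math

def f(v):
    # Hypothetical stand-in for a learned block; it does not need to be invertible.
    return [math.tanh(2.0 * a + 0.5) for a in v]

def g(v):
    # Second hypothetical block, also arbitrary and non-invertible.
    return [0.3 * a * a - 1.0 for a in v]

def forward(x1, x2):
    # Additive coupling: each half is updated using only the other half.
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the two updates in reverse order to recover the inputs.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

# Illustrative feature halves (made-up values).
x1, x2 = [0.5, -1.25, 3.0], [2.0, 0.0, -0.75]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)

# Largest reconstruction error across all entries; nonzero only due to float rounding.
max_err = max(abs(a - b) for a, b in zip(x1 + x2, r1 + r2))
print(max_err)
```

Because the inputs can be rebuilt from the outputs, nothing is discarded by the mapping even though `f` and `g` themselves are lossy. The same property is what allows reversible architectures in general to recompute intermediate activations during the backward pass instead of storing them.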

updated: Thu Dec 22 2022 13:37:59 GMT+0000 (UTC)

published: Thu Dec 22 2022 13:37:59 GMT+0000 (UTC)

arXiv
