ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron; Piotr Bojanowski; Mathilde Caron; Matthieu Cord; Alaaeldin El-Nouby; Edouard Grave; Armand Joulin; Gabriel Synnaeve; Jakob Verbeek; Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
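The two alternating sublayers described in the abstract can be illustrated with a minimal NumPy sketch of one residual block. This is not the authors' code: the function names, the weight initialization, the hidden width, and the GELU activation are all assumptions made for illustration, and any normalization the released model may use is omitted. The input is taken to be a matrix of shape (num_patches, dim), one row per image patch.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU. The choice of activation is an assumption;
    # the abstract only says "two-layer feed-forward network".
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def init_block(num_patches, dim, hidden_dim, rng):
    # Hypothetical parameter container; small random weights, zero biases.
    return {
        "w_patch": rng.normal(0.0, 0.02, (num_patches, num_patches)),
        "b_patch": np.zeros((num_patches, 1)),
        "w1": rng.normal(0.0, 0.02, (dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.02, (hidden_dim, dim)),
        "b2": np.zeros(dim),
    }


def cross_patch_sublayer(x, p):
    # (i) Linear layer in which patches interact. The same
    # (num_patches x num_patches) matrix is applied to every channel column,
    # so channels are treated independently and identically.
    return x + p["w_patch"] @ x + p["b_patch"]


def cross_channel_sublayer(x, p):
    # (ii) Two-layer feed-forward network in which channels interact.
    # It acts on each row (patch) separately, with shared weights.
    hidden = gelu(x @ p["w1"] + p["b1"])
    return x + hidden @ p["w2"] + p["b2"]


def block_forward(x, p):
    # One residual block: sublayer (i) followed by sublayer (ii).
    return cross_channel_sublayer(cross_patch_sublayer(x, p), p)
```

Stacking several such blocks, preceded by a patch embedding and followed by pooling and a linear classifier, would give a network of the kind the abstract describes; those surrounding pieces are left out here because the abstract does not specify them.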

updated: Fri May 07 2021 17:31:44 GMT+0000 (UTC)

published: Fri May 07 2021 17:31:44 GMT+0000 (UTC)

arXiv
