Automated Progressive Learning for Efficient Training of Vision Transformers

Changlin Li; Bohan Zhuang; Guangrun Wang; Xiaodan Liang; Xiaojun Chang; Yi Yang

Recent advances in vision Transformers (ViTs) have come with a voracious appetite for computing power, highlighting the urgent need to develop efficient training methods for ViTs. Progressive learning, a training scheme in which model capacity grows progressively during training, has begun to show promise for efficient training. In this paper, we take a practical step towards efficient training of ViTs by customizing and automating progressive learning. First, we develop a strong manual baseline for progressive learning of ViTs by introducing momentum growth (MoGrow) to bridge the gap brought by model growth. Then, we propose automated progressive learning (AutoProg), an efficient training scheme that aims to achieve lossless acceleration by automatically increasing the training overload on the fly; this is achieved by adaptively deciding whether, where, and how much the model should grow during progressive learning. Specifically, we first relax the optimization of the growth schedule to a sub-network architecture optimization problem, then propose one-shot estimation of sub-network performance via an elastic supernet. The search overhead is minimized by recycling the parameters of the supernet. Extensive experiments on efficient training on ImageNet with two representative ViT models, DeiT and VOLO, demonstrate that AutoProg can accelerate ViT training by up to 85.1% with no performance drop. Code: https://github.com/changlin31/AutoProg
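To make the idea of progressive learning concrete, here is a minimal sketch of a manual growth schedule and the compute it might save. It is not the authors' MoGrow or AutoProg implementation: the function names, the linear depth schedule, the equal-length stages, and the assumption that per-step cost scales with depth are all hypothetical choices made for illustration.

```python
def linear_growth_schedule(full_depth, num_stages, min_depth):
    """Return one depth per stage, growing linearly from min_depth to full_depth.

    Hypothetical helper: the abstract does not specify a schedule.
    """
    if num_stages < 2:
        return [full_depth]
    step = (full_depth - min_depth) / (num_stages - 1)
    return [round(min_depth + i * step) for i in range(num_stages)]


def relative_training_cost(schedule, full_depth):
    """Cost of the staged run relative to training at full depth throughout.

    Assumes equal-length stages and per-step cost proportional to depth,
    which is a simplification rather than a measured cost model.
    """
    return sum(schedule) / (full_depth * len(schedule))


# Toy example: a 12-layer model trained in 4 stages, starting at 6 layers.
schedule = linear_growth_schedule(full_depth=12, num_stages=4, min_depth=6)
cost = relative_training_cost(schedule, full_depth=12)
print(schedule, f"{cost:.2f}")  # [6, 8, 10, 12] 0.75
```

Under these toy assumptions the staged run costs about three quarters of a full-depth run. A fixed schedule like this one is chosen by hand, whereas the abstract describes AutoProg as deciding whether, where, and how much to grow adaptively during training.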

updated: Mon Mar 28 2022 05:37:08 GMT+0000 (UTC)

published: Mon Mar 28 2022 05:37:08 GMT+0000 (UTC)

arXiv
