Vision Transformer Pruning

Mingjian Zhu; Yehui Tang; Kai Han

Vision transformers have achieved competitive performance on a variety of computer vision applications. However, their storage, run-time memory, and computational demands hinder deployment on mobile devices. Here we present a vision transformer pruning approach that identifies the impact of each dimension in every transformer layer and then prunes accordingly. By encouraging dimension-wise sparsity in the transformer, important dimensions emerge automatically. A large number of dimensions with small importance scores can then be discarded to achieve a high pruning ratio without significantly compromising accuracy. The pipeline for vision transformer pruning is as follows: 1) training with sparsity regularization; 2) pruning dimensions of linear projections; 3) fine-tuning. The parameter and FLOPs reduction ratios of the proposed algorithm are evaluated and analyzed on the ImageNet dataset to demonstrate the effectiveness of the method.
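The three-step pipeline in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact regularizer or selection rule, so the following assumes an L1 penalty on per-dimension importance scores and a selection rule that drops a fixed ratio of dimensions with the smallest absolute scores. All function names (`l1_penalty`, `select_dimensions`, `prune_rows`, `prune_columns`) are hypothetical, and plain Python lists stand in for the tensors a real framework would use. Training and fine-tuning themselves are not shown.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def l1_penalty(scores: Sequence[float], strength: float) -> float:
    """Step 1 (assumed form): sparsity term added to the training loss.

    An L1 penalty pushes unimportant per-dimension scores toward zero.
    """
    return strength * sum(abs(s) for s in scores)


def select_dimensions(scores: Sequence[float], prune_ratio: float) -> List[int]:
    """Step 2a: return the indices of the dimensions to keep, in original order.

    Assumption: the prune_ratio fraction with the smallest |score| is dropped.
    """
    n_prune = int(len(scores) * prune_ratio)
    # Rank dimensions from least to most important by absolute score.
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    pruned = set(ranked[:n_prune])
    return [i for i in range(len(scores)) if i not in pruned]


def prune_rows(weight: Matrix, keep: Sequence[int]) -> Matrix:
    """Step 2b: drop pruned output dimensions (rows) of a linear projection."""
    return [list(weight[i]) for i in keep]


def prune_columns(weight: Matrix, keep: Sequence[int]) -> Matrix:
    """Step 2b: drop the matching input dimensions (columns) of the next projection."""
    return [[row[j] for j in keep] for row in weight]


# Toy example: a projection with 4 output dimensions and 3 inputs.
scores = [0.9, 0.01, -0.7, 0.02]  # hypothetical learned importance scores
w_proj = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0],
          [10.0, 11.0, 12.0]]
w_next = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]

penalty = l1_penalty(scores, strength=0.1)
keep = select_dimensions(scores, prune_ratio=0.5)  # keeps dimensions 0 and 2
w_proj_pruned = prune_rows(w_proj, keep)
w_next_pruned = prune_columns(w_next, keep)
# Step 3 (not shown): fine-tune the smaller model to recover accuracy.
```

Pruning the rows of one projection requires pruning the matching columns of the layer that consumes its output, which is why the sketch includes both helpers.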

updated: Sat Aug 14 2021 06:06:37 GMT+0000 (UTC)

published: Sat Apr 17 2021 09:49:24 GMT+0000 (UTC)

arXiv
