Visual Transformer Pruning

Mingjian Zhu; Kai Han; Yehui Tang; Yunhe Wang

Visual transformers have achieved competitive performance on a variety of computer vision applications. However, their storage, run-time memory, and computational demands hinder deployment on mobile devices. Here we present a visual transformer pruning approach, which identifies the impact of the channels in each layer and then prunes accordingly. By encouraging channel-wise sparsity in the Transformer, important channels automatically emerge. A large number of channels with small coefficients can be discarded to achieve a high pruning ratio without significantly compromising accuracy. The pipeline for visual transformer pruning is as follows: 1) training with sparsity regularization; 2) pruning channels; 3) finetuning. The parameter and FLOPs reduction ratios of the proposed algorithm are evaluated and analyzed on the ImageNet dataset to demonstrate its effectiveness.
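The first two steps of the pipeline can be illustrated with a small sketch. The abstract does not give the exact regularizer or selection rule, so the following is only one plausible reading: it assumes each channel carries a learnable importance coefficient, that the sparsity term is an L1 norm on those coefficients, and that pruning drops a fixed fraction of channels with the smallest magnitudes. All function names and the toy values are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def l1_sparsity_penalty(scores: Sequence[float], lam: float) -> float:
    """Step 1 (assumed form): sparsity term added to the task loss, lam * sum |score|."""
    return lam * sum(abs(s) for s in scores)


def channels_to_keep(scores: Sequence[float], prune_ratio: float) -> List[int]:
    """Step 2: drop the fraction `prune_ratio` of channels with the smallest |score|."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(scores) * prune_ratio)
    # Channel indices ordered from least to most important.
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    # Keep the most important channels, restored to their original order.
    return sorted(order[n_prune:])


def prune_rows(weight: List[List[float]], keep: Sequence[int]) -> List[List[float]]:
    """Keep only the rows of `weight` (one row per channel) listed in `keep`."""
    return [weight[i] for i in keep]


# Toy example: six channels, each with a hypothetical learned coefficient.
scores = [0.91, 0.02, -0.47, 0.003, 0.65, -0.01]
weight = [[float(i)] * 4 for i in range(len(scores))]  # row i belongs to channel i

keep = channels_to_keep(scores, prune_ratio=0.5)
pruned = prune_rows(weight, keep)
print(keep)         # indices of surviving channels
print(len(pruned))  # number of rows left after pruning
```

Step 3, finetuning, would then continue training the smaller network without the sparsity term; it is omitted here because it depends on the training setup.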

updated: Sat Apr 17 2021 09:49:24 GMT+0000 (UTC)

published: Sat Apr 17 2021 09:49:24 GMT+0000 (UTC)

arXiv
