DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

Yonggan Fu; Haichuan Yang; Jiayi Yuan; Meng Li; Cheng Wan; Raghuraman Krishnamoorthi; Vikas Chandra; Yingyan Lin

Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing DNNs' theoretical complexity (e.g., the total number of weights/operations) while maintaining decent model accuracy. However, existing efficient DNNs still fall short of their promise of boosting real-hardware efficiency, because the compact operators they commonly adopt have low hardware utilization. In this work, we open up a new compression paradigm for developing real-hardware efficient DNNs, one that boosts hardware efficiency while maintaining model accuracy. Interestingly, we observe that while some DNN layers' activation functions help DNNs' training optimization and achievable accuracy, they can be properly removed after training without compromising the model accuracy. Inspired by this observation, we propose a framework dubbed DepthShrinker, which develops hardware-friendly compact networks by shrinking the basic building blocks of existing efficient DNNs, which feature irregular computation patterns, into dense ones with much improved hardware utilization and thus real-hardware efficiency. Our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques, e.g., 3.06% higher accuracy and 1.53× throughput on a Tesla V100 over the SOTA channel-wise pruning method MetaPruning. Our code is available at: https://github.com/facebookresearch/DepthShrinker.
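The reason removing an activation function allows a block to be "shrunk" is that consecutive linear operations with nothing nonlinear between them compose into a single linear operation. The sketch below illustrates only that algebra, using two small fully connected layers in plain Python rather than the convolutional blocks the abstract refers to. It is not taken from the DepthShrinker code base; the helper names (`linear`, `merge_linear`, `relu`) and all weight values are assumptions made up for illustration.

```python
# Illustrative sketch only: hypothetical helpers and made-up weights,
# not the DepthShrinker implementation.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def linear(W, b, x):
    """Affine layer: W x + b."""
    return [y + bi for y, bi in zip(matvec(W, x), b)]

def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, t) for t in v]

def merge_linear(W1, b1, W2, b2):
    """Fold W2 (W1 x + b1) + b2 into a single layer W x + b."""
    W = matmul(W2, W1)
    b = [y + bi for y, bi in zip(matvec(W2, b1), b2)]
    return W, b

# A 2 -> 4 -> 2 "block": expand to 4 hidden units, then project back to 2.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0], [-2.0, 0.5]]
b1 = [0.5, -1.0, 0.0, 2.0]
W2 = [[1.0, 0.0, -1.0, 2.0], [0.5, 1.0, 0.0, -1.0]]
b2 = [1.0, -0.5]
x = [2.0, -3.0]

# With the activation removed, the two layers equal one merged 2 -> 2 layer.
two_layers = linear(W2, b2, linear(W1, b1, x))
W, b = merge_linear(W1, b1, W2, b2)
one_layer = linear(W, b, x)

# With the activation kept, the block is no longer a single affine map,
# so the merged layer gives a different result.
with_relu = linear(W2, b2, relu(linear(W1, b1, x)))

print(two_layers, one_layer, with_relu)
```

In this toy case the merged layer holds a 2×2 weight matrix in place of a 4×2 and a 2×4 matrix, and it runs as one dense operation instead of two. The same composition argument applies to convolutions, where the merged kernel size grows with the kernel sizes of the layers being folded together.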

updated: Fri Jun 17 2022 05:13:00 GMT+0000 (UTC)

published: Thu Jun 02 2022 02:32:47 GMT+0000 (UTC)

arXiv
