A Surrogate Lagrangian Relaxation-based Model Compression for Deep Neural Networks

Deniz Gurevin; Shanglin Zhou; Lynn Pepin; Bingbing Li; Mikhail Bragin; Caiwen Ding; Fei Miao

Network pruning is a widely used technique for reducing the computation cost and model size of deep neural networks. However, the typical three-stage pipeline of training, pruning, and retraining (fine-tuning) significantly increases the overall number of training trials. For instance, retraining can take up to 80 epochs for ResNet-18 on ImageNet, which is 70% of the original model's training trials. In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian Relaxation (SLR), which is tailored to overcome the difficulties caused by the discrete nature of the weight-pruning problem while ensuring fast convergence. We decompose the weight-pruning problem into subproblems, which are coordinated by updating Lagrangian multipliers. Convergence is then accelerated by using quadratic penalty terms. We evaluate the proposed method on image classification tasks (ResNet-18, ResNet-50, and VGG-16 on ImageNet and CIFAR-10) and on object detection tasks (YOLOv3 and YOLOv3-tiny on COCO 2014, PointPillars on KITTI 2017, and Ultra-Fast-Lane-Detection on the TuSimple lane detection dataset). Numerical testing results demonstrate that, by adopting the Surrogate Lagrangian Relaxation method, our SLR-based weight-pruning optimization approach achieves high model accuracy even at the hard-pruning stage, without retraining for many epochs. For example, on the PointPillars object detection model with the KITTI dataset, we achieve a 9.44x compression rate with less than 1% accuracy loss after retraining for only 3 epochs. As the compression rate increases, SLR starts to outperform ADMM, and the accuracy gap between them widens. SLR achieves 15.2% better accuracy than ADMM on PointPillars after pruning under 9.49x compression. Given a limited budget of retraining epochs, our approach quickly recovers the model accuracy.
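The decomposition described in the abstract (subproblems coordinated by Lagrangian multipliers, with a quadratic penalty term) can be illustrated with a small sketch. It is not the authors' SLR algorithm: it uses the generic scaled augmented-Lagrangian (ADMM-style) updates with a fixed penalty `rho` and does not reproduce SLR's surrogate step-size rule. The toy quadratic loss, the function names `project_topk` and `prune_with_multipliers`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def project_topk(x, keep):
    """Hard-prune x: keep the `keep` largest-magnitude entries, zero the rest."""
    flat = x.ravel()
    out = np.zeros(flat.size, dtype=flat.dtype)
    if keep > 0:
        # Indices of the `keep` entries with the largest absolute value.
        idx = np.argpartition(np.abs(flat), -keep)[-keep:]
        out[idx] = flat[idx]
    return out.reshape(x.shape)


def prune_with_multipliers(target, compression, rho=1.0, iters=50):
    """Toy problem: minimize 0.5 * ||w - target||^2 subject to w being sparse.

    The quadratic loss is a stand-in (assumption) for a network training loss.
    The constraint is split as w = z, with z restricted to the sparse set.
    """
    # A compression rate of c keeps roughly 1/c of the weights.
    keep = max(1, int(round(target.size / compression)))
    w = target.copy()
    z = project_topk(w, keep)
    u = np.zeros_like(w)  # scaled Lagrangian multipliers

    for _ in range(iters):
        # Subproblem 1: loss plus quadratic penalty (rho/2)*||w - z + u||^2.
        # For the toy quadratic loss this has a closed-form minimizer.
        w = (target + rho * (z - u)) / (1.0 + rho)
        # Subproblem 2: projection onto the sparsity constraint set.
        z = project_topk(w + u, keep)
        # Multiplier update coordinates the two subproblems.
        u = u + w - z

    # Hard-pruning stage: enforce exact sparsity on the final weights.
    return project_topk(w, keep), keep


rng = np.random.default_rng(0)
weights = rng.normal(size=944)
pruned, kept = prune_with_multipliers(weights, compression=9.44)
print(kept, np.count_nonzero(pruned))
```

In this sketch a 9.44x compression rate on 944 weights corresponds to keeping 100 of them. In a real network, the first subproblem would be solved approximately by a few epochs of gradient-based training rather than in closed form.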

updated: Fri Dec 18 2020 07:17:30 GMT+0000 (UTC)

published: Fri Dec 18 2020 07:17:30 GMT+0000 (UTC)

arXiv
