Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning

Huan Wang; Can Qin; Yue Bai; Yun Fu

The state of neural network pruning has been unclear, even confusing, for some time, largely due to "a lack of standardized benchmarks and metrics" [3]. To standardize benchmarks, we first need to answer a question: what kind of comparison setup is considered fair? Unfortunately, this basic yet crucial question has barely been clarified in the community. Meanwhile, we observe that several papers have used (severely) sub-optimal hyper-parameters in pruning experiments, and the reason behind this is also elusive. These sub-optimal hyper-parameters further distort the benchmarks, rendering the state of neural network pruning even more obscure. Two mysteries in pruning exemplify this confusion: the performance-boosting effect of a larger finetuning learning rate, and the argument that inheriting pretrained weights has no value in filter pruning. In this work, we attempt to explain the confusing state of network pruning by demystifying these two mysteries. Specifically, (1) we first clarify the fairness principle in pruning experiments and summarize the widely used comparison setups; (2) we then unveil the two pruning mysteries and point out the central role of network trainability, which has not been well recognized so far; (3) finally, we conclude the paper and give concrete suggestions on how to calibrate pruning benchmarks in the future. Code: https://github.com/mingsun-tse/why-the-state-of-pruning-so-confusing.

updated: Tue Feb 21 2023 21:36:43 GMT+0000 (UTC)

published: Thu Jan 12 2023 18:58:33 GMT+0000 (UTC)

arXiv
