Calibrating the Rigged Lottery: Making All Tickets Reliable

Bowen Lei; Ruqi Zhang; Dongkuan Xu; Bani Mallick


Although sparse training has been successfully used in various resource-limited deep learning tasks to save memory, accelerate training, and reduce inference time, the reliability of the produced sparse models remains unexplored. Previous research has shown that deep neural networks tend to be over-confident, and we find that sparse training exacerbates this problem. Therefore, calibrating the sparse models is crucial for reliable prediction and decision-making. In this paper, we propose a new sparse training method to produce sparse models with improved confidence calibration. In contrast to previous research that uses only one mask to control the sparse topology, our method utilizes two masks: a deterministic mask and a random mask. The former efficiently searches for and activates important weights by exploiting the magnitude of weights and gradients, while the latter brings better exploration and finds more appropriate weight values through random updates. Theoretically, we prove our method can be viewed as a hierarchical variational approximation of a probabilistic deep Gaussian process. Extensive experiments on multiple datasets, model architectures, and sparsities show that our method reduces expected calibration error (ECE) values by up to 47.8% and simultaneously maintains or even improves accuracy with only a slight increase in computation and storage burden.
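The abstract reports calibration in terms of ECE. As a rough illustration of what that metric measures, the sketch below computes the commonly used equal-width binned ECE from per-sample confidences and correctness flags. The function name, the default of 15 bins, and the binning convention are assumptions made for this example; the abstract does not state the evaluation protocol the authors used.

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """Equal-width binned ECE (illustrative sketch, not the authors' code).

    confidences: predicted top-class probabilities in [0, 1]
    correct:     1 (or True) if the prediction was right, else 0
    n_bins:      number of equal-width confidence bins (assumed default)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one sample is required")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin

    for c, h in zip(confidences, correct):
        # Left-closed bins [b/n_bins, (b+1)/n_bins); c == 1.0 goes in the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        hit_sum[b] += 1.0 if h else 0.0

    # (count/n) * |accuracy - mean confidence| simplifies to |hits - conf_sum| / n.
    return sum(abs(hit_sum[b] - conf_sum[b]) for b in range(n_bins)) / n


if __name__ == "__main__":
    # Four predictions at 90% confidence, three of them correct:
    # the model is over-confident by 0.90 - 0.75.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A lower value means that stated confidence tracks observed accuracy more closely. An over-confident model, which the abstract identifies as the failure mode that sparse training worsens, shows mean confidence above accuracy in its occupied bins.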

updated: Wed Mar 01 2023 03:48:17 GMT+0000 (UTC)

published: Sat Feb 18 2023 15:53:55 GMT+0000 (UTC)

arXiv
