Rethinking Data Distillation: Do Not Overlook Calibration

Dongyao Zhu; Bowen Lei; Jie Zhang; Yanbo Fang; Ruqi Zhang; Yiqun Xie; Dongkuan Xu

Neural networks trained on distilled data often produce overconfident outputs and require correction by calibration methods. Existing calibration methods such as temperature scaling and mixup work well for networks trained on the original large-scale data. However, we find that these methods fail to calibrate networks trained on data distilled from large source datasets. In this paper, we show that distilled data lead to networks that are not calibratable due to (i) a more concentrated distribution of the maximum logits and (ii) the loss of information that is semantically meaningful but unrelated to classification tasks. To address this problem, we propose Masked Temperature Scaling (MTS) and Masked Distillation Training (MDT), which mitigate the limitations of distilled data and achieve better calibration results while maintaining the efficiency of dataset distillation.
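For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of standard temperature scaling together with an expected calibration error (ECE) measurement. It is not the paper's MTS or MDT, whose details the abstract does not give. The function names, the grid search over temperatures, and the synthetic data generator are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the class axis, with logits divided by T.
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, temperature):
    # Mean negative log-likelihood of the true labels at a given temperature.
    probs = softmax(logits, temperature)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=None):
    # Illustrative choice: pick T by grid search on held-out NLL
    # (a gradient-based optimizer is the more common choice in practice).
    if grid is None:
        grid = np.linspace(0.05, 10.0, 200)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def expected_calibration_error(probs, labels, n_bins=10):
    # Bin predictions by confidence and average |accuracy - confidence|,
    # weighted by the fraction of samples in each bin.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

def make_overconfident_logits(n=5000, k=10, scale=3.0, seed=0):
    # Hypothetical synthetic data: labels are sampled from softmax(true_logits),
    # so true_logits are calibrated; multiplying by `scale` makes them overconfident.
    rng = np.random.default_rng(seed)
    true_logits = 2.0 * rng.standard_normal((n, k))
    probs = softmax(true_logits)
    u = rng.random(n)
    labels = (probs.cumsum(axis=1) < u[:, None]).sum(axis=1)
    labels = np.minimum(labels, k - 1)  # guard against cumsum rounding
    return scale * true_logits, labels
```

A typical use would be to call `fit_temperature` on held-out logits and labels, then compare `expected_calibration_error(softmax(logits), labels)` with the same quantity computed at the fitted temperature. On the synthetic data above, a single scalar temperature is enough to undo the overconfidence by construction; the abstract's claim is that this stops being true for networks trained on distilled data.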

updated: Mon Aug 21 2023 16:16:59 GMT+0000 (UTC)

published: Mon Jul 24 2023 00:53:46 GMT+0000 (UTC)

arXiv
