Adaptive Data-Free Quantization

Biao Qian; Yang Wang; Richang Hong; Meng Wang

Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without accessing real data; instead, it generates fake samples via a generator (G) that learns from a full-precision network (P). However, this sample generation process is entirely independent of Q. It overlooks the adaptability of the knowledge in the generated samples, i.e., whether they are informative to the learning process of Q, which results in an overflow of generalization error. This raises several critical questions: How can sample adaptability to Q be measured under varied bit-width scenarios? How can samples with large adaptability be generated to improve Q's generalization? Is the largest adaptability the best? To answer these questions, we propose an Adaptive Data-Free Quantization (AdaDFQ) method, which reformulates DFQ as a zero-sum game over sample adaptability between two players: a generator and a quantized network. Following this viewpoint, we further define disagreement and agreement samples to form two boundaries, where the margin is optimized to address over- and under-fitting, so as to generate samples with the desirable adaptability to Q. AdaDFQ reveals that 1) the largest adaptability is NOT the best for sample generation to benefit Q's generalization; and 2) the knowledge of a generated sample should not only be informative to Q but also be related to the category and distribution information of the training data for P. Theoretical and empirical analyses validate the advantages of AdaDFQ over state-of-the-art methods. Our code is available at https://github.com/hfutqian/AdaDFQ.
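
The abstract does not give formulas, so the following is only an illustrative sketch of the idea of scoring a generated sample by how much P and Q disagree on it, then keeping samples whose score lies between two boundaries. The proxy used here (one minus the probability that P and Q pick the same class when each samples from its own softmax output), the function names, and the `lower`/`upper` thresholds are all assumptions made for illustration and are not taken from the paper or its code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def disagreement(p_logits, q_logits):
    """Hypothetical adaptability proxy (an assumption, not the paper's definition).

    Returns 1 minus the probability that P and Q choose the same class
    when each samples independently from its own softmax distribution.
    """
    p = softmax(p_logits)
    q = softmax(q_logits)
    agreement = sum(pi * qi for pi, qi in zip(p, q))
    return 1.0 - agreement


def within_margin(score, lower, upper):
    """True if the score lies between the two (hypothetical) boundaries."""
    return lower <= score <= upper


if __name__ == "__main__":
    # Logits of P and Q on one generated sample (made-up numbers).
    p_logits = [2.0, 0.5, -1.0]
    q_logits = [0.3, 1.8, -0.5]
    score = disagreement(p_logits, q_logits)
    print(score, within_margin(score, lower=0.2, upper=0.8))
```

In this toy setup, a score near 0 means Q already matches P on the sample, and a score near 1 means the two networks disagree almost completely; restricting the score to an interval mirrors the abstract's point that the largest adaptability is not necessarily the best.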

updated: Mon Mar 13 2023 05:37:40 GMT+0000 (UTC)

published: Mon Mar 13 2023 05:37:40 GMT+0000 (UTC)

arXiv
