ACQ: Improving Generative Data-free Quantization Via Attention Correction

Jixing Li; Xiaozhou Guo; Benzhe Dai; Guoliang Gong; Min Jin; Gang Chen; Wenyu Mao; Huaxiang Lu

Data-free quantization aims to quantize a model without access to any authentic samples, which matters in applications where data privacy is a concern. A popular approach, called generative data-free quantization, uses a generator to convert noise vectors into synthetic samples. However, synthetic and authentic samples differ in attention, a difference that is usually overlooked and that limits quantization performance. First, because synthetic samples of the same class tend to have homogeneous attention, the quantized network can learn only a limited number of attention modes. Second, synthetic samples exhibit different attention in eval mode and in training mode, so batch-normalization (BN) statistics matching tends to be inaccurate. This paper proposes ACQ to correct the attention of synthetic samples. To address the homogenization of intra-class attention, we establish an attention center position-condition generator: constrained by an attention center matching loss, the attention center position is treated as the generator's condition input, guiding synthetic samples toward diverse attention. Moreover, we design an adversarial loss between paired synthetic samples generated under the same condition to prevent the generator from paying excessive attention to the condition, which may result in mode collapse. To improve the attention similarity of synthetic samples across network modes, we introduce a consistency penalty that guarantees accurate BN statistics matching. Experimental results demonstrate that ACQ effectively mitigates the attention problems of synthetic samples. Under various training settings, ACQ achieves the best quantization performance. For 4-bit quantization of ResNet18 and ResNet50, ACQ reaches 67.55% and 72.23% accuracy, respectively.
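The abstract does not define how an attention center is computed, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes that attention is the channel-wise sum of squared activations of a feature map, that the attention center is the center of mass of that map, and that the matching loss is the squared distance between that center and the position requested through the generator's condition. The names `attention_map`, `attention_center`, and `center_matching_loss` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# A feature map indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def attention_map(features: FeatureMap) -> List[List[float]]:
    """Collapse channels into a spatial attention map that sums to 1.

    Assumption: attention is the channel-wise sum of squared activations.
    """
    height, width = len(features[0]), len(features[0][0])
    energy = [[sum(ch[r][c] ** 2 for ch in features) for c in range(width)]
              for r in range(height)]
    total = sum(sum(row) for row in energy)
    if total == 0.0:
        # No activation anywhere: fall back to uniform attention.
        return [[1.0 / (height * width)] * width for _ in range(height)]
    return [[v / total for v in row] for row in energy]


def attention_center(att: List[List[float]]) -> Tuple[float, float]:
    """Return the (row, col) center of mass of a normalized attention map."""
    row_c = sum(r * v for r, row in enumerate(att) for v in row)
    col_c = sum(c * v for row in att for c, v in enumerate(row))
    return row_c, col_c


def center_matching_loss(center: Tuple[float, float],
                         target: Tuple[float, float]) -> float:
    """Squared Euclidean distance between the actual and requested centers.

    Assumption: the paper's matching loss may take a different form.
    """
    return (center[0] - target[0]) ** 2 + (center[1] - target[1]) ** 2


# Toy 1-channel, 3x3 feature map with all activation in the bottom-right corner.
features = [[[0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0],
             [0.0, 0.0, 2.0]]]
center = attention_center(attention_map(features))
# The condition asks for attention at the top-left corner instead.
loss = center_matching_loss(center, target=(0.0, 0.0))
```

In this toy case the attention sits entirely in one corner while the condition requests the opposite corner, so the loss is large. Under the assumptions above, a generator trained against such a loss would be pushed to place attention where its condition input says, which is the mechanism the abstract describes for diversifying intra-class attention.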

updated: Sat Jul 29 2023 04:36:20 GMT+0000 (UTC)

published: Wed Jan 18 2023 02:13:43 GMT+0000 (UTC)

arXiv
