Make RepVGG Greater Again: A Quantization-aware Approach

Xiangxiang Chu; Liang Li; Bo Zhang

The tradeoff between performance and inference speed is critical for practical applications. Architecture reparameterization achieves a better tradeoff and is becoming an increasingly popular ingredient in modern convolutional neural networks. Nonetheless, its quantization performance is usually too poor for deployment when INT8 inference is desired (e.g., a top-1 accuracy drop of more than 20% on ImageNet). In this paper, we dive into the underlying mechanism of this failure and show that the original design inevitably enlarges quantization error. We propose a simple, robust, and effective remedy that yields a quantization-friendly structure while retaining the benefits of reparameterization. Our method greatly narrows the gap between INT8 and FP32 accuracy for RepVGG. Without bells and whistles, the top-1 accuracy drop on ImageNet is reduced to within 2% with standard post-training quantization.
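
As a rough illustration of why an enlarged value range hurts INT8 accuracy, the sketch below applies symmetric per-tensor quantization (one common post-training scheme, assumed here; the abstract does not specify the exact quantizer) to a list of synthetic weights, with and without a single large outlier. The function names, the Gaussian weights, and the outlier value are all hypothetical and are not taken from the paper.

```python
import random


def fake_quant_int8(values):
    """Quantize to symmetric per-tensor INT8 and dequantize back to floats."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / 127.0  # one step of the INT8 grid
    dequantized = []
    for v in values:
        q = max(-127, min(127, round(v / scale)))  # clamp to the INT8 range
        dequantized.append(q * scale)
    return dequantized


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Hypothetical weights: small values clustered around zero.
weights = [rng.gauss(0.0, 0.05) for _ in range(1000)]
# Same weights plus one large outlier that stretches the quantization range.
weights_outlier = weights + [2.0]

err_narrow = mse(weights, fake_quant_int8(weights))
dequant_wide = fake_quant_int8(weights_outlier)
# Compare error on the original 1000 weights only, excluding the outlier itself.
err_wide = mse(weights, dequant_wide[:-1])

print(f"MSE without outlier: {err_narrow:.3e}")
print(f"MSE with outlier:    {err_wide:.3e}")
```

Because the step size is set by the largest magnitude in the tensor, a single outlier coarsens the grid for every other value. This is only a generic picture of range-induced quantization error, not the paper's analysis of RepVGG.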

updated: Sat Dec 03 2022 11:14:10 GMT+0000 (UTC)

published: Sat Dec 03 2022 11:14:10 GMT+0000 (UTC)

arXiv
