Parallelized Rate-Distortion Optimized Quantization Using Deep Learning

Dana Kianfar; Auke Wiggers; Amir Said; Reza Pourreza; Taco Cohen

Rate-Distortion Optimized Quantization (RDOQ) has played an important role in the coding performance of recent video compression standards such as H.264/AVC, H.265/HEVC, VP9 and AV1. This scheme yields significant reductions in bit-rate at the expense of relatively small increases in distortion. RDOQ algorithms are typically prohibitively expensive to implement on real-time hardware encoders because they are sequential and need to query entropy coding costs frequently. This work addresses this limitation with a neural network-based approach that learns to trade off rate and distortion during offline supervised training. Because these networks rely solely on standard arithmetic operations that can be executed on existing neural network hardware, no additional on-chip area needs to be reserved for dedicated RDOQ circuitry. We train two classes of neural networks, a fully-convolutional network and an auto-regressive network, and evaluate each as a post-quantization step designed to refine cheap quantization schemes such as scalar quantization (SQ). Both network architectures are designed to have a low computational overhead. After training, they are integrated into the HM 16.20 implementation of HEVC, and their video coding performance is evaluated on a subset of the H.266/VVC SDR common test sequences. Comparisons are made to the RDOQ and SQ implementations in HM 16.20. Our method achieves 1.64% BD-rate savings on luminosity compared to the HM SQ anchor, and on average reaches 45% of the performance of the iterative HM RDOQ algorithm.
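To make the rate-distortion trade-off behind post-quantization refinement concrete, here is a minimal Python sketch. It is not the paper's networks and not the HM RDOQ algorithm: the rate model `toy_rate_bits`, the Lagrange multiplier `lam`, and all function names are assumptions introduced only for illustration. It quantizes coefficients with plain SQ and then, for each coefficient, keeps the SQ level or moves it one level toward zero, whichever gives the lower Lagrangian cost J = D + lam * R.

```python
import math


def scalar_quantize(coeffs, step):
    """Plain scalar quantization: round each coefficient to the nearest level."""
    return [int(math.copysign(math.floor(abs(c) / step + 0.5), c)) for c in coeffs]


def toy_rate_bits(level):
    """Hypothetical rate proxy (NOT a real entropy coder): larger levels cost more bits."""
    return 1.0 if level == 0 else 2.0 + 2.0 * math.log2(1 + abs(level))


def rd_cost(coeffs, levels, step, lam):
    """Lagrangian cost J = D + lam * R, with squared-error distortion."""
    distortion = sum((c - q * step) ** 2 for c, q in zip(coeffs, levels))
    rate = sum(toy_rate_bits(q) for q in levels)
    return distortion + lam * rate


def refine_levels(coeffs, levels, step, lam):
    """For each coefficient, keep the SQ level or move one level toward zero,
    whichever has the lower per-coefficient cost under the toy rate model."""
    refined = []
    for c, q in zip(coeffs, levels):
        candidates = [q]
        if q != 0:
            candidates.append(q - int(math.copysign(1, q)))
        best = min(
            candidates,
            key=lambda k: (c - k * step) ** 2 + lam * toy_rate_bits(k),
        )
        refined.append(best)
    return refined


if __name__ == "__main__":
    coeffs = [10.4, -3.6, 0.4, 1.1]
    step, lam = 2.0, 1.0
    sq = scalar_quantize(coeffs, step)
    refined = refine_levels(coeffs, sq, step, lam)
    print("SQ levels:     ", sq, "J =", rd_cost(coeffs, sq, step, lam))
    print("Refined levels:", refined, "J =", rd_cost(coeffs, refined, step, lam))
```

In this toy the cost is separable per coefficient, so each decision is independent and the refinement can never increase J. In a real encoder the rate of one coefficient depends on its neighbours through context-adaptive entropy coding, which is the source of the sequential behaviour the abstract describes.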

updated: Fri Dec 11 2020 14:28:30 GMT+0000 (UTC)

published: Fri Dec 11 2020 14:28:30 GMT+0000 (UTC)

arXiv
