CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming

Qihua Zhou; Ruibin Li; Song Guo; Peiran Dong; Yi Liu; Jingcai Guo; Zhenda Xu

Recent years have witnessed dramatic growth in Internet video traffic, where video bitstreams are often compressed and delivered at low quality to fit the streamer's uplink bandwidth. To alleviate this quality degradation, Neural-enhanced Video Streaming (NVS) has emerged; it shows great promise for recovering low-quality videos, mostly by deploying neural super-resolution (SR) on the media server. Despite its benefits, we reveal that current mainstream SR-enhancement works have not achieved the desired rate-distortion trade-off between bitrate saving and quality restoration, because they (1) overemphasize enhancement on the decoder side while omitting co-design of the encoder, (2) have limited generative capacity to recover high-fidelity perceptual details, and (3) optimize the compression-and-restoration pipeline solely from the resolution perspective, without considering color bit-depth. To overcome these limitations, we are the first to conduct encoder-decoder (i.e., codec) synergy by leveraging the inherent visual-generative property of diffusion models. Specifically, we present Codec-aware Diffusion Modeling (CaDM), a novel NVS paradigm that significantly reduces streaming delivery bitrates while retaining considerably higher restoration capacity than existing methods. First, CaDM improves the encoder's compression efficiency by simultaneously reducing the resolution and color bit-depth of video frames. Second, CaDM empowers the decoder with high-quality enhancement by making the denoising diffusion restoration aware of the encoder's resolution-color conditions. Evaluation on public cloud services with OpenMMLab benchmarks shows that CaDM achieves bitrate savings of up to 5.12-21.44x under common video standards and much better recovery quality (e.g., an FID of 0.61) than state-of-the-art neural-enhancing methods.
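To make the encoder-side idea concrete, the following is a minimal sketch of reducing resolution and color bit-depth together on a single grayscale frame. It is not the authors' implementation: the function names `reduce_frame` and `raw_size_ratio`, the box-average downsampling, the bit-shift requantization, and the default parameters are all assumptions chosen for illustration.

```python
def reduce_frame(frame, scale=2, src_bits=8, dst_bits=4):
    """Downsample a grayscale frame by `scale` and lower its bit-depth.

    Illustrative assumptions (not taken from the paper):
    - downsampling is a plain box average over scale x scale blocks
    - bit-depth reduction drops the low-order bits with a right shift
    """
    height, width = len(frame), len(frame[0])
    if height % scale or width % scale:
        raise ValueError("frame dimensions must be divisible by scale")
    shift = src_bits - dst_bits
    reduced = []
    for y in range(0, height, scale):
        row = []
        for x in range(0, width, scale):
            # Collect the scale x scale block whose top-left corner is (y, x).
            block = [frame[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            mean = sum(block) // len(block)
            # Keep only the top `dst_bits` bits of the averaged value.
            row.append(mean >> shift)
        reduced.append(row)
    return reduced


def raw_size_ratio(scale, src_bits, dst_bits):
    """Ratio of raw (uncompressed) bits before vs. after the reduction."""
    return (scale * scale * src_bits) / dst_bits


if __name__ == "__main__":
    frame = [
        [0, 16, 255, 255],
        [16, 32, 255, 255],
        [128, 128, 64, 64],
        [128, 128, 64, 64],
    ]
    print(reduce_frame(frame))      # 2x2 frame with 4-bit values
    print(raw_size_ratio(2, 8, 4))  # raw pixel-data reduction factor
```

The ratio computed here covers raw pixel data only; it is not comparable to the bitrate savings reported in the abstract, which are measured after encoding with standard video codecs. In CaDM the decoder would additionally receive the chosen resolution and bit-depth settings as conditions for the diffusion-based restoration, which this sketch does not model.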

updated: Wed Mar 08 2023 14:25:10 GMT+0000 (UTC)

published: Tue Nov 15 2022 05:14:48 GMT+0000 (UTC)

arXiv
