MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding

Bowen Liu; Yu Chen; Rakesh Chowdary Machineni; Shiyu Liu; Hun-Seok Kim

Learning-based video compression has been extensively studied in recent years, but it still has limitations in adapting to diverse motion patterns and entropy models. In this paper, we propose multi-mode video compression (MMVC), a block-wise mode ensemble deep video compression framework that selects the optimal mode for feature-domain prediction, adapting to different motion patterns. The proposed modes include ConvLSTM-based feature-domain prediction, optical-flow-conditioned feature-domain prediction, and feature propagation, covering a wide range of cases from static scenes without apparent motion to dynamic scenes with a moving camera. We partition the feature space into blocks for temporal prediction in spatial block-based representations. For entropy coding, we consider both dense and sparse post-quantization residual blocks, and apply optional run-length coding to sparse residuals to improve the compression rate. In this sense, our method uses a dual-mode entropy coding scheme guided by a binary density map, which offers a significant rate reduction that outweighs the extra cost of transmitting the binary selection map. We validate our scheme on some of the most popular benchmark datasets. Compared with state-of-the-art video compression schemes and standard codecs, our method yields better or competitive results as measured by PSNR and MS-SSIM.
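
The dual-mode entropy coding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the density threshold, the (zero run, value) pair format, and all function names are assumptions made for illustration, and the learned entropy model that the paper would apply is omitted. Each quantized residual block is marked dense or sparse in a binary density map, and only sparse blocks are run-length coded.

```python
from typing import List, Tuple, Union

Block = List[int]                  # flattened, quantized residual block
Pairs = List[Tuple[int, int]]      # (zeros before value, nonzero value)


def density(block: Block) -> float:
    """Fraction of nonzero entries in a block."""
    return sum(1 for v in block if v != 0) / len(block)


def run_length_encode(block: Block) -> Pairs:
    """Encode each nonzero value with the count of zeros preceding it.
    Trailing zeros are implied by the known block length."""
    pairs, run = [], 0
    for v in block:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs


def run_length_decode(pairs: Pairs, length: int) -> Block:
    """Invert run_length_encode, padding trailing zeros to `length`."""
    out: Block = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * (length - len(out)))
    return out


def encode_blocks(blocks: List[Block], threshold: float = 0.25):
    """Return a binary density map (1 = dense, 0 = sparse) and payloads.
    The threshold value is a hypothetical choice, not from the paper."""
    density_map: List[int] = []
    payloads: List[Union[Block, Pairs]] = []
    for block in blocks:
        if density(block) >= threshold:
            density_map.append(1)
            payloads.append(list(block))               # dense: keep as-is
        else:
            density_map.append(0)
            payloads.append(run_length_encode(block))  # sparse: RLE
    return density_map, payloads


def decode_blocks(density_map, payloads, block_len: int) -> List[Block]:
    """Use the density map to pick the decoding path for each payload."""
    return [
        list(p) if flag == 1 else run_length_decode(p, block_len)
        for flag, p in zip(density_map, payloads)
    ]
```

In this sketch the density map costs one bit per block, which is the side information the abstract refers to; the scheme pays off when the sparse blocks shrink by more than that cost.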

updated: Wed Apr 05 2023 07:37:48 GMT+0000 (UTC)

published: Wed Apr 05 2023 07:37:48 GMT+0000 (UTC)

arXiv
