Exploring Long & Short Range Temporal Information for Learned Video Compression

Huairui Wang; Zhenzhong Chen

Learned video compression methods have attracted considerable interest in the video coding community because they now match or even exceed the rate-distortion (RD) performance of traditional video codecs. However, many current learning-based methods exploit only short-range temporal information, which limits their performance. In this paper, we focus on exploiting the unique characteristics of video content and on further exploring temporal information to enhance compression performance. Specifically, to exploit long-range temporal information, we propose a temporal prior that is updated continuously within the group of pictures (GOP) during inference. The temporal prior therefore contains valuable temporal information from all decoded images within the current GOP. For short-range temporal information, we propose progressive guided motion compensation to achieve robust and effective compensation. In detail, we design a hierarchical structure to achieve multi-scale compensation. More importantly, we use optical flow guidance to generate pixel offsets between feature maps at each scale, and the compensation result at each scale guides the compensation at the following scale. Extensive experimental results demonstrate that our method achieves better RD performance than state-of-the-art video compression approaches. The code is publicly available at https://github.com/Huairui/LSTVC.
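
The GOP-level temporal prior described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `update_prior` and `priors_for_sequence` are hypothetical, frames are reduced to short lists of floats, and an exponential moving average stands in for the learned update, whose design the abstract does not specify. The sketch only shows the bookkeeping: the prior is reset at each GOP boundary and then accumulates information from every frame decoded so far in that GOP.

```python
from typing import List, Optional

Frame = List[float]  # toy stand-in for a decoded frame's feature map


def update_prior(prior: Optional[Frame], decoded: Frame, momentum: float = 0.5) -> Frame:
    """Fold one decoded frame into the running temporal prior.

    Assumption: an exponential moving average replaces the learned update
    module, which the abstract does not describe.
    """
    if prior is None:  # first frame of a GOP: the prior starts from that frame
        return list(decoded)
    return [momentum * p + (1.0 - momentum) * d for p, d in zip(prior, decoded)]


def priors_for_sequence(frames: List[Frame], gop_size: int) -> List[Optional[Frame]]:
    """Return the prior that is available when each frame is coded.

    The prior is reset at every GOP boundary, so it only ever summarizes
    frames decoded earlier in the current GOP.
    """
    available: List[Optional[Frame]] = []
    prior: Optional[Frame] = None
    for index, frame in enumerate(frames):
        if index % gop_size == 0:
            prior = None  # new GOP: discard information from the previous GOP
        available.append(prior)
        # In a real codec `frame` would be the decoded reconstruction.
        prior = update_prior(prior, frame)
    return available
```

With `gop_size=4`, the fourth frame is coded with a prior that blends the three frames decoded before it. With `gop_size=2`, the third frame starts a new GOP and sees no prior at all.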

updated: Tue Aug 23 2022 07:38:14 GMT+0000 (UTC)

published: Sun Aug 07 2022 15:57:18 GMT+0000 (UTC)

arXiv
