Offline and Online Optical Flow Enhancement for Deep Video Compression

Chuanbo Tang; Xihua Sheng; Zhuoyuan Li; Haotian Zhang; Li Li; Dong Liu

Video compression relies heavily on exploiting the temporal redundancy between video frames, which is usually achieved by estimating and using motion information. In most existing deep video compression networks, the motion information is represented as optical flows, and these networks often adopt pre-trained optical flow estimation networks for motion estimation. The optical flows, however, may be less suitable for video compression for two reasons. First, the optical flow estimation networks were trained to perform inter-frame prediction as accurately as possible, but the optical flows themselves may cost too many bits to encode. Second, the optical flow estimation networks were trained on synthetic data and may not generalize well enough to real-world videos. We address these two limitations by enhancing the optical flows in two stages: offline and online. In the offline stage, we fine-tune a trained optical flow estimation network with the motion information provided by a traditional (non-deep) video compression scheme, e.g., H.266/VVC, as we believe the motion information of H.266/VVC achieves a better rate-distortion trade-off. In the online stage, we further optimize the latent features of the optical flows with a gradient descent-based algorithm for the video to be compressed, so as to enhance the adaptivity of the optical flows. We conduct experiments on a state-of-the-art deep video compression scheme, DCVC. Experimental results demonstrate that the proposed offline and online enhancements together achieve an average bitrate saving of 12.8% on the tested videos, without increasing the model or computational complexity on the decoder side.
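
The online stage can be illustrated with a minimal toy sketch. It is not the authors' implementation. It assumes a 1-D "frame", a single scalar motion latent, a frozen linear decoder (`DECODER_GAIN`), a squared-magnitude rate proxy, and a generic cost of the form distortion + lam * rate. All of these names and choices are hypothetical and serve only to show the idea: the decoder stays fixed while the encoder refines the latent by gradient descent for the specific content being compressed.

```python
import math

DECODER_GAIN = 2.0  # hypothetical frozen "decoder": flow = DECODER_GAIN * latent


def warp(ref, flow):
    """Backward-warp a 1-D frame by one global displacement (linear interpolation)."""
    n = len(ref)
    out = []
    for x in range(n):
        pos = min(max(x + flow, 0.0), n - 1.0)  # clamp sampling position to the frame
        i = min(int(pos), n - 2)
        t = pos - i
        out.append((1.0 - t) * ref[i] + t * ref[i + 1])
    return out


def rd_cost(latent, ref, cur, lam):
    """Toy rate-distortion cost: prediction MSE plus lam times a rate proxy."""
    pred = warp(ref, DECODER_GAIN * latent)
    dist = sum((p - c) ** 2 for p, c in zip(pred, cur)) / len(cur)
    rate = latent ** 2  # assumption: larger latents cost more bits
    return dist + lam * rate


def refine_latent(latent, ref, cur, lam=0.01, lr=2.0, steps=100, eps=1e-3):
    """Encoder-side refinement: gradient descent on the latent, decoder untouched."""
    for _ in range(steps):
        # Central finite difference stands in for backpropagation in this toy.
        grad = (rd_cost(latent + eps, ref, cur, lam)
                - rd_cost(latent - eps, ref, cur, lam)) / (2.0 * eps)
        latent -= lr * grad
    return latent


def make_frames(n=64, true_flow=1.5):
    """Synthetic reference frame and a current frame displaced by true_flow."""
    ref = [math.sin(0.2 * x) for x in range(n)]
    return ref, warp(ref, true_flow)


if __name__ == "__main__":
    ref, cur = make_frames()
    initial = 0.4  # stands in for the latent produced by a pre-trained encoder
    refined = refine_latent(initial, ref, cur)
    print("cost before:", rd_cost(initial, ref, cur, 0.01))
    print("cost after: ", rd_cost(refined, ref, cur, 0.01))
```

Because only the latent changes, a decoder in this toy setting would run exactly the same computation before and after refinement, which mirrors the abstract's point that the decoder-side complexity does not increase.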

updated: Tue Jul 11 2023 07:52:06 GMT+0000 (UTC)

published: Tue Jul 11 2023 07:52:06 GMT+0000 (UTC)

arXiv
