A Residual Diffusion Model for High Perceptual Quality Codec Augmentation

Noor Fathima Ghouse; Jens Petersen; Auke Wiggers; Tianlin Xu; Guillaume Sautière

Diffusion probabilistic models have recently achieved remarkable success in generating high-quality image and video data. In this work, we build on this class of generative models and introduce a method for lossy compression of high-resolution images. The resulting codec, which we call the DIffusion-based Residual Augmentation Codec (DIRAC), is the first neural codec to allow smooth traversal of the rate-distortion-perception tradeoff at test time, while achieving perceptual quality competitive with GAN-based methods. Furthermore, while sampling from diffusion probabilistic models is notoriously expensive, we show that in the compression setting the number of steps can be drastically reduced.

updated: Wed Mar 29 2023 16:13:22 GMT+0000 (UTC)

published: Fri Jan 13 2023 11:27:26 GMT+0000 (UTC)

arXiv
