Lossy Image Compression with Quantized Hierarchical VAEs

Zhihao Duan; Ming Lu; Zhan Ma; Fengqing Zhu

Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and rate-distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative modeling. Starting with ResNet VAEs, which were originally designed for data (image) distribution modeling, we redesign their latent variable model using a quantization-aware posterior and prior, enabling easy quantization and entropy coding at test time. Along with an improved neural network architecture, we present a powerful and efficient model that outperforms previous methods on natural image lossy compression. Our model compresses images in a coarse-to-fine fashion and supports parallel encoding and decoding, leading to fast execution on GPUs. Code is available at https://github.com/duanzhiihao/lossy-vae.
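The abstract does not spell out what the quantization-aware posterior and prior look like. As a rough illustration only, the sketch below shows one common construction from the learned-compression literature, which may differ from the authors' design: additive uniform noise stands in for rounding during training, and the prior is a Gaussian integrated over unit-width bins so that it assigns a probability mass (and hence a bit cost) to each quantized value. The function names `bin_probability`, `rate_bits`, and `encode_latent`, and the scalar (single-latent) setting, are hypothetical and are not taken from the lossy-vae repository.

```python
import math
import random


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def bin_probability(z, mean, scale):
    """Mass a Gaussian prior assigns to the unit-width bin centred on z.

    Assumption: this equals the density at z of a Gaussian convolved with
    Uniform(-0.5, 0.5), so the same formula can serve training and testing.
    """
    p = gaussian_cdf(z + 0.5, mean, scale) - gaussian_cdf(z - 0.5, mean, scale)
    return max(p, 1e-9)  # floor to avoid log(0) far out in the tails


def rate_bits(z, mean, scale):
    """Ideal code length in bits for z under the binned prior."""
    return -math.log2(bin_probability(z, mean, scale))


def encode_latent(mu, prior_mean, training, rng=random):
    """Quantize the posterior mean mu relative to the prior mean.

    Training: add uniform noise as a differentiable stand-in for rounding.
    Testing: round the residual to the nearest integer (Python's round()
    resolves exact .5 ties to the even integer).
    """
    residual = mu - prior_mean
    if training:
        return prior_mean + residual + rng.uniform(-0.5, 0.5)
    return prior_mean + round(residual)


if __name__ == "__main__":
    z = encode_latent(mu=2.3, prior_mean=0.0, training=False)
    print(z, rate_bits(z, mean=0.0, scale=1.0))
```

In a scheme of this kind, the bit cost returned by `rate_bits` is what an entropy coder would approach at test time, which is why matching the training-time noise to the test-time rounding makes quantization and entropy coding straightforward.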

updated: Sat Mar 25 2023 15:52:29 GMT+0000 (UTC)

published: Sat Aug 27 2022 17:15:38 GMT+0000 (UTC)

arXiv
