Joint Global and Local Hierarchical Priors for Learned Image Compression

Jun-Hyuk Kim; Byeongho Heo; Jong-Seok Lee

Recently, learned image compression methods have outperformed traditional hand-crafted codecs, including BPG. One key to this success is the learned entropy model, which estimates the probability distribution of the quantized latent representation. As in other vision tasks, most recent learned entropy models are based on convolutional neural networks (CNNs). However, the local connectivity of CNNs limits their ability to model long-range dependencies, which can be a significant bottleneck in image compression, where reducing spatial redundancy is essential. To overcome this issue, we propose a novel entropy model called Information Transformer (Informer) that exploits both global and local information in a content-dependent manner using an attention mechanism. Our experiments show that Informer improves rate-distortion performance over state-of-the-art methods on the Kodak and Tecnick datasets without the quadratic computational complexity problem. Our source code is available at https://github.com/naver-ai/informer.
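
To make the role of the entropy model concrete, the following is a minimal sketch of how a predicted distribution over a quantized latent translates into an estimated bit cost. It is not taken from the Informer code: the Gaussian form with a unit-width quantization bin is an assumption borrowed from common practice in learned compression, and the names `gaussian_cdf`, `symbol_bits`, and `total_bits` are hypothetical.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def symbol_bits(y_hat, mean, scale, eps=1e-9):
    """Estimated bits to code one integer-quantized latent value.

    Assumption: the probability mass of y_hat is the Gaussian density
    integrated over the unit-width bin [y_hat - 0.5, y_hat + 0.5].
    """
    p = gaussian_cdf(y_hat + 0.5, mean, scale) - gaussian_cdf(y_hat - 0.5, mean, scale)
    return -math.log2(max(p, eps))  # clamp to avoid log(0)


def total_bits(latents, means, scales):
    """Sum of per-symbol bit estimates under the predicted parameters."""
    return sum(symbol_bits(y, m, s) for y, m, s in zip(latents, means, scales))


latents = [0, 1, -2, 3]
# A prior whose means match the latents and whose scales are small.
sharp = total_bits(latents, means=[0.0, 1.0, -2.0, 3.0], scales=[0.3] * 4)
# A generic zero-mean prior with a wide scale.
vague = total_bits(latents, means=[0.0] * 4, scales=[2.0] * 4)
print(f"accurate prior: {sharp:.2f} bits, generic prior: {vague:.2f} bits")
```

In this toy setting the prior that predicts each latent more accurately yields a lower estimated rate, which is why better context modeling (local or global) in the entropy model can reduce the bitrate at the same distortion.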

updated: Thu Jul 21 2022 03:34:14 GMT+0000 (UTC)

published: Wed Dec 08 2021 06:17:37 GMT+0000 (UTC)

arXiv
