RD-Optimized Trit-Plane Coding of Deep Compressed Image Latent Tensors

Seungmin Jeon; Jae-Han Lee; Chang-Su Kim

DPICT is the first learning-based image codec supporting fine granular scalability. In this paper, we describe how to implement two key components of DPICT efficiently: trit-plane slicing and rate-distortion-optimized (RD-optimized) coding. In DPICT, we transform an image into a latent tensor, represent the tensor in ternary digits (trits), and encode the trits in decreasing order of significance. Entropy coding requires the probability of each trit, and computing these probabilities incurs high time complexity in both the encoder and the decoder. To reduce this complexity, we develop a parallel computing scheme for the probabilities, which is described in detail with pseudocode. Moreover, we compare the trit-plane slicing in DPICT with the alternative bit-plane slicing. Experimental results show that parallel computing reduces the time complexity significantly and that trit-plane slicing provides better RD performance than bit-plane slicing.
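
To make the idea of slicing a tensor into trit-planes concrete, the following is a minimal sketch, not the DPICT implementation. It assumes the latent values have already been mapped to non-negative integers below base**num_planes; the function names `slice_planes` and `reconstruct`, the midpoint fill for digits not yet received, and the `base` parameter are all illustrative assumptions. Setting base=2 gives the bit-plane alternative mentioned in the abstract.

```python
import numpy as np

def slice_planes(values, num_planes, base=3):
    """Split non-negative integers into digit planes, most significant first.

    Hypothetical helper: base=3 yields trit-planes, base=2 yields bit-planes.
    """
    values = np.asarray(values, dtype=np.int64)
    if values.min() < 0 or values.max() >= base ** num_planes:
        raise ValueError("values must lie in [0, base**num_planes)")
    planes = []
    for k in range(num_planes - 1, -1, -1):
        # Digit of weight base**k for every element.
        planes.append((values // base**k) % base)
    return np.stack(planes)

def reconstruct(planes, num_planes, base=3):
    """Rebuild values from the first few planes only.

    Digits that have not been received are replaced by the midpoint of the
    remaining interval (an assumption made for this sketch).
    """
    planes = np.asarray(planes, dtype=np.int64)
    received = planes.shape[0]
    total = np.zeros(planes.shape[1:], dtype=np.float64)
    for i in range(received):
        k = num_planes - 1 - i
        total += planes[i] * base**k
    remaining = base ** (num_planes - received)
    return total + (remaining - 1) / 2.0

values = np.array([0, 5, 13, 26])
planes = slice_planes(values, num_planes=3)
coarse = reconstruct(planes[:1], num_planes=3)   # most significant plane only
exact = reconstruct(planes, num_planes=3)        # all three planes
```

Decoding more planes narrows the interval that each value can lie in, which is the behaviour that makes progressive (scalable) decoding possible. The sketch ignores entropy coding and the RD-optimized ordering of trits described in the paper.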

updated: Sun May 08 2022 07:53:19 GMT+0000 (UTC)

published: Fri Mar 25 2022 06:33:16 GMT+0000 (UTC)

arXiv
