Conceptual Compression via Deep Structure and Texture Synthesis

Jianhui Chang; Zhenghui Zhao; Chuanmin Jia; Shiqi Wang; Lingbo Yang; Qi Mao; Jian Zhang; Siwei Ma

Existing compression methods typically focus on removing signal-level redundancies, while the potential and versatility of decomposing visual data into compact conceptual components remain understudied. To this end, we propose a novel conceptual compression framework that encodes visual data into compact structure and texture representations and then decodes them by deep synthesis, aiming to achieve better visual reconstruction quality, flexible content manipulation, and potential support for various vision tasks. In particular, we propose to compress images with a dual-layered model consisting of two complementary visual features: 1) a structure layer represented by structural maps and 2) a texture layer characterized by low-dimensional deep representations. At the encoder side, the structural maps and texture representations are individually extracted and compressed, generating compact, interpretable, and interoperable bitstreams. During the decoding stage, a hierarchical fusion GAN (HF-GAN) is proposed to learn the synthesis paradigm in which the textures are rendered onto the decoded structural maps, leading to high-quality reconstruction with remarkable visual realism. Extensive experiments on diverse images demonstrate the superiority of our framework, with lower bitrates, higher reconstruction quality, and increased versatility for visual analysis and content manipulation tasks.
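
The dual-layer idea can be illustrated with a small toy sketch. It is not the authors' method: a thresholded gradient map stands in for the structural map, block means stand in for the learned low-dimensional texture representation, zlib stands in for the actual entropy coder, and the HF-GAN decoder is omitted entirely. All names (`DualLayerBitstream`, `extract_structure`, `extract_texture`, `encode`, `decode_layers`) are hypothetical. The sketch only shows that the two layers are extracted and compressed separately and can be decoded independently of each other.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DualLayerBitstream:
    """Two independently compressed layers (hypothetical container)."""
    structure: bytes  # compressed binary edge map (toy structure layer)
    texture: bytes    # compressed block means (toy texture layer)
    width: int
    height: int


def extract_structure(img, threshold=32):
    """Toy structure map: 1 where the local gradient exceeds `threshold`."""
    h, w = len(img), len(img[0])
    edges = []
    for y in range(h):
        for x in range(w):
            gx = abs(img[y][x] - img[y][x - 1]) if x > 0 else 0
            gy = abs(img[y][x] - img[y - 1][x]) if y > 0 else 0
            edges.append(1 if max(gx, gy) > threshold else 0)
    return bytes(edges)


def extract_texture(img, block=4):
    """Toy texture code: one mean intensity per block (assumption, not a deep feature)."""
    h, w = len(img), len(img[0])
    code = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            vals = [img[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            code.append(sum(vals) // len(vals))
    return bytes(code)


def encode(img):
    """Extract and compress the two layers separately."""
    h, w = len(img), len(img[0])
    return DualLayerBitstream(
        structure=zlib.compress(extract_structure(img)),
        texture=zlib.compress(extract_texture(img)),
        width=w,
        height=h,
    )


def decode_layers(bs):
    """Decompress each layer on its own; synthesis (the GAN step) is not modeled."""
    return zlib.decompress(bs.structure), zlib.decompress(bs.texture)


# Usage: an 8x8 grayscale image, dark left half and bright right half.
image = [[0] * 4 + [200] * 4 for _ in range(8)]
stream = encode(image)
structure_map, texture_code = decode_layers(stream)
```

Because each layer is its own bitstream, an analysis task could read only the structure layer, and a manipulation task could swap the texture layer, without touching the other one.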

updated: Thu Mar 10 2022 10:53:06 GMT+0000 (UTC)

published: Tue Nov 10 2020 08:48:32 GMT+0000 (UTC)

arXiv
