COIN++: Neural Compression Across Modalities

Emilien Dupont; Hrushikesh Loya; Milad Alizadeh; Adam Goliński; Yee Whye Teh; Arnaud Doucet

Neural compression algorithms are typically based on autoencoders that require specialized encoder and decoder architectures for different data modalities. In this paper, we propose COIN++, a neural compression framework that seamlessly handles a wide range of data modalities. Our approach is based on converting data to implicit neural representations, i.e. neural functions that map coordinates (such as pixel locations) to features (such as RGB values). Then, instead of storing the weights of the implicit neural representation directly, we store modulations applied to a meta-learned base network as a compressed code for the data. We further quantize and entropy code these modulations, leading to large compression gains while reducing encoding time by two orders of magnitude compared to baselines. We empirically demonstrate the feasibility of our method by compressing various data modalities, from images and audio to medical and climate data.
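The pipeline described in the abstract can be sketched roughly as follows: a shared base network maps coordinates to features, a small per-datum set of modulations adjusts that network, and only the quantized modulations are stored. This is a minimal illustration under stated assumptions, not the authors' implementation. The sine activations, the additive-shift form of the modulations, the 5-bit uniform quantizer, and all function names are assumptions made for the example. The base network here uses random weights in place of meta-learned ones, the modulations are random rather than fitted to data, and the entropy coding step is omitted.

```python
import numpy as np

def init_base_network(rng, in_dim=2, hidden=32, out_dim=3, depth=3):
    """Random weights standing in for a meta-learned base network (assumption)."""
    dims = [in_dim] + [hidden] * depth + [out_dim]
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def modulated_forward(params, modulations, coords):
    """Map coordinates (e.g. pixel locations) to features (e.g. RGB values).

    Each hidden layer receives an additive shift taken from `modulations`;
    the base weights are shared across all data points.
    """
    h = coords
    for (w, b), shift in zip(params[:-1], modulations):
        h = np.sin(h @ w + b + shift)  # assumed: sine activation, shift modulation
    w, b = params[-1]
    return h @ w + b

def quantize(x, bits=5, lo=-1.0, hi=1.0):
    """Uniformly quantize values in [lo, hi] to integer codes (assumed scheme)."""
    levels = 2 ** bits - 1
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / (hi - lo) * levels).astype(np.int64)

def dequantize(q, bits=5, lo=-1.0, hi=1.0):
    """Map integer codes back to approximate real-valued modulations."""
    levels = 2 ** bits - 1
    return q / levels * (hi - lo) + lo

rng = np.random.default_rng(0)
params = init_base_network(rng)

# Per-datum modulations: one shift vector per hidden layer (random placeholders).
modulations = rng.uniform(-1.0, 1.0, size=(3, 32))

# An 8x8 grid of 2D coordinates in [-1, 1].
ys, xs = np.meshgrid(np.linspace(-1, 1, 8), np.linspace(-1, 1, 8), indexing="ij")
coords = np.stack([ys.ravel(), xs.ravel()], axis=-1)

codes = quantize(modulations)  # integers that would then be entropy coded
recon = modulated_forward(params, dequantize(codes), coords)
```

In this sketch the stored code is `codes`, 96 small integers, rather than the full weight set of the network, which is the idea the abstract describes when it contrasts storing modulations with storing weights directly.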

updated: Thu Dec 08 2022 11:07:51 GMT+0000 (UTC)

published: Sun Jan 30 2022 20:12:04 GMT+0000 (UTC)

arXiv
