Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization

Jingtang Liang; Xiaodong Cun; Chi-Man Pun

Image harmonization aims to adjust the color of a composited foreground region so that it matches a given background. Previous works model this task as pixel-wise image-to-image translation using UNet-family architectures. However, the model size and computational cost limit the performance of these models on edge devices and higher-resolution images. To this end, we propose, for the first time, a novel spatial-separated curve rendering network (S^2CRNet) for efficient and high-resolution image harmonization. In S^2CRNet, we first extract spatial-separated embeddings from thumbnails of the masked foreground and background individually. Then, we design a curve rendering module (CRM), which learns and combines the spatial-specific knowledge using linear layers to generate the parameters of a pixel-wise curve mapping in the foreground region. Finally, we directly render the original high-resolution image using the learned color curve. In addition, we make two extensions of the proposed framework, Cascaded-CRM and Semantic-CRM, for cascaded refinement and semantic guidance, respectively. Experiments show that the proposed method uses over 90% fewer parameters than previous methods while still achieving state-of-the-art performance on both the synthesized iHarmony4 and the real-world DIH test sets. Moreover, our method runs smoothly on higher-resolution images in real time, more than 10× faster than existing methods. The code and pre-trained models will be released at https://github.com/stefanLeong/S2CRNet.
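The final rendering step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the curve form, so a piecewise-linear curve per color channel is assumed here, and the function name `render_with_curves` and the `curves` array are hypothetical. In S^2CRNet the curve parameters would be predicted by the CRM from low-resolution thumbnails; here they are simply passed in by hand.

```python
import numpy as np

def render_with_curves(image, mask, curves):
    """Apply one piecewise-linear color curve per channel inside the mask.

    image:  float array, shape (H, W, 3), values in [0, 1]
    mask:   float array, shape (H, W), 1 = foreground, 0 = background
    curves: float array, shape (3, K), curve outputs at K evenly spaced
            input levels in [0, 1] (assumed parameterization)
    """
    knots = np.linspace(0.0, 1.0, curves.shape[1])
    mapped = np.empty_like(image)
    for c in range(3):
        # Look up every pixel value of channel c on that channel's curve.
        mapped[..., c] = np.interp(image[..., c], knots, curves[c])
    m = mask[..., None]
    # Only foreground pixels are remapped; the background is left as-is.
    return m * mapped + (1.0 - m) * image

# Usage: brighten a rectangular foreground with a square-root-shaped curve.
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))
mask = np.zeros((4, 6))
mask[1:3, 2:5] = 1.0
identity = np.tile(np.linspace(0.0, 1.0, 8), (3, 1))
brighten = np.sqrt(identity)
harmonized = render_with_curves(image, mask, brighten)
```

Because the curve is a function of pixel value rather than pixel position, the same few parameters apply to an image of any size, which is consistent with the abstract's claim that parameters estimated from thumbnails can render the original high-resolution image directly.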

updated: Tue Sep 14 2021 08:02:51 GMT+0000 (UTC)

published: Mon Sep 13 2021 07:20:16 GMT+0000 (UTC)

arXiv
