LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization

Sheng Liu; Cong Phuoc Huynh; Cong Chen; Maxim Arap; Raffay Hamid

We present a simple yet effective self-supervised pre-training method for image harmonization that can leverage large-scale unannotated image datasets. First, we generate pre-training data online with our Label-Efficient Masked Region Transform (LEMaRT) pipeline. Given an image, LEMaRT generates a foreground mask and then applies a set of transformations that perturb various visual attributes (e.g., defocus blur, contrast, and saturation) of the region specified by the mask. We then pre-train image harmonization models to recover the original image from the perturbed image. Second, we introduce an image harmonization model, SwinIH, built by retrofitting the Swin Transformer [27] with a combination of local and global self-attention mechanisms. Pre-training SwinIH with LEMaRT sets a new state of the art for image harmonization while being label-efficient, i.e., it consumes less annotated data for fine-tuning than existing methods. Notably, on the iHarmony4 dataset [8], SwinIH outperforms the previous state of the art, SCS-Co [16], by 0.4 dB when fine-tuned on only 50% of the training data, and by 1.0 dB when trained on the full training dataset.
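The data-generation step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the rectangular mask, the box blur standing in for defocus blur, the parameter ranges, and all function names (`random_rect_mask`, `box_blur`, `make_pretraining_pair`, and so on) are assumptions chosen only to show the idea of perturbing a masked region and keeping the untouched image as the reconstruction target.

```python
import numpy as np


def random_rect_mask(height, width, rng, min_frac=0.2, max_frac=0.6):
    """Binary (H, W) mask with one random rectangle set to 1 (assumed mask shape)."""
    mh = max(int(height * rng.uniform(min_frac, max_frac)), 1)
    mw = max(int(width * rng.uniform(min_frac, max_frac)), 1)
    top = int(rng.integers(0, height - mh + 1))
    left = int(rng.integers(0, width - mw + 1))
    mask = np.zeros((height, width), dtype=np.float32)
    mask[top:top + mh, left:left + mw] = 1.0
    return mask


def adjust_contrast(img, factor):
    """Scale deviations from the per-channel mean by `factor`."""
    mean = img.mean(axis=(0, 1), keepdims=True)
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)


def adjust_saturation(img, factor):
    """Scale deviations from the per-pixel gray value by `factor`."""
    gray = img.mean(axis=2, keepdims=True)
    return np.clip(gray + factor * (img - gray), 0.0, 1.0)


def box_blur(img, radius):
    """Mean filter with window 2*radius+1; a crude stand-in for defocus blur."""
    if radius == 0:
        return img
    k = 2 * radius + 1
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def make_pretraining_pair(image, rng):
    """Return (perturbed composite, mask, original target) for one image.

    `image` is a float32 array of shape (H, W, 3) with values in [0, 1].
    The parameter ranges below are illustrative assumptions.
    """
    h, w, _ = image.shape
    mask = random_rect_mask(h, w, rng)
    perturbed = box_blur(image, radius=int(rng.integers(0, 3)))
    perturbed = adjust_contrast(perturbed, rng.uniform(0.6, 1.4))
    perturbed = adjust_saturation(perturbed, rng.uniform(0.6, 1.4))
    m = mask[..., None]
    # Only the masked region is perturbed; the rest of the image is untouched.
    composite = m * perturbed + (1.0 - m) * image
    return composite, mask, image
```

In a pre-training loop, a harmonization model would take `composite` (and typically `mask`) as input and be trained to reproduce `image`. Because each pair is generated from an unannotated image on the fly, no human labels are needed at this stage.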

updated: Tue Apr 25 2023 21:51:22 GMT+0000 (UTC)

published: Tue Apr 25 2023 21:51:22 GMT+0000 (UTC)

arXiv
