Domain Agnostic Image-to-image Translation using Low-Resolution Conditioning

Mohamed Abid; Arman Afrasiyabi; Ihsen Hedhli; Jean-François Lalonde; Christian Gagné

Generally, image-to-image translation (i2i) methods aim to learn mappings across domains under the assumption that the images used for translation share content (e.g., pose) but have their own domain-specific information (a.k.a. style). Conditioned on a target image, such methods extract the target style and combine it with the source image content, keeping coherence between the domains. In our proposal, we depart from this traditional view and instead consider the scenario where the target domain is represented by a very low-resolution (LR) image, proposing a domain-agnostic i2i method for fine-grained problems, where the domains are related. More specifically, our domain-agnostic approach aims at generating an image that combines visual features from the source image with low-frequency information (e.g., pose, color) of the LR target image. To do so, we present a novel approach that relies on training the generative model to produce images that both share distinctive information of the associated source image and correctly match the LR target image when downscaled. We validate our method on the CelebA-HQ and AFHQ datasets by demonstrating improvements in terms of visual quality. Qualitative and quantitative results show that, when dealing with intra-domain image translation, our method generates realistic samples when compared with state-of-the-art methods such as StarGAN v2. Ablation studies also reveal that our method is robust to changes in color, can be applied to out-of-distribution images, and allows for manual control over the final results.
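The requirement that a generated image "correctly match the LR target image when downscaled" can be illustrated with a small consistency check. The sketch below is not the authors' implementation: the abstract does not specify the downscaling operator or the distance used, so block averaging and mean absolute error are assumptions, and the names `downscale` and `lr_consistency_error` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downscale(image: Image, factor: int) -> Image:
    """Average non-overlapping factor x factor blocks (assumed operator)."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def lr_consistency_error(generated: Image, lr_target: Image, factor: int) -> float:
    """Mean absolute difference between the downscaled output and the LR target.

    Assumed metric; a value of 0.0 means the generated image reproduces
    the LR target exactly once it has been downscaled.
    """
    lr_generated = downscale(generated, factor)
    if (len(lr_generated) != len(lr_target)
            or len(lr_generated[0]) != len(lr_target[0])):
        raise ValueError("downscaled image and LR target differ in size")
    diffs = [
        abs(g - t)
        for row_g, row_t in zip(lr_generated, lr_target)
        for g, t in zip(row_g, row_t)
    ]
    return sum(diffs) / len(diffs)
```

In a training setup, a quantity of this kind would be one term that pushes the generator toward the LR target, while a separate term (not sketched here) would keep the distinctive features of the source image.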

updated: Thu May 11 2023 03:15:45 GMT+0000 (UTC)

published: Mon May 08 2023 19:58:49 GMT+0000 (UTC)

arXiv
