Image-Based CLIP-Guided Essence Transfer

Hila Chefer; Sagie Benaim; Roni Paiss; Lior Wolf

The conceptual blending of two signals is a semantic task that may underlie both creativity and intelligence. We propose to perform such blending in a way that incorporates two latent spaces: that of the generator network and that of the semantic network. For the first network, we employ the powerful StyleGAN generator, and for the second, the powerful image-language matching network of CLIP. The new method creates a blending operator that is optimized to be simultaneously additive in both latent spaces. Our results demonstrate that this leads to blending that is much more natural than what can be obtained in each space separately.
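
As a rough illustration of what "simultaneously additive in both latent spaces" could mean, the following toy sketch fits a single shift vector that is added to generator latents and whose effect in a semantic embedding space is encouraged to be the same displacement for every source. Everything here is an assumption made for illustration: the abstract does not specify the losses, so `blend_loss`, the small random maps standing in for StyleGAN (`generate`) and CLIP (`embed`), the dimensions, and the finite-difference optimizer are all hypothetical and not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM, SEMANTIC_DIM = 8, 16, 4

# Hypothetical stand-ins (NOT StyleGAN or CLIP): fixed random maps.
A = rng.normal(size=(IMAGE_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)
B = rng.normal(size=(SEMANTIC_DIM, IMAGE_DIM)) / np.sqrt(IMAGE_DIM)

def generate(w):
    """Toy nonlinear 'generator': latent code(s) -> image vector(s)."""
    return np.tanh(w @ A.T)

def embed(x):
    """Toy linear 'semantic encoder': image vector(s) -> embedding(s)."""
    return x @ B.T

def blend_loss(b, sources, target_w):
    """Assumed objective: one shift b, added in generator space, should move
    every source by the same semantic displacement and toward the target."""
    blended = embed(generate(sources + b))
    delta = blended - embed(generate(sources))
    # Additivity in semantic space: displacements should agree across sources.
    consistency = np.mean(np.sum((delta - delta.mean(axis=0)) ** 2, axis=1))
    # Pull blended embeddings toward the target's embedding.
    similarity = np.mean(np.sum((blended - embed(generate(target_w))) ** 2, axis=1))
    return consistency + similarity

def numeric_grad(f, b, eps=1e-5):
    """Central finite-difference gradient; adequate for this tiny toy problem."""
    grad = np.zeros_like(b)
    for i in range(b.size):
        step = np.zeros_like(b)
        step[i] = eps
        grad[i] = (f(b + step) - f(b - step)) / (2 * eps)
    return grad

def fit_shift(sources, target_w, steps=200, lr=0.5):
    """Gradient descent that only accepts steps which lower the loss."""
    b = np.zeros(LATENT_DIM)
    f = lambda v: blend_loss(v, sources, target_w)
    loss = f(b)
    for _ in range(steps):
        candidate = b - lr * numeric_grad(f, b)
        candidate_loss = f(candidate)
        if candidate_loss < loss:
            b, loss = candidate, candidate_loss
        else:
            lr *= 0.5  # step was too large; shrink it and retry
    return b, loss

sources = rng.normal(size=(5, LATENT_DIM))
target_w = rng.normal(size=LATENT_DIM)
initial_loss = blend_loss(np.zeros(LATENT_DIM), sources, target_w)
shift, final_loss = fit_shift(sources, target_w)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In this sketch the consistency term is what ties the two spaces together: the shift is additive in the generator latent by construction, and the optimization pushes its semantic effect toward a single shared displacement. With a real generator and encoder one would presumably use automatic differentiation rather than finite differences.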

updated: Tue Oct 26 2021 06:31:25 GMT+0000 (UTC)

published: Sun Oct 24 2021 12:46:53 GMT+0000 (UTC)

arXiv
