Tensor-based Emotion Editing in the StyleGAN Latent Space

René Haas; Stella Graßhof; Sami S. Brandt

In this paper, we use a tensor model based on the Higher-Order Singular Value Decomposition (HOSVD) to discover semantic directions in the latent space of Generative Adversarial Networks. We first embed a structured facial expression database into the latent space using the e4e encoder. Specifically, we discover latent space directions corresponding to the six prototypical emotions (anger, disgust, fear, happiness, sadness, and surprise), as well as a direction for yaw rotation. These directions are then used to change the expression or yaw rotation of real face images. We compare the directions we find with similar directions found by two other methods. The results show that the visual quality of the resulting edits is on par with the state of the art. We also conclude that the tensor-based model is well suited for emotion and yaw editing, i.e., the emotion or yaw rotation of a novel face image can be changed robustly without significantly affecting identity or other attributes in the image.
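To make the HOSVD mentioned in the abstract concrete, the following is a minimal NumPy sketch of the decomposition itself, not the authors' implementation. The tensor layout (identities x emotions x latent dimensions), the sizes, and the random data standing in for e4e latent codes are all assumptions made for illustration; the abstract does not specify how the database is arranged or how the editing directions are derived from the factors.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (n-mode product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Invert the decomposition: multiply the core by every factor matrix."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Synthetic stand-in for embedded latent codes (hypothetical sizes):
# 10 identities, 7 expressions (neutral + 6 emotions), 16 latent dimensions.
rng = np.random.default_rng(0)
latents = rng.standard_normal((10, 7, 16))

core, factors = hosvd(latents)
restored = reconstruct(core, factors)
```

In this layout, `factors[1]` would hold one row of coefficients per expression and `factors[2]` an orthonormal basis of the latent dimensions. How such factors are turned into editing directions is left to the paper itself.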

updated: Thu May 12 2022 14:10:45 GMT+0000 (UTC)

published: Thu May 12 2022 14:10:45 GMT+0000 (UTC)

arXiv
