Measuring the Biases and Effectiveness of Content-Style Disentanglement

Xiao Liu; Spyridon Thermos; Gabriele Valvano; Agisilaos Chartsias; Alison O'Neil; Sotirios A. Tsaftaris

A recent spate of state-of-the-art semi-supervised and unsupervised solutions disentangle and encode image "content" into a spatial tensor and image appearance or "style" into a vector, to achieve good performance in spatially equivariant tasks (e.g. image-to-image translation). To achieve this, these methods rely on different biases in model design, learning objectives, and data. While considerable effort has been made to measure disentanglement in vector representations and to assess its impact on task performance, such analysis for (spatial) content-style disentanglement is lacking. In this paper, we conduct an empirical study to investigate the role of different biases in content-style disentanglement settings and unveil the relationship between the degree of disentanglement and task performance. In particular, we: (i) identify key design choices and learning constraints for three popular content-style disentanglement models; (ii) relax or remove such constraints in an ablation fashion; and (iii) use two metrics to measure the degree of disentanglement and assess its effect on the performance of each task. Our experiments reveal that there is a "sweet spot" between disentanglement, task performance and, surprisingly, content interpretability, suggesting that blindly forcing higher disentanglement can hurt both model performance and the semantic meaningfulness of the content factors. Our findings, as well as the task-independent metrics we use, can guide the design and selection of new models for tasks where content-style representations are useful.

updated: Wed Sep 15 2021 19:48:26 GMT+0000 (UTC)

published: Thu Aug 27 2020 21:41:37 GMT+0000 (UTC)

arXiv
