Latent Traversals in Generative Models as Potential Flows

Yue Song; Andy Keller; Nicu Sebe; Max Welling

Despite significant recent progress in deep generative models, the underlying structure of their latent spaces remains poorly understood, which leaves semantically meaningful latent traversal an open research challenge. Most prior work has addressed this challenge by modeling latent structure linearly and finding corresponding linear directions that result in 'disentangled' generations. In this work, we instead propose to model latent structure with a learned dynamic potential landscape, so that latent traversals are performed as the flow of samples down the landscape's gradient. Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations, which allows them to vary flexibly over both space and time. To achieve disentanglement, multiple potentials are learned simultaneously and are constrained by a classifier to be distinct and semantically self-consistent. Experimentally, we demonstrate that our method achieves trajectories that are more disentangled, both qualitatively and quantitatively, than those of state-of-the-art baselines. Further, we demonstrate that our method can be integrated as a regularization term during training, where it acts as an inductive bias towards learning structured representations and ultimately improves model likelihood on similarly structured data.
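
The idea of a traversal as a flow down a time-varying potential can be illustrated with a small sketch. This is not the authors' implementation: the quadratic potential, the moving minimum, the finite-difference gradient, and the names `potential`, `grad_potential`, and `traverse` are all hypothetical stand-ins chosen for illustration. In the paper the potentials are learned and constrained by a PDE and a classifier, none of which is modeled here.

```python
import math

def potential(z, t):
    """Toy time-varying potential (hypothetical): a quadratic bowl
    whose minimum moves along the unit circle as t increases."""
    cx, cy = math.cos(t), math.sin(t)
    return 0.5 * ((z[0] - cx) ** 2 + (z[1] - cy) ** 2)

def grad_potential(z, t, eps=1e-5):
    """Central finite-difference estimate of the spatial gradient of
    the potential at point z and time t."""
    grad = []
    for i in range(len(z)):
        hi, lo = list(z), list(z)
        hi[i] += eps
        lo[i] -= eps
        grad.append((potential(hi, t) - potential(lo, t)) / (2 * eps))
    return grad

def traverse(z0, steps=10, dt=0.1, step_size=0.5):
    """Move a latent point down the gradient of the potential.
    The landscape is re-evaluated at time k * dt on each step, so the
    direction of travel can change over both space and time."""
    z = list(z0)
    path = [tuple(z)]
    for k in range(steps):
        g = grad_potential(z, k * dt)
        z = [zi - step_size * gi for zi, gi in zip(z, g)]
        path.append(tuple(z))
    return path

# Traverse from an arbitrary 2-D "latent code".
path = traverse((2.0, 2.0))
```

In a real generative model, each point on `path` would be decoded into an image, and a separate potential would be used for each semantic factor; here the path is only a list of 2-D coordinates.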

updated: Tue Apr 25 2023 15:53:45 GMT+0000 (UTC)

published: Tue Apr 25 2023 15:53:45 GMT+0000 (UTC)

arXiv
