A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models

Yang Wu; Pengxu Wei; Liang Lin

In this paper, we propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs). From the perspective of physical simulation, we redefine the problem of approximating the gradient flow using the optimal transport (i.e., Wasserstein) metric. In EBMs, the learning process of stepwise sampling and estimating the data distribution performs a functional gradient descent that minimizes the global relative entropy between the current distribution and the target real distribution, which can be viewed as dynamic particles moving from disorder toward the target manifold. Previous learning schemes mainly minimize the entropy with respect to the KL divergence between consecutive time steps in each learning step. However, they are prone to getting stuck in the local KL divergence because they project non-smooth information within a smooth manifold, which violates the optimal transport principle. To solve this problem, we derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation. Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities. We also derive this near-proximal scheme and provide its numerical computation equations. Our extensive experiments demonstrate the practical superiority and potential of the proposed scheme for fitting complex distributions and generating high-quality, high-dimensional data with neural EBMs.
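The "dynamic particles moving from disorder to target manifold" picture can be illustrated with a minimal sketch. This is not the scheme proposed in the paper: it is plain unadjusted Langevin dynamics, the standard particle discretization of the Fokker-Planck equation, whose density evolution is known to be the Wasserstein gradient flow of the relative entropy (KL divergence) to the target. The toy energy E(x) = x^2 / 2, the function names, and all parameter values below are assumptions chosen for illustration only.

```python
import numpy as np


def energy_grad(x):
    """Gradient of the assumed toy energy E(x) = x**2 / 2 (target density is N(0, 1))."""
    return x


def langevin_particles(n_particles=5000, n_steps=500, step=0.01, seed=0):
    """Move particles with unadjusted Langevin dynamics toward exp(-E)."""
    rng = np.random.default_rng(seed)
    # Start from a "disordered" state: particles spread uniformly, far from the target.
    x = rng.uniform(-6.0, 6.0, size=n_particles)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_particles)
        # Euler-Maruyama step of dX = -grad E(X) dt + sqrt(2) dW.
        x = x - step * energy_grad(x) + np.sqrt(2.0 * step) * noise
    return x


samples = langevin_particles()
print("mean:", samples.mean(), "variance:", samples.var())
```

With a small step size the particle cloud should settle near the target's mean of 0 and variance of 1, up to a discretization bias and sampling noise. A neural EBM would replace `energy_grad` with the gradient of a learned energy network.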

updated: Fri Apr 28 2023 16:03:06 GMT+0000 (UTC)

published: Thu Oct 31 2019 02:26:20 GMT+0000 (UTC)

arXiv
