Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective

Yang Wu; Xu Cai; Pengxu Wei; Guanbin Li; Liang Lin

Compared with Generative Adversarial Networks (GANs), Energy-Based generative Models (EBMs) have two appealing properties: i) they can be optimized directly, without an auxiliary network during learning and synthesis; ii) they can better approximate the underlying distribution of the observed data by explicitly learning potential functions. This paper studies a branch of EBMs, energy-based Generative ConvNets (GCNs), which minimize an energy function defined by a bottom-up ConvNet. From the perspective of particle physics, we solve the problem of unstable energy dissipation, which can damage the quality of synthesized samples during maximum likelihood learning. Specifically, we first establish a connection between the classical FRAME model [1] and a dynamic physical process, and generalize the GCN as a discrete flow with a certain metric measure from the particle perspective. To address the KL-vanishing issue, we then reformulate the GCN from a KL discrete flow with the KL divergence measure to a Jordan-Kinderlehrer-Otto (JKO) discrete flow with the Wasserstein distance metric, and derive a Wasserstein GCN (wGCN). Based on these theoretical studies of the GCN, we finally derive a Generalized GCN (GGCN) to further improve model generalization and learning capability. GGCN introduces a hidden space mapping strategy that employs a normal distribution as the reference distribution to address the learning bias issue. Because GCNs rely on MCMC sampling, they still become seriously time-consuming as the number of sampling steps increases; we therefore propose a trainable non-linear upsampling function and amortized learning to improve learning efficiency. The proposed GGCN is trained in a symmetrical learning manner. Our method surpasses existing models in both model stability and the quality of generated samples on several widely used face and natural image datasets.
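To make the cost of MCMC sampling mentioned above concrete, the following is a minimal sketch of Langevin dynamics on a toy energy function. It is not the paper's implementation: the quadratic energy `E(x) = 0.5 * (x - 2)^2` is an assumed stand-in for a ConvNet-defined energy, and the names `grad_energy` and `langevin_sample` and all parameter values are hypothetical. Each step requires one gradient evaluation of the energy, so sampling time grows linearly with the number of steps.

```python
import numpy as np

def grad_energy(x):
    # Assumption: toy energy E(x) = 0.5 * (x - 2)^2, used here in place of
    # a ConvNet-defined energy. Its gradient is simply x - 2.
    return x - 2.0

def langevin_sample(x0, n_steps, step_size, rng):
    # Unadjusted Langevin dynamics: move each particle down the energy
    # gradient and add Gaussian noise scaled by sqrt(step_size).
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size * grad_energy(x) + np.sqrt(step_size) * noise
    return x

rng = np.random.default_rng(0)
# 5000 independent particles, all started at zero.
particles = langevin_sample(np.zeros(5000), n_steps=500, step_size=0.05, rng=rng)
```

For this toy energy the particles should settle near a unit-variance Gaussian centered at 2, up to a small discretization bias from the finite step size.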

updated: Sun Aug 30 2020 05:55:46 GMT+0000 (UTC)

published: Thu Oct 31 2019 02:26:20 GMT+0000 (UTC)

arXiv
