The Role of ImageNet Classes in Fréchet Inception Distance

Tuomas Kynkäänniemi; Tero Karras; Miika Aittala; Timo Aila; Jaakko Lehtinen

Fréchet Inception Distance (FID) is the primary metric for ranking models in data-driven generative modeling. While remarkably successful, the metric is known to sometimes disagree with human judgment. We investigate a root cause of these discrepancies and visualize what FID "looks at" in generated images. We show that the feature space in which FID is (typically) computed is so close to the ImageNet classifications that aligning the histograms of Top-N classifications between sets of generated and real images can reduce FID substantially, without actually improving the quality of results. We thus conclude that FID is prone to intentional or accidental distortions. As a practical example of an accidental distortion, we discuss a case where an ImageNet pre-trained FastGAN achieves an FID comparable to StyleGAN2 while being worse in terms of human evaluation.
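
For readers unfamiliar with the metric, the following is a minimal sketch of how a Fréchet distance between two feature sets is commonly computed: a Gaussian is fitted to each set, and the distance is ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^{1/2}). It is not taken from the paper. The function name `frechet_distance` and the random stand-in features are assumptions for illustration; in practice the features would come from a pre-trained Inception-V3 network.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Hypothetical helper for illustration, not the paper's code.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (up to numerical noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_a - mu_b) ** 2)
    cov_term = np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt
    return float(mean_term + cov_term)

# Stand-in features (assumption): random 8-dimensional vectors instead of
# real Inception-V3 features.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = real + 0.5  # same covariance, mean shifted by 0.5 in every dimension
print(frechet_distance(real, real))
print(frechet_distance(real, fake))
```

Because the distance depends only on the mean and covariance of the features, any change that moves those statistics closer to the real data lowers the score, whether or not the images look better. This is the kind of sensitivity the abstract describes.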

updated: Wed Sep 07 2022 07:29:27 GMT+0000 (UTC)

published: Fri Mar 11 2022 15:50:06 GMT+0000 (UTC)

arXiv
