The Intrinsic Dimension of Images and Its Impact on Learning

Phillip Pope; Chen Zhu; Ahmed Abdelkader; Micah Goldblum; Tom Goldstein

画像の内在次元とその学習への影響

自然画像データは、従来のピクセル表現の高次元性にもかかわらず、低次元構造を示すと広く信じられています。このアイデアは、コンピュータービジョンにおけるディープラーニングの目覚ましい成功に対する共通の直感の根底にあります。この作業では、人気のあるデータセットに次元推定ツールを適用し、深層学習における低次元構造の役割を調査します。一般的な自然画像データセットは、画像内のピクセル数が多いのに比べて、実際に内在次元が非常に低いことがわかります。さらに、低次元のデータセットはニューラルネットワークが学習しやすく、これらのタスクを解決するモデルはトレーニングからテストデータまで一般化が進んでいることがわかりました。その過程で、GANによって生成された合成データで次元推定ツールを検証する手法を開発し、画像生成プロセスを制御することで内在次元を積極的に操作できるようにします。実験のコードはhttps://github.com/ppope/dimensionsにあります。

It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations. This idea underlies a common intuition for the remarkable success of deep learning in computer vision. In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in deep learning. We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images. Additionally, we find that low dimensional datasets are easier for neural networks to learn, and models solving these tasks generalize better from training to test data. Along the way, we develop a technique for validating our dimension estimation tools on synthetic data generated by GANs allowing us to actively manipulate the intrinsic dimension by controlling the image generation process. Code for our experiments may be found here https://github.com/ppope/dimensions.

updated: Sun Apr 18 2021 16:29:23 GMT+0000 (UTC)

published: Sun Apr 18 2021 16:29:23 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト