Relative stability toward diffeomorphisms in deep nets indicates performance

Leonardo Petrini; Alessandro Favero; Mario Geiger; Matthieu Wyart

Understanding why deep nets can classify data in large dimensions remains a challenge. It has been proposed that they do so by becoming stable to diffeomorphisms, yet existing empirical measurements suggest that this is often not the case. We revisit this question by defining a maximum-entropy distribution on diffeomorphisms, which allows us to study typical diffeomorphisms of a given norm. We confirm that stability toward diffeomorphisms does not strongly correlate with performance on four benchmark image data sets. By contrast, we find that the stability toward diffeomorphisms relative to that of generic transformations, R_f, correlates remarkably with the test error ϵ_t. R_f is of order unity at initialization but decreases by several decades during training for state-of-the-art architectures. For CIFAR10 and 15 known architectures, we find ϵ_t≈0.2R_f, suggesting that obtaining a small R_f is important to achieve good performance. We study how R_f depends on the size of the training set and compare it to a simple model of invariant learning.
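To make the notion of relative stability concrete, here is a minimal sketch of how a ratio like R_f might be estimated. It rests on assumptions not stated in the abstract: R_f is taken as the mean squared change of the network output under a deformation, divided by the mean squared change under random noise rescaled to the same norm as the deformation. The names `relative_stability` and `shift_one_pixel` are hypothetical, and the one-pixel cyclic shift is only a stand-in for a sample from the maximum-entropy distribution on diffeomorphisms.

```python
import numpy as np

def relative_stability(f, xs, deform, rng):
    """Estimate an R_f-like ratio (assumed definition, see lead-in).

    f      : maps an image array to an output array
    xs     : iterable of 2D image arrays
    deform : maps an image to its deformed version (stand-in for a diffeomorphism)
    rng    : numpy random Generator used to draw the generic perturbation
    """
    num, den = 0.0, 0.0
    for x in xs:
        x_def = deform(x)
        delta = x_def - x
        # Generic perturbation: Gaussian noise rescaled to the deformation's norm
        eta = rng.standard_normal(x.shape)
        eta *= np.linalg.norm(delta) / np.linalg.norm(eta)
        fx = f(x)
        num += np.sum((f(x_def) - fx) ** 2)    # sensitivity to the deformation
        den += np.sum((f(x + eta) - fx) ** 2)  # sensitivity to matched-norm noise
    return num / den

def shift_one_pixel(x):
    # Hypothetical stand-in deformation: cyclic shift by one pixel along the width
    return np.roll(x, 1, axis=1)

rng = np.random.default_rng(0)
images = rng.standard_normal((32, 8, 8))

# The identity map treats both perturbations alike, so the ratio is close to 1
r_identity = relative_stability(lambda x: x, images, shift_one_pixel, rng)

# A global average is invariant to the shift but not to noise, so the ratio is close to 0
r_invariant = relative_stability(lambda x: np.array([x.mean()]), images, shift_one_pixel, rng)
```

Under this assumed definition, a ratio near 1 means the function reacts to deformations as strongly as to generic noise of the same size, while a ratio well below 1 means it has become comparatively insensitive to deformations.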

updated: Thu May 06 2021 07:03:30 GMT+0000 (UTC)

published: Thu May 06 2021 07:03:30 GMT+0000 (UTC)

arXiv
