A Law of Data Separation in Deep Learning

Hangfeng He; Weijie J. Su

While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation for high-stakes decision-making. We address this issue by studying the fundamental question of how deep neural networks process data in the intermediate layers. Our finding is a simple, quantitative law that governs how deep neural networks separate data according to class membership across all layers in classification. The law shows that each layer improves data separation at a constant geometric rate, and its emergence is observed across a collection of network architectures and datasets during training. It offers practical guidelines for designing architectures, improving model robustness and out-of-sample performance, and interpreting predictions.
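To make "a constant geometric rate" concrete, here is a minimal sketch of how one might check such a law on intermediate-layer features. The abstract does not say how separation is measured, so the measure used here is an assumption: a within-class to between-class scatter ratio, Tr(SS_w SS_b^+), where smaller values mean better-separated classes. The function names `separation_fuzziness` and `fit_geometric_rate`, and the `rcond` cutoff, are illustrative and not taken from the source. Under a geometric law the per-layer value D_l would behave roughly like rho^l * D_0, so log D_l should be close to linear in the layer index l.

```python
import numpy as np


def separation_fuzziness(features, labels):
    """Assumed separation measure: Tr(SS_w @ pinv(SS_b)).

    features: array of shape (n_samples, n_dims) from one layer.
    labels:   array of shape (n_samples,) with class labels.
    Smaller values indicate better-separated classes.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)

    ss_w = np.zeros((d, d))  # within-class scatter
    ss_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        x_c = features[labels == c]
        mean_c = x_c.mean(axis=0)
        centered = x_c - mean_c
        ss_w += centered.T @ centered / n
        diff = (mean_c - global_mean)[:, None]
        ss_b += (len(x_c) / n) * (diff @ diff.T)

    # SS_b has rank at most (n_classes - 1), so use a pseudoinverse.
    # The explicit cutoff (an assumption) avoids inverting round-off noise.
    return float(np.trace(ss_w @ np.linalg.pinv(ss_b, rcond=1e-10)))


def fit_geometric_rate(fuzziness_per_layer):
    """Fit log D_l = log D_0 + l * log rho by least squares.

    Returns (rho, d0): the estimated per-layer decay ratio and the
    fitted value at layer 0.
    """
    values = np.asarray(fuzziness_per_layer, dtype=float)
    layers = np.arange(len(values))
    slope, intercept = np.polyfit(layers, np.log(values), 1)
    return float(np.exp(slope)), float(np.exp(intercept))
```

In use, one would extract the features of a held-fixed batch at every layer of a trained classifier, call `separation_fuzziness` on each, and pass the resulting list to `fit_geometric_rate`. A fitted `rho` below 1 together with a near-linear log plot would be consistent with the law described above; this sketch does not reproduce the authors' experiments.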

updated: Fri Aug 11 2023 00:47:30 GMT+0000 (UTC)

published: Mon Oct 31 2022 02:25:38 GMT+0000 (UTC)

arXiv
