Expressive Power and Loss Surfaces of Deep Learning Models

Simant Dube

The goals of this paper are two-fold. The first is to serve as an expository tutorial on the workings of deep learning models, emphasizing geometrical intuition about the reasons for the success of deep learning. The second is to complement current results on the expressive power of deep learning models and their loss surfaces with novel insights and results. In particular, we describe how deep neural networks carve out manifolds, especially when multiplication neurons are introduced. Multiplication is used in dot products and the attention mechanism, and it is employed in capsule networks and self-attention based transformers. We also describe how the random polynomial, random matrix, spin glass, and computational complexity perspectives on loss surfaces are interconnected.

updated: Tue Aug 10 2021 01:34:42 GMT+0000 (UTC)

published: Sun Aug 08 2021 06:28:09 GMT+0000 (UTC)

arXiv
