Reframing Neural Networks: Deep Structure in Overcomplete Representations

Calvin Murdock; Simon Lucey

In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well understood what makes them so effective. To approach this question, we introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames. While exact inference requires iterative optimization, it may be approximated by the operations of a feed-forward deep neural network. We then indirectly analyze how model capacity relates to the frame structure induced by architectural hyperparameters such as depth, width, and skip connections. We quantify these structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. As a criterion for model selection, we show correlation with generalization error on a variety of common deep network architectures such as ResNets and DenseNets. We also demonstrate how recurrent networks implementing iterative optimization algorithms achieve performance comparable to their feed-forward approximations. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design with less reliance on ad-hoc engineering.
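The two ideas in the abstract, coherence of an overcomplete frame and feed-forward approximation of iterative inference, can be illustrated on a single layer. The sketch below is not the paper's deep frame potential or its exact inference problem, neither of which is specified in the abstract. It assumes the classical single-frame quantities (frame potential as the squared Frobenius norm of the Gram matrix of unit-norm columns, and mutual coherence as the largest off-diagonal Gram entry) and assumes a nonnegative sparse coding objective, for which one proximal gradient step from zero with unit step size has the form of a ReLU layer. All function names and the example frame are hypothetical and chosen for illustration.

```python
import math

def normalize_columns(B):
    """Scale each column of B (given as a list of rows) to unit Euclidean norm."""
    rows, cols = len(B), len(B[0])
    norms = [math.sqrt(sum(B[i][j] ** 2 for i in range(rows))) for j in range(cols)]
    return [[B[i][j] / norms[j] for j in range(cols)] for i in range(rows)]

def gram(B):
    """Gram matrix B^T B of pairwise inner products between columns."""
    rows, cols = len(B), len(B[0])
    return [[sum(B[i][a] * B[i][b] for i in range(rows)) for b in range(cols)]
            for a in range(cols)]

def frame_potential(B):
    """Classical frame potential: sum of squared Gram entries of the normalized frame."""
    G = gram(normalize_columns(B))
    return sum(g ** 2 for row in G for g in row)

def mutual_coherence(B):
    """Largest absolute inner product between two distinct normalized columns."""
    G = gram(normalize_columns(B))
    k = len(G)
    return max(abs(G[a][b]) for a in range(k) for b in range(k) if a != b)

def objective(B, x, w, lam):
    """Assumed objective: 0.5 * ||x - B w||^2 + lam * sum(w), with w >= 0."""
    rows, cols = len(B), len(B[0])
    r = [x[i] - sum(B[i][j] * w[j] for j in range(cols)) for i in range(rows)]
    return 0.5 * sum(v ** 2 for v in r) + lam * sum(w)

def relu_layer(B, x, lam):
    """Feed-forward approximation: relu(B^T x - lam), one proximal step from w = 0."""
    rows, cols = len(B), len(B[0])
    return [max(0.0, sum(B[i][j] * x[i] for i in range(rows)) - lam)
            for j in range(cols)]

def ista_nonneg(B, x, lam, steps=500):
    """Iterative inference by proximal gradient descent on the assumed objective."""
    rows, cols = len(B), len(B[0])
    # 1 / ||B||_F^2 is a safe step size because ||B||_F >= ||B||_2.
    t = 1.0 / sum(b ** 2 for row in B for b in row)
    w = [0.0] * cols
    for _ in range(steps):
        r = [x[i] - sum(B[i][j] * w[j] for j in range(cols)) for i in range(rows)]
        w = [max(0.0, w[j] + t * (sum(B[i][j] * r[i] for i in range(rows)) - lam))
             for j in range(cols)]
    return w

# Three unit vectors at 120 degrees in the plane: an overcomplete tight frame.
MERCEDES = [[0.0, -math.sqrt(3) / 2, math.sqrt(3) / 2],
            [1.0, -0.5, -0.5]]
IDENTITY = [[1.0, 0.0],
            [0.0, 1.0]]

if __name__ == "__main__":
    x, lam = [0.3, 1.0], 0.1
    print("frame potential:", frame_potential(MERCEDES))
    print("mutual coherence:", mutual_coherence(MERCEDES))
    print("feed-forward objective:", objective(MERCEDES, x, relu_layer(MERCEDES, x, lam), lam))
    print("iterative objective:", objective(MERCEDES, x, ista_nonneg(MERCEDES, x, lam), lam))
```

For the three-vector frame, the frame potential equals the tight-frame lower bound k^2/d = 9/2 and the mutual coherence is 1/2. For an orthonormal basis the ReLU layer already solves the assumed problem exactly; for the overcomplete frame the iterative solution reaches a lower objective value, which is the gap the abstract refers to when it calls the feed-forward pass an approximation.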

updated: Wed Mar 10 2021 01:15:14 GMT+0000 (UTC)

published: Wed Mar 10 2021 01:15:14 GMT+0000 (UTC)

arXiv
