ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction

Kwan Ho Ryan Chan; Yaodong Yu; Chong You; Haozhi Qi; John Wright; Yi Ma


This work attempts to provide a plausible theoretical framework for interpreting modern deep (convolutional) networks from the principles of data compression and discriminative representation. We argue that for high-dimensional multi-class data, the optimal linear discriminative representation maximizes the coding rate difference between the whole dataset and the average of all the subsets. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction objective naturally leads to a multi-layer deep network, named ReduNet, which shares common characteristics of modern deep networks. The deep layered architecture, the linear and nonlinear operators, and even the parameters of the network are all explicitly constructed layer by layer via forward propagation, although they remain amenable to fine-tuning via back propagation. All components of the so-obtained "white-box" network have precise optimization, statistical, and geometric interpretations. Moreover, all linear operators of the so-derived network naturally become multi-channel convolutions when we enforce classification to be rigorously shift-invariant. The derivation in the invariant setting suggests a trade-off between sparsity and invariance, and also indicates that such a deep convolutional network is significantly more efficient to construct and learn in the spectral domain. Our preliminary simulations and experiments clearly verify the effectiveness of both the rate reduction objective and the associated ReduNet. All code and data are available at https://github.com/Ma-Lab-Berkeley.
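As a rough illustration of the "coding rate difference between the whole dataset and the average of all the subsets", the sketch below computes a rate reduction score with NumPy. The abstract does not state the formula, so this assumes the log-determinant coding rate commonly used in the rate reduction literature, R(Z) = 1/2 logdet(I + d/(m eps^2) Z Z^T), with the per-class rates weighted by class size. The function names, the `eps` default, and the toy data are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def coding_rate(Z, eps=0.5):
    """Assumed coding rate of the columns of Z (shape d x m):
    0.5 * logdet(I + d / (m * eps^2) * Z Z^T)."""
    d, m = Z.shape
    gram = np.eye(d) + (d / (m * eps ** 2)) * (Z @ Z.T)
    # slogdet is numerically safer than log(det(...)).
    _, logdet = np.linalg.slogdet(gram)
    return 0.5 * logdet


def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted average rate of the classes."""
    _, m = Z.shape
    whole = coding_rate(Z, eps)
    parts = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]  # columns belonging to class c
        parts += (Zc.shape[1] / m) * coding_rate(Zc, eps)
    return whole - parts


def unit_columns(A):
    """Scale every column of A to unit Euclidean norm."""
    return A / np.linalg.norm(A, axis=0, keepdims=True)


# Toy data (hypothetical): two classes of 50 points each in 4 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)

# Case A: the classes occupy orthogonal 2-D subspaces.
A1 = np.vstack([rng.standard_normal((2, 50)), np.zeros((2, 50))])
A2 = np.vstack([np.zeros((2, 50)), rng.standard_normal((2, 50))])
Z_orthogonal = unit_columns(np.hstack([A1, A2]))

# Case B: both classes occupy the same 2-D subspace.
B = np.vstack([rng.standard_normal((2, 100)), np.zeros((2, 100))])
Z_overlapping = unit_columns(B)

print("orthogonal classes: ", rate_reduction(Z_orthogonal, labels))
print("overlapping classes:", rate_reduction(Z_overlapping, labels))
```

Under these assumptions the score should be larger when the classes lie in orthogonal subspaces than when they share one, which is the behaviour the objective is meant to reward. The gradient ascent scheme and the resulting ReduNet layers described in the abstract are not reproduced here.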

updated: Mon Nov 29 2021 01:48:29 GMT+0000 (UTC)

published: Fri May 21 2021 16:29:57 GMT+0000 (UTC)

arXiv

