m-RevNet: Deep Reversible Neural Networks with Momentum

Duo Li; Shang-Hua Gao

In recent years, connections between deep residual networks and first-order Ordinary Differential Equations (ODEs) have been uncovered. In this work, we further bridge deep neural architecture design with second-order ODEs and propose a novel reversible neural network, termed m-RevNet, that is characterized by inserting a momentum update into residual blocks. The reversible property allows us to perform the backward pass without access to the activation values of the forward pass, greatly relieving the storage burden during training. Furthermore, the theoretical foundation based on second-order ODEs grants m-RevNet stronger representational power than vanilla residual networks, which potentially explains its performance gains. For certain learning scenarios, we show analytically and empirically that m-RevNet succeeds where a standard ResNet fails. Comprehensive experiments on various image classification and semantic segmentation benchmarks demonstrate the superiority of m-RevNet over ResNet in both memory efficiency and recognition performance.
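The reversibility described in the abstract can be illustrated with a small sketch. The abstract does not give the update rule, so the form below is an assumption: a generic momentum residual step v_next = mu * v + (1 - mu) * f(x), x_next = x + v_next. The names `f`, `mu`, `forward_block`, and `inverse_block` are hypothetical and chosen only for illustration. The point is that the inputs of a block can be recomputed exactly from its outputs, so activations would not need to be stored for the backward pass.

```python
import math


def f(x):
    # Hypothetical residual branch; stands in for a learned sub-network.
    return [math.tanh(0.5 * xi) for xi in x]


def forward_block(x, v, mu=0.9):
    # Assumed momentum update: the velocity v accumulates the residual
    # branch output, and the state x is advanced by the new velocity.
    fx = f(x)
    v_next = [mu * vi + (1.0 - mu) * fi for vi, fi in zip(v, fx)]
    x_next = [xi + vi for xi, vi in zip(x, v_next)]
    return x_next, v_next


def inverse_block(x_next, v_next, mu=0.9):
    # Recover the block inputs from its outputs alone (requires mu != 0).
    x = [xi - vi for xi, vi in zip(x_next, v_next)]
    fx = f(x)
    v = [(vi - (1.0 - mu) * fi) / mu for vi, fi in zip(v_next, fx)]
    return x, v


def run_forward(x, v, depth, mu=0.9):
    # Stack `depth` blocks without keeping intermediate activations.
    for _ in range(depth):
        x, v = forward_block(x, v, mu)
    return x, v


def run_inverse(x, v, depth, mu=0.9):
    # Walk back through the stack, reconstructing each block's inputs.
    for _ in range(depth):
        x, v = inverse_block(x, v, mu)
    return x, v


x0 = [0.3, -1.2, 2.0]
v0 = [0.0, 0.0, 0.0]
xN, vN = run_forward(x0, v0, depth=8)
x_rec, v_rec = run_inverse(xN, vN, depth=8)
max_err = max(abs(a - b) for a, b in zip(x0 + v0, x_rec + v_rec))
```

In this sketch the reconstruction is exact up to floating-point rounding. Because the inverse divides by `mu`, rounding error can grow with depth when `mu` is small, which is one reason a practical implementation might treat the momentum coefficient with care.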

updated: Mon Aug 16 2021 13:04:04 GMT+0000 (UTC)

published: Thu Aug 12 2021 17:14:32 GMT+0000 (UTC)

arXiv

