Regularizing Neural Networks via Minimizing Hyperspherical Energy

Rongmei Lin; Weiyang Liu; Zhen Liu; Chen Feng; Zhiding Yu; James M. Rehg; Li Xiong; Le Song

Inspired by the Thomson problem in physics, where the distribution of multiple mutually repelling electrons on a unit sphere can be modeled by minimizing a potential energy, hyperspherical energy minimization has demonstrated its potential in regularizing neural networks and improving their generalization power. In this paper, we first study the important role that hyperspherical energy plays in neural network training by analyzing its training dynamics. Then we show that naively minimizing hyperspherical energy becomes difficult as the dimensionality of the space grows, because the optimization is highly non-linear and non-convex, which limits the potential to further improve generalization. To address these problems, we propose the compressive minimum hyperspherical energy (CoMHE) as a more effective regularization for neural networks. Specifically, CoMHE uses projection mappings to reduce the dimensionality of neurons and then minimizes their hyperspherical energy. Based on different designs of the projection mapping, we propose several distinct yet well-performing variants and provide theoretical guarantees to justify their effectiveness. Our experiments show that CoMHE consistently outperforms existing regularization methods and can be easily applied to different neural networks.
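The idea of projecting neurons to a lower dimension before measuring their hyperspherical energy can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the energy kernel or the projection design, so the sketch assumes an inverse-distance (Riesz) kernel in the spirit of the Thomson problem and a random Gaussian projection as one possible projection mapping. The function names `hyperspherical_energy` and `compressive_energy`, and the parameters `s`, `proj_dim`, and `num_proj`, are hypothetical.

```python
import numpy as np


def hyperspherical_energy(W, s=1.0, eps=1e-8):
    """Mean pairwise inverse-distance energy of the rows of W on the unit sphere.

    Assumption: a Riesz kernel 1 / distance**s, chosen for illustration only.
    Each row of W is treated as one neuron's weight vector.
    """
    # Normalize each neuron so that only its direction matters.
    W_hat = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)
    # Indices of all unordered neuron pairs (i < j).
    i, j = np.triu_indices(W_hat.shape[0], k=1)
    dist = np.linalg.norm(W_hat[i] - W_hat[j], axis=1)
    return float(np.mean((dist + eps) ** (-s)))


def compressive_energy(W, proj_dim, num_proj=8, seed=0):
    """Average hyperspherical energy of the neurons after random projections.

    Assumption: Gaussian random projections stand in for the paper's
    projection mappings, which the abstract does not describe in detail.
    """
    rng = np.random.default_rng(seed)
    d = W.shape[1]
    energies = []
    for _ in range(num_proj):
        # Map neurons from d dimensions down to proj_dim dimensions.
        P = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)
        energies.append(hyperspherical_energy(W @ P))
    return float(np.mean(energies))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    W = rng.standard_normal((16, 64))  # 16 neurons with 64 inputs each
    print("full-space energy:", hyperspherical_energy(W))
    print("compressed energy:", compressive_energy(W, proj_dim=8))
```

In training, a term like `compressive_energy` would be added to the task loss with a small weight and differentiated with respect to `W`, which requires an autodiff framework rather than plain NumPy. Lower energy corresponds to neuron directions that are spread more evenly over the sphere.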

updated: Thu Apr 09 2020 16:04:06 GMT+0000 (UTC)

published: Wed Jun 12 2019 02:12:28 GMT+0000 (UTC)

arXiv
