Persistent Neurons

Yimeng Min

Neural network (NN)-based learning algorithms are strongly affected by the choice of initialization and by the data distribution. Various optimization strategies have been proposed to improve the learning trajectory and find better optima. However, designing improved optimization strategies is difficult under the conventional landscape view. Here, we propose persistent neurons, a trajectory-based strategy that optimizes the learning task using information from previously converged solutions. More precisely, we use the endpoints of earlier trajectories and let the parameters explore new regions of the landscape by penalizing the model for converging to the previous solutions under the same initialization. Persistent neurons can be regarded as a stochastic gradient method with informed bias, in which individual updates are corrupted by deterministic error terms. Specifically, we show that, under certain data distributions, persistent neurons are able to converge to better solutions, whereas initializations under popular frameworks find bad local minima. We further demonstrate that persistent neurons help improve the model's performance under both good and poor initializations. We evaluate the full and partial persistent models and show that they can be used to boost performance on a range of NN structures, such as AlexNet and residual neural networks (ResNet).
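The idea of penalizing convergence to a previous solution can be illustrated with a toy example. The sketch below is not the paper's implementation: the abstract does not give the penalty, so the Gaussian-shaped repulsion term, the one-dimensional double-well loss, and the names `loss`, `penalty_grad`, `descend`, `lam`, and `sigma` are all assumptions chosen for illustration. It also uses plain (full) gradient descent rather than a stochastic method, so the only "error term" in each update is the deterministic penalty gradient.

```python
import math

def loss(w):
    # Hypothetical 1-D double-well loss: a shallow minimum near w = 0.96
    # and a deeper one near w = -1.03.
    return (w * w - 1.0) ** 2 + 0.3 * w

def loss_grad(w):
    # Derivative of loss(w).
    return 4.0 * w * (w * w - 1.0) + 0.3

def penalty_grad(w, w_prev, lam, sigma):
    # Gradient of the assumed penalty lam * exp(-(w - w_prev)^2 / (2 sigma^2)).
    # Descending on this term pushes w away from the earlier solution w_prev.
    diff = w - w_prev
    return -lam * diff / sigma ** 2 * math.exp(-diff * diff / (2.0 * sigma ** 2))

def descend(w0, previous=(), lam=3.0, sigma=0.5, lr=0.01, steps=5000):
    # Gradient descent from w0; each update adds a deterministic bias
    # that repels w from every previously converged solution.
    w = w0
    for _ in range(steps):
        g = loss_grad(w)
        for w_prev in previous:
            g += penalty_grad(w, w_prev, lam, sigma)
        w -= lr * g
    return w

w0 = 0.5                                     # same initialization for both runs
w_first = descend(w0)                        # ordinary run
w_second = descend(w0, previous=[w_first])   # run penalized by the first endpoint

print("first run :", w_first, loss(w_first))
print("second run:", w_second, loss(w_second))
```

In this toy setting the first run settles in the shallow well closest to the initialization, while the penalized run, started from the same point, is pushed over the barrier into the deeper well. Whether a similar effect holds for real networks depends on the actual penalty and hyperparameters, which are not specified in the abstract.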

updated: Thu Mar 18 2021 09:16:24 GMT+0000 (UTC)

published: Thu Jul 02 2020 22:36:49 GMT+0000 (UTC)

arXiv
