Residual networks classify inputs based on their neural transient dynamics

Fereshteh Lagzi

In this study, we analyze the input-output behavior of residual networks from a dynamical systems point of view by disentangling the residual dynamics from the output activities before the classification stage. For a network with simple skip connections between every pair of successive layers, a logistic activation function, and weights shared between layers, we show analytically that there are cooperation and competition dynamics between the residuals corresponding to each input dimension. When these kinds of networks are interpreted as nonlinear filters, the steady-state values of the residuals in the case of attractor networks are indicative of the common features between different input dimensions that the network has observed during training and has encoded in those components. In cases where the residuals do not converge to an attractor state, their internal dynamics are separable for each input class, and the network can reliably approximate the output. We present analytical and empirical evidence that residual networks classify inputs based on the integration of the transient dynamics of the residuals, and we show how the network responds to input perturbations. We compare the network dynamics of a ResNet and a Multi-Layer Perceptron and show that the internal dynamics and the noise evolution are fundamentally different in these networks, and that ResNets are more robust to noisy inputs. Based on these findings, we also develop a new method to adjust the depth of residual networks during training. After pruning the depth of a ResNet using this algorithm, the network is still capable of classifying inputs with high accuracy.
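The setting described in the abstract (skip connections between successive layers, a logistic activation, weights shared across layers) can be read as a discrete-time dynamical system. The following is a minimal sketch under assumptions, not the paper's implementation: the update rule x_{t+1} = x_t + logistic(W x_t + b), the function name `residual_trajectory`, and the random weights are all hypothetical choices made for illustration, since the abstract does not state the exact model equations.

```python
import numpy as np


def logistic(z):
    """Element-wise logistic (sigmoid) activation."""
    return 1.0 / (1.0 + np.exp(-z))


def residual_trajectory(x0, W, b, depth):
    """Iterate x_{t+1} = x_t + logistic(W @ x_t + b) for `depth` layers.

    Assumed form of a shared-weight residual network: the same W and b
    are reused at every layer. Returns the layer activations with shape
    (depth + 1, n) and the per-layer residuals with shape (depth, n).
    """
    x = np.asarray(x0, dtype=float)
    states = [x]
    residuals = []
    for _ in range(depth):
        r = logistic(W @ x + b)  # residual contributed by this layer
        x = x + r                # skip connection adds it to the state
        states.append(x)
        residuals.append(r)
    return np.stack(states), np.stack(residuals)


# Illustrative random weights and input (not trained, not from the paper).
rng = np.random.default_rng(0)
n, depth = 4, 10
W = rng.normal(scale=0.5, size=(n, n))
b = np.zeros(n)
x0 = rng.normal(size=n)

states, residuals = residual_trajectory(x0, W, b, depth)

# The final activation is the input plus the sum of all residuals.
reconstructed = x0 + residuals.sum(axis=0)
```

Because each layer only adds its residual to the running state, the activity before the classification stage equals the input plus the sum of the residuals over depth. In this sketch, that sum is one concrete reading of the "integration of the transient dynamics of the residuals" mentioned in the abstract.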

updated: Fri Jan 08 2021 13:54:37 GMT+0000 (UTC)

published: Fri Jan 08 2021 13:54:37 GMT+0000 (UTC)

arXiv
