Depth and Representation in Vision Models

Benjamin L. Badger

Deep learning models develop successive representations of their input in sequential layers, the last of which maps the final representation to the output. Here we investigate the informational content of these representations by observing the ability of convolutional image classification models to autoencode the model's input using embeddings from various layers. We find that, before training, the deeper the layer, the less accurately it represents the input. Inaccurate representation results from non-uniqueness, in which various distinct inputs give approximately the same embedding. Non-unique representation is a consequence of both exact and approximate non-invertibility of transformations present in the forward pass. Learning to classify natural images increases representation clarity for early but not late layers, which instead form abstract images. Rather than simply selecting the input features necessary for classification, deep layer representations are found to transform the input to match representations of the training data, so that arbitrary inputs are mapped to manifolds learned during training. This work provides support for the theory that the tasks of image recognition and input generation are inseparable even for models trained exclusively to classify.
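The non-uniqueness described in the abstract can be illustrated with a toy example. The sketch below is not the paper's model or method: the weight matrix `W`, the `embed` function, and all input vectors are hypothetical values chosen for illustration. It shows only the exact case, where two distinct inputs give identical embeddings, first because a linear map from three dimensions to two has a null space, and second because ReLU sends every negative pre-activation to zero.

```python
# Toy layer (an assumption for illustration, not taken from the paper):
# embed(x) = relu(W x), mapping 3 input values to 2 embedding values.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

def relu(v):
    """Elementwise ReLU: negative entries become 0."""
    return [max(0.0, a) for a in v]

def matvec(m, x):
    """Multiply matrix m (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]

def embed(x):
    """Embedding produced by the toy layer."""
    return relu(matvec(W, x))

# Case 1: exact non-invertibility of the linear map.
# n = (1, 1, -1) satisfies W n = 0, so x and x + n share an embedding.
x = [1.0, 2.0, 0.5]
n = [1.0, 1.0, -1.0]
x_shift = [a + b for a, b in zip(x, n)]

# Case 2: exact non-invertibility of ReLU.
# Both inputs give only negative pre-activations, so both embed to zeros.
u = [-1.0, -2.0, 0.0]
v = [-3.0, -0.5, 0.0]

same_linear = embed(x) == embed(x_shift)
same_relu = embed(u) == embed(v)
print(same_linear, same_relu)
```

In both cases the embedding alone cannot distinguish the two inputs, so any attempt to reconstruct the input from it is ambiguous. The approximate case mentioned in the abstract, where distinct inputs give nearly but not exactly equal embeddings, is not covered by this sketch.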

updated: Fri Nov 11 2022 22:16:40 GMT+0000 (UTC)

published: Fri Nov 11 2022 22:16:40 GMT+0000 (UTC)

arXiv
