The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

Peter Kocsis; Peter Súkeník; Guillem Brasó; Matthias Nießner; Laura Leal-Taixé; Ismail Elezi

Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for use in low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes. We further present an online joint knowledge-distillation method to utilize the extra FC layers at train time but avoid them at test time. This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time. We perform classification experiments for a wide range of network backbones and several standard datasets in both supervised learning and active learning. In our experiments, the augmented networks significantly outperform the networks without fully-connected layers, reaching a relative improvement of up to 16% in validation accuracy in the supervised setting without adding any extra parameters during inference.
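The train-time/test-time split described in the abstract can be illustrated with a small forward-only sketch. It is not the authors' implementation: the abstract does not give the loss, so the form used here (cross-entropy on both heads plus a KL term pulling the linear head toward the FC head), the weight `alpha`, the class name `TwoHeadClassifier`, and all layer sizes are assumptions made for illustration. The input `feats` stands in for the output of a CNN backbone, which is omitted, and a real training loop would need an autodiff framework to compute gradients.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def _init(rows, cols, rng):
    """Small random weight matrix (hypothetical initialization)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TwoHeadClassifier:
    """Hypothetical classifier heads on top of shared backbone features."""

    def __init__(self, feat_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        # Train-only FC head: features -> hidden (ReLU) -> class logits.
        self.fc1 = (_init(hidden_dim, feat_dim, rng), [0.0] * hidden_dim)
        self.fc2 = (_init(num_classes, hidden_dim, rng), [0.0] * num_classes)
        # Linear head kept at test time: features -> class logits.
        self.lin = (_init(num_classes, feat_dim, rng), [0.0] * num_classes)

    def fc_logits(self, feats):
        hidden = [max(0.0, h) for h in linear(feats, *self.fc1)]
        return linear(hidden, *self.fc2)

    def linear_logits(self, feats):
        return linear(feats, *self.lin)

    def joint_loss(self, feats, label, alpha=1.0):
        """Assumed joint objective: CE on both heads + alpha * KL(fc || linear)."""
        p_fc = softmax(self.fc_logits(feats))
        p_lin = softmax(self.linear_logits(feats))
        ce_fc = -math.log(p_fc[label])
        ce_lin = -math.log(p_lin[label])
        return ce_fc + ce_lin + alpha * kl_divergence(p_fc, p_lin)

    def predict(self, feats):
        """Test-time prediction uses only the linear head."""
        logits = self.linear_logits(feats)
        return max(range(len(logits)), key=logits.__getitem__)

    @staticmethod
    def _count(layer):
        weights, bias = layer
        return sum(len(row) for row in weights) + len(bias)

    def test_time_params(self):
        """Head parameters needed at inference (backbone excluded)."""
        return self._count(self.lin)

    def train_time_params(self):
        """Head parameters present during training (backbone excluded)."""
        return sum(self._count(layer) for layer in (self.fc1, self.fc2, self.lin))
```

With the illustrative sizes `TwoHeadClassifier(feat_dim=8, hidden_dim=16, num_classes=3)`, the heads hold 222 parameters during training but only the 27 of the linear head at test time, which mirrors the abstract's claim that the FC layers add no weights during inference.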

updated: Thu Oct 13 2022 06:32:21 GMT+0000 (UTC)

published: Tue Oct 11 2022 17:55:10 GMT+0000 (UTC)

arXiv
