The Foes of Neural Network's Data Efficiency Among Unnecessary Input Dimensions

Vanessa D'Amario; Sanjana Srivastava; Tomotake Sasaki; Xavier Boix

Datasets often contain input dimensions that are unnecessary for predicting the output label, e.g., the background in object recognition, and these dimensions lead to more trainable parameters. Deep Neural Networks (DNNs) are robust to increasing the number of parameters in the hidden layers, but it is unclear whether this holds true for the input layer. In this letter, we investigate the impact of unnecessary input dimensions on a central issue of DNNs: their data efficiency, i.e., the number of examples needed to achieve a given generalization performance. Our results show that unnecessary input dimensions that are task-unrelated substantially degrade data efficiency. This highlights the need for mechanisms that remove task-unrelated dimensions to enable data efficiency gains.
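To make the notion of data efficiency concrete, the following is a minimal sketch and not the authors' experimental setup. It swaps the DNN for a minimum-norm least-squares linear model on synthetic Gaussian data, and every name in it (`heldout_mse`, `examples_needed`, `d_task`, `n_unrelated`, the noise level, the grid of training-set sizes) is a hypothetical choice made for illustration. It appends task-unrelated input dimensions to a small regression problem and reports the smallest training-set size on a grid that reaches a target held-out error.

```python
import numpy as np


def heldout_mse(n_train, n_unrelated, d_task=10, n_test=2000, seed=0):
    """Held-out MSE of a least-squares fit when task-unrelated inputs are appended.

    Assumption: a linear model stands in for a DNN; the data are synthetic.
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d_task)  # only the first d_task inputs determine the label

    def sample(n):
        x_task = rng.normal(size=(n, d_task))        # task-related dimensions
        x_extra = rng.normal(size=(n, n_unrelated))  # task-unrelated dimensions
        y = x_task @ w_true + 0.1 * rng.normal(size=n)
        return np.hstack([x_task, x_extra]), y

    x_tr, y_tr = sample(n_train)
    x_te, y_te = sample(n_test)
    # lstsq returns the minimum-norm solution when the system is underdetermined.
    w_hat, *_ = np.linalg.lstsq(x_tr, y_tr, rcond=None)
    return float(np.mean((x_te @ w_hat - y_te) ** 2))


def examples_needed(n_unrelated, target_mse=0.05, grid=(20, 40, 80, 160, 320, 640)):
    """Smallest training-set size on the grid that reaches target_mse, else None."""
    for n in grid:
        if heldout_mse(n, n_unrelated) <= target_mse:
            return n
    return None


if __name__ == "__main__":
    for k in (0, 200):
        print(f"unrelated dims = {k:3d} -> examples needed: {examples_needed(k)}")
```

In this toy setting, adding task-unrelated dimensions raises the number of examples needed to hit the same held-out error, which is the sense of "degraded data efficiency" used in the abstract. Whether and how strongly the effect carries over to DNNs is the question the letter itself studies.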

updated: Tue Jul 13 2021 21:52:02 GMT+0000 (UTC)

published: Tue Jul 13 2021 21:52:02 GMT+0000 (UTC)

arXiv
