Hierarchical Simplicity Bias of Neural Networks

Zhehang Du

Neural networks often exhibit simplicity bias, favoring simpler features over more complex ones, even when both are equally predictive. We introduce a novel method called imbalanced label coupling to explore and extend this simplicity bias across multiple hierarchical levels. Our approach demonstrates that trained networks sequentially consider features of increasing complexity based on their correlation with labels in the training set, regardless of their actual predictive power. For example, in CIFAR-10, simple spurious features can cause misclassifications where most cats are predicted as dogs and most trucks as automobiles. We empirically show that last-layer retraining with target data distribution kirichenko2022last is insufficient to fully recover core features when spurious features perfectly correlate with target labels in our synthetic datasets. Our findings deepen the understanding of the implicit biases inherent in neural networks.

updated: Tue Oct 22 2024 01:41:21 GMT+0000 (UTC)

published: Sun Nov 05 2023 11:27:03 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト