Disentangling Model Multiplicity in Deep Learning

Ari Heljakka; Martin Trapp; Juho Kannala; Arno Solin

Model multiplicity is a well-known but poorly understood phenomenon that undermines the generalisation guarantees of machine learning models. It appears when two models with similar training-time performance differ in their predictions and real-world performance characteristics. This observed 'predictive' multiplicity (PM) also implies elusive differences in the internals of the models, their 'representational' multiplicity (RM). We introduce a conceptual and experimental setup for analysing RM by measuring activation similarity via singular vector canonical correlation analysis (SVCCA). We show that certain differences in training methods systematically result in larger RM than others and evaluate RM and PM over a finite sample as predictors for generalisability. We further correlate RM with PM measured by the variance in i.i.d. and out-of-distribution test predictions in four standard image data sets. Finally, instead of attempting to eliminate RM, we call for its systematic measurement and maximal exposure.
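For readers unfamiliar with SVCCA, the following is a minimal NumPy sketch of the general technique, not the authors' implementation. It assumes two activation matrices of shape (samples, units) recorded on the same inputs; the function name `svcca_similarity` and the 99% variance threshold are illustrative assumptions. Each matrix is centred and reduced to its leading singular directions, and the canonical correlations between the two retained subspaces are then averaged.

```python
import numpy as np


def svcca_similarity(acts_a, acts_b, var_kept=0.99):
    """Mean canonical correlation between two activation matrices.

    acts_a, acts_b: arrays of shape (n_samples, n_units), recorded on the
    same inputs. Name and default threshold are illustrative assumptions.
    """

    def retained_basis(acts):
        # Centre each unit, then keep the leading singular directions
        # that explain `var_kept` of the total variance.
        centered = acts - acts.mean(axis=0, keepdims=True)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        energy = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
        # Columns of u are orthonormal and span the retained subspace.
        return u[:, :k]

    basis_a = retained_basis(acts_a)
    basis_b = retained_basis(acts_b)
    # With orthonormal bases, the canonical correlations are the singular
    # values of the cross-product of the two bases.
    rho = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())
```

A value near 1 indicates that the two layers span nearly the same subspace, and lower values indicate more divergent representations. How such a similarity score is aggregated into an RM measure across layers and training runs is defined in the paper itself and is not reproduced here.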

updated: Tue Jan 31 2023 10:40:52 GMT+0000 (UTC)

published: Fri Jun 17 2022 16:53:12 GMT+0000 (UTC)

arXiv
