No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems

Nimit S. Sohoni; Jared A. Dunnmon; Geoffrey Angus; Albert Gu; Christopher Ré

In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important consequences for models deployed in safety-critical applications such as medicine. We propose GEORGE, a method to both measure and mitigate hidden stratification even when subclass labels are unknown. We first observe that unlabeled subclasses are often separable in the feature space of deep neural networks, and exploit this fact to estimate subclass labels for the training data via clustering techniques. We then use these approximate subclass labels as a form of noisy supervision in a distributionally robust optimization objective. We theoretically characterize the performance of GEORGE in terms of the worst-case generalization error across any subclass. We empirically validate GEORGE on a mix of real-world and benchmark image classification datasets, and show that our approach boosts worst-case subclass accuracy by up to 22 percentage points compared to standard training techniques, without requiring any prior information about the subclasses.
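
The measurement half of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes feature vectors have already been extracted from a trained network, uses a plain k-means as a stand-in for whatever clustering technique the paper applies, and all function names (`kmeans`, `estimate_subclasses`, `worst_group_accuracy`) are hypothetical. It clusters the features of each class separately to obtain approximate subclass labels, then reports accuracy per estimated subclass along with the worst one.

```python
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization.

    Returns one cluster index per point. Stand-in clustering step only.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def estimate_subclasses(features, labels, k=2):
    """Cluster each class's features separately.

    Returns a (class, cluster) pseudo-label per example, used as an
    approximate subclass label. The choice of k is an assumption here.
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    pseudo = [None] * len(labels)
    for y, idxs in by_class.items():
        clusters = kmeans([features[i] for i in idxs], min(k, len(idxs)))
        for i, c in zip(idxs, clusters):
            pseudo[i] = (y, c)
    return pseudo


def worst_group_accuracy(preds, labels, groups):
    """Return (worst accuracy, accuracy per group) over the given groups."""
    correct, total = defaultdict(int), defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group
```

A model can look accurate overall while failing on one estimated subclass; comparing the overall accuracy with the value returned by `worst_group_accuracy` surfaces that gap. The same pseudo-labels could in principle serve as the groups in a robust training objective, which this sketch does not cover.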

updated: Sun Apr 10 2022 23:01:14 GMT+0000 (UTC)

published: Wed Nov 25 2020 18:50:32 GMT+0000 (UTC)

arXiv
