No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems

Nimit S. Sohoni; Jared A. Dunnmon; Geoffrey Angus; Albert Gu; Christopher Ré

In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important consequences for models deployed in safety-critical applications such as medicine. We propose GEORGE, a method to both measure and mitigate hidden stratification even when subclass labels are unknown. We first observe that unlabeled subclasses are often separable in the feature space of deep models, and exploit this fact to estimate subclass labels for the training data via clustering techniques. We then use these approximate subclass labels as a form of noisy supervision in a distributionally robust optimization objective. We theoretically characterize the performance of GEORGE in terms of the worst-case generalization error across any subclass. We empirically validate GEORGE on a mix of real-world and benchmark image classification datasets, and show that our approach boosts worst-case subclass accuracy by up to 22 percentage points compared to standard training techniques, without requiring any information about the subclasses.
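The two steps described in the abstract (estimate subclass labels by clustering in feature space, then optimize for the worst-off group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a clustering algorithm, so plain k-means is swapped in here as a stand-in, and the fixed number of clusters per class `k`, the function names, and the reading of the robust objective as "maximum over groups of the mean loss" are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed stand-in clusterer).

    Returns one cluster index per row of X. Requires k <= len(X).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers at k distinct data points (fancy indexing copies).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return assign


def estimate_subclasses(features, labels, k=2):
    """Cluster the features of each class separately.

    Returns pseudo-subclass ids that are unique across classes, so they
    can be used directly as group labels. `k` clusters per class is a
    hypothetical fixed choice.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    pseudo = np.zeros(len(labels), dtype=int)
    for class_index, c in enumerate(np.unique(labels)):
        mask = labels == c
        pseudo[mask] = class_index * k + kmeans(features[mask], k)
    return pseudo


def worst_group_loss(per_example_loss, groups):
    """Largest mean loss over any group (assumed form of the robust objective)."""
    per_example_loss = np.asarray(per_example_loss, dtype=float)
    groups = np.asarray(groups)
    return max(per_example_loss[groups == g].mean() for g in np.unique(groups))
```

In this sketch, `features` would be the activations of a model trained on the coarse class labels, and `worst_group_loss` would be evaluated on the pseudo-subclass ids returned by `estimate_subclasses`. Replacing the average loss with this quantity during training is what pushes the model toward the worst-performing estimated subclass.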

updated: Wed Nov 25 2020 18:50:32 GMT+0000 (UTC)

published: Wed Nov 25 2020 18:50:32 GMT+0000 (UTC)

arXiv
