Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

Zhuo Huang; Miaoxi Zhu; Xiaobo Xia; Li Shen; Jun Yu; Chen Gong; Bo Han; Bo Du; Tongliang Liu

Robust generalization aims to handle the most challenging data distributions, which are rare in the training set and contain severe noise, i.e., photon-limited corruptions. Common solutions such as distributionally robust optimization (DRO) focus on the worst-case empirical risk to ensure low training error on these uncommon noisy distributions. However, because an over-parameterized model is optimized on scarce worst-case data, DRO fails to produce a smooth loss landscape and therefore struggles to generalize to the test set. Instead of focusing on worst-case risk minimization, we propose SharpDRO, which penalizes the sharpness of the worst-case distribution, where sharpness measures how much the loss changes in the neighborhood of the learned parameters. Through worst-case sharpness minimization, the proposed method produces a flat loss curve on the corrupted distributions and thus achieves robust generalization. Moreover, depending on whether distribution annotations are available, we apply SharpDRO to two problem settings and design a worst-case selection process for robust generalization. Theoretically, we show that SharpDRO has a convergence guarantee. Experimentally, we simulate photon-limited corruptions on the CIFAR10/100 and ImageNet30 datasets and show that SharpDRO generalizes strongly under severe corruptions and outperforms well-known baseline methods by a large margin.
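To make the idea of penalizing worst-case sharpness concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-parameter linear model, a greedy choice of the worst group by current loss, and a SAM-style first-order approximation of sharpness (the loss increase after a step of radius `rho` along the gradient direction). All function names, the toy data, and the hyperparameters are hypothetical.

```python
def group_loss(w, group):
    """Mean squared error of the 1-D model y ~ w * x on one group."""
    return sum((w * x - y) ** 2 for x, y in group) / len(group)


def group_grad(w, group):
    """Derivative of group_loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in group) / len(group)


def perturbation(w, group, rho):
    """First-order ascent step of size rho (assumed SAM-style approximation)."""
    g = group_grad(w, group)
    return rho * g / abs(g) if g != 0.0 else 0.0


def sharpness(w, group, rho=0.05):
    """Loss increase at the perturbed parameter relative to the current one."""
    return group_loss(w + perturbation(w, group, rho), group) - group_loss(w, group)


def worst_group(w, groups):
    """Index of the group with the largest current loss."""
    return max(range(len(groups)), key=lambda i: group_loss(w, groups[i]))


def sharp_dro_step(w, groups, lr=0.05, rho=0.05):
    """Descend on the worst group's loss evaluated at the perturbed point.

    Minimizing loss(w + eps) equals minimizing loss(w) plus the sharpness term.
    """
    k = worst_group(w, groups)
    eps = perturbation(w, groups[k], rho)
    return w - lr * group_grad(w + eps, groups[k])


# Toy data: a larger clean group (y = 2x) and a small, noisy "corrupted" group.
clean = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
noisy = [(1.0, 2.6), (2.0, 3.5)]
groups = [clean, noisy]

w = 0.0
initial_worst = max(group_loss(w, g) for g in groups)
for _ in range(200):
    w = sharp_dro_step(w, groups)
final_worst = max(group_loss(w, g) for g in groups)
```

In this toy setting the noisy group has the larger loss at initialization, so the updates are driven by that group, and each step uses the gradient at the perturbed parameter rather than at the current one. How SharpDRO selects the worst case when distribution annotations are unavailable is not specified in the abstract and is not modeled here.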

updated: Thu Mar 23 2023 07:58:48 GMT+0000 (UTC)

published: Thu Mar 23 2023 07:58:48 GMT+0000 (UTC)

arXiv
