Learning Fair Classifiers with Partially Annotated Group Labels

Sangwon Jung; Sanghyuk Chun; Taesup Moon

Recently, fairness-aware learning has become increasingly crucial, but most fairness-aware methods assume the availability of fully annotated demographic group labels. We emphasize that such an assumption is unrealistic for real-world applications, since group label annotations are expensive and can conflict with privacy issues. In this paper, we consider a more practical scenario, dubbed Algorithmic Group Fairness with Partially annotated Group labels (Fair-PG). We observe that, under Fair-PG, existing methods for achieving group fairness perform even worse than vanilla training, which simply uses the full data with target labels only. To address this problem, we propose a simple Confidence-based Group Label assignment (CGL) strategy that is readily applicable to any fairness-aware learning method. CGL uses an auxiliary group classifier to assign pseudo group labels, where random labels are assigned to low-confidence samples. We first show theoretically that our method design is better than the vanilla pseudo-labeling strategy in terms of fairness criteria. Then, we show empirically on several benchmark datasets that combining CGL with state-of-the-art fairness-aware in-processing methods jointly improves target accuracy and fairness metrics compared to the baselines. Furthermore, we show that CGL makes it possible to naturally augment the given group-labeled dataset with external target-label-only datasets, so that both accuracy and fairness can be improved. Code is available at https://github.com/naver-ai/cgl_fairness.
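
The confidence-based assignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The function name `assign_group_labels`, the `threshold` value of 0.9, and the choice to draw random labels uniformly over groups are all assumptions made for illustration; the abstract states only that low-confidence samples receive random labels and does not specify the threshold or the sampling distribution.

```python
import random


def assign_group_labels(group_probs, threshold=0.9, seed=None):
    """Assign pseudo group labels from an auxiliary classifier's outputs.

    group_probs: list of per-sample probability lists over groups, as an
        auxiliary group classifier might produce (hypothetical input format).
    threshold: confidence cutoff (assumed value, not taken from the paper).
    Returns (labels, is_random), where is_random[i] is True when sample i
    received a random label instead of the classifier's prediction.
    """
    rng = random.Random(seed)
    labels, is_random = [], []
    for probs in group_probs:
        confidence = max(probs)
        if confidence >= threshold:
            # Confident prediction: keep the classifier's pseudo label.
            labels.append(probs.index(confidence))
            is_random.append(False)
        else:
            # Low confidence: assign a random group label.
            # Uniform sampling is an assumption of this sketch.
            labels.append(rng.randrange(len(probs)))
            is_random.append(True)
    return labels, is_random


if __name__ == "__main__":
    probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
    labels, is_random = assign_group_labels(probs, threshold=0.9, seed=0)
    print(labels, is_random)
```

The resulting labels would then stand in for the missing group annotations when training a fairness-aware method on the target-label-only portion of the data.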

updated: Fri Apr 01 2022 00:30:44 GMT+0000 (UTC)

published: Mon Nov 29 2021 15:11:18 GMT+0000 (UTC)

arXiv
