Fairness via Adversarial Attribute Neighbourhood Robust Learning

Qi Qi; Shervin Ardeshir; Yi Xu; Tianbao Yang

Improving fairness between privileged and less-privileged sensitive attribute groups (e.g., race, gender) has attracted considerable attention. To ensure that the model performs uniformly well across different sensitive attributes, we propose a principled Robust Adversarial Attribute Neighbourhood (RAAN) loss to debias the classification head and promote a fairer representation distribution across different sensitive attribute groups. The key idea of RAAN is to mitigate the differences between the biased representations of different sensitive attribute groups by assigning each sample an adversarially robust weight, which is defined on the representations of its adversarial attribute neighbors, i.e., the samples from different protected groups. To provide efficient optimization algorithms, we cast RAAN as a sum of coupled compositional functions and propose SCRAAN, a stochastic algorithm framework with adaptive (Adam-style) and non-adaptive (SGD-style) variants and provable theoretical guarantees. Extensive empirical studies on fairness-related benchmark datasets verify the effectiveness of the proposed method.
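The abstract does not state the RAAN formula, so the following is only a minimal sketch of the general idea of weighting each sample by its neighbors from other protected groups. It assumes, purely for illustration, that the weights are a softmax over the neighbors' per-sample losses with a temperature parameter (a common KL-regularized robust weighting). The function names, the `temperature` parameter, and the choice of softmax are hypothetical and should not be read as the authors' definition.

```python
import math


def adversarial_neighbor_weights(losses, groups, i, temperature=1.0):
    """Return softmax weights over the samples whose group differs from sample i's.

    Assumption (not from the abstract): weights are proportional to
    exp(loss / temperature). The result maps neighbor index -> weight.
    """
    neighbors = [j for j, g in enumerate(groups) if g != groups[i]]
    if not neighbors:
        return {}
    # Subtract the largest loss before exponentiating for numerical stability.
    top = max(losses[j] for j in neighbors)
    exps = {j: math.exp((losses[j] - top) / temperature) for j in neighbors}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


def robust_neighborhood_loss(losses, groups, temperature=1.0):
    """Average, over samples, of the weighted loss on each sample's neighbors."""
    per_sample = []
    for i in range(len(losses)):
        weights = adversarial_neighbor_weights(losses, groups, i, temperature)
        if weights:
            per_sample.append(sum(w * losses[j] for j, w in weights.items()))
    return sum(per_sample) / len(per_sample) if per_sample else 0.0


# Hypothetical toy data: four samples, two protected groups.
toy_losses = [0.1, 2.0, 0.5, 1.0]
toy_groups = [0, 1, 0, 1]
toy_weights = adversarial_neighbor_weights(toy_losses, toy_groups, 0)
```

In this sketch, a smaller `temperature` concentrates the weight on the highest-loss neighbors from the other groups, while a very large `temperature` approaches uniform averaging over them.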

updated: Wed Oct 12 2022 23:39:28 GMT+0000 (UTC)

published: Wed Oct 12 2022 23:39:28 GMT+0000 (UTC)

arXiv
