Understanding Gender and Racial Disparities in Image Recognition Models

Rohan Mahadev; Anindya Chakravarti

Large-scale image classification models trained on popular datasets such as ImageNet have been shown to have a distributional skew that leads to disparities in prediction accuracy across different subsections of population demographics. Many approaches have been proposed to address this distributional skew, using methods that alter the model before, during, or after training. We investigate one such approach, which uses a multi-label softmax loss with cross-entropy as the loss function instead of binary cross-entropy, on a multi-label classification problem on the Inclusive Images dataset, which is a subset of the OpenImages V6 dataset. We use the MR2 dataset, which contains images of people with self-identified gender and race attributes, to evaluate the fairness of the model outcomes. We then try to interpret the mistakes by looking at model activations and suggest possible fixes.
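
To make the contrast between the two losses named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the multi-label softmax loss spreads the target probability uniformly over the positive labels (each of k positive labels gets weight 1/k); the abstract does not state this, so treat it as an assumption. The function names and the example logits are hypothetical.

```python
import math


def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy over labels.

    Each label is scored as an independent yes/no decision.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of log(1 + exp(z)) - y * z.
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    return total / len(logits)


def multilabel_softmax_cross_entropy(logits, targets):
    """Cross-entropy between softmax(logits) and a normalized target.

    ASSUMPTION: the target distribution is uniform over the positive
    labels (weight 1/k each); the abstract does not specify this detail.
    """
    k = sum(targets)
    if k == 0:
        raise ValueError("at least one positive label is required")
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum((y / k) * (z - log_norm) for z, y in zip(logits, targets))


if __name__ == "__main__":
    # Hypothetical scores for four labels; the first two are positive.
    logits = [2.0, 0.5, -1.0, -3.0]
    targets = [1, 1, 0, 0]
    print("binary cross-entropy:", binary_cross_entropy(logits, targets))
    print("multi-label softmax:", multilabel_softmax_cross_entropy(logits, targets))
```

Under this assumption the softmax variant makes the labels compete for a single unit of probability mass, whereas binary cross-entropy scores every label independently.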

updated: Tue Jul 20 2021 01:05:31 GMT+0000 (UTC)

published: Tue Jul 20 2021 01:05:31 GMT+0000 (UTC)

arXiv
