A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning

Hugo Berg; Siobhan Mackenzie Hall; Yash Bhalgat; Wonsuk Yang; Hannah Rose Kirk; Aleksandar Shtedritski; Max Bain

Vision-language models can encode societal biases and stereotypes, but measuring and mitigating these multimodal harms is difficult: existing bias measures lack robustness, and debiasing can degrade the learned features. To address these challenges, we investigate bias measures and apply ranking metrics to image-text representations. We then investigate debiasing methods and show that prepending learned embeddings to text queries, with the embeddings jointly trained using adversarial debiasing and a contrastive loss, reduces various bias measures with minimal degradation to the image-text representation.
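To make the idea of a ranking metric for bias concrete, the following is a minimal sketch of a skew-style measure over a ranked retrieval list. It is an illustration, not the paper's implementation: the abstract does not name a specific metric, so the choice of Skew@k (the log ratio of an attribute's observed share in the top k results to a desired share) is an assumption, and the function names, attribute labels, and example ranking are all hypothetical.

```python
import math
from collections import Counter


def skew_at_k(ranked_labels, desired, k):
    """Return Skew@k per attribute: ln(observed share in top-k / desired share).

    ranked_labels: attribute label of each retrieved image, best match first
                   (hypothetical input, e.g. from ranking images by
                   similarity to a text query).
    desired:       mapping from attribute label to its target proportion.
    """
    top = ranked_labels[:k]
    if not top:
        raise ValueError("top-k slice is empty")
    counts = Counter(top)
    skews = {}
    for attr, target in desired.items():
        observed = counts.get(attr, 0) / len(top)
        # An attribute absent from the top-k has skew of negative infinity.
        skews[attr] = math.log(observed / target) if observed > 0 else float("-inf")
    return skews


def max_skew_at_k(ranked_labels, desired, k):
    """Largest skew across attributes; 0 means the top-k matches the target."""
    return max(skew_at_k(ranked_labels, desired, k).values())


# Hypothetical ranking with two attribute groups and a balanced target.
ranked = ["group_a", "group_a", "group_a", "group_b",
          "group_a", "group_b", "group_b", "group_b"]
desired = {"group_a": 0.5, "group_b": 0.5}

top4 = max_skew_at_k(ranked, desired, 4)  # group_a holds 3 of the top 4
top8 = max_skew_at_k(ranked, desired, 8)  # balanced over all 8 results
```

In this toy ranking the top four results over-represent one group, so the maximum skew is positive, while over all eight results the shares match the target and the skew is zero. A debiasing method would aim to push such values toward zero without hurting retrieval quality.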

updated: Wed Oct 26 2022 03:19:13 GMT+0000 (UTC)

published: Tue Mar 22 2022 17:59:04 GMT+0000 (UTC)

arXiv
