ImageCaptioner^2: Image Captioner for Image Captioning Bias Amplification Assessment

Eslam Mohamed Bakr; Pengzhan Sun; Li Erran Li; Mohamed Elhoseiny

Most pre-trained learning systems are known to suffer from bias, which typically emerges from the data, the model, or both. Measuring and quantifying bias and its sources is a challenging task that has been extensively studied in image captioning. Despite the significant effort in this direction, we observe that existing metrics lack consistency in including the visual signal. In this paper, we introduce a new bias assessment metric for image captioning, dubbed ImageCaptioner^2. Instead of measuring the absolute bias in the model or the data, ImageCaptioner^2 focuses on the bias introduced by the model relative to the data bias, termed bias amplification. Unlike existing methods, which evaluate image captioning algorithms based only on the generated captions, ImageCaptioner^2 incorporates the image while measuring the bias. In addition, we formulate the measurement of bias in generated captions as prompt-based image captioning instead of using language classifiers. Finally, we apply our ImageCaptioner^2 metric across 11 different image captioning architectures on three datasets (MS-COCO caption dataset, Artemis V1, and Artemis V2) and on three protected attributes (gender, race, and emotions). We then verify the effectiveness of our ImageCaptioner^2 metric by proposing AnonymousBench, a novel human evaluation paradigm for bias metrics. Our metric shows significant superiority over the recent bias metric LIC in terms of human alignment: the correlation scores are 80% and 54% for our metric and LIC, respectively. The code is available at https://eslambakr.github.io/imagecaptioner2.github.io/.
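The notion of bias amplification described above (bias introduced by the model relative to the bias already in the data) can be illustrated with a minimal sketch. This is not the ImageCaptioner^2 formulation itself, which the abstract does not spell out. The function names `leakage_score` and `bias_amplification`, the confidence-weighted scoring rule, and the toy numbers are all assumptions made for illustration only.

```python
# Illustrative sketch only: NOT the paper's metric. All names and the
# scoring rule below are hypothetical, chosen to show the idea of
# "model bias minus data bias".

def leakage_score(confidences, predictions, labels):
    """Confidence-weighted rate at which an attribute predictor recovers
    the protected attribute (e.g. gender) from a set of captions."""
    if not labels:
        return 0.0
    # Only samples whose attribute was predicted correctly contribute.
    hits = [c for c, p, y in zip(confidences, predictions, labels) if p == y]
    return sum(hits) / len(labels)

def bias_amplification(model_score, data_score):
    """Positive values mean the captioning model leaks more attribute
    information than the human-written (data) captions do."""
    return model_score - data_score

# Toy data: true protected attribute per image (made up).
labels = ["f", "m", "f", "m"]

# Attribute predictions from human-written captions (data bias).
data_score = leakage_score([0.6, 0.7, 0.5, 0.6], ["f", "m", "m", "m"], labels)

# Attribute predictions from model-generated captions (model bias).
model_score = leakage_score([0.9, 0.8, 0.7, 0.8], ["f", "m", "f", "m"], labels)

amplification = bias_amplification(model_score, data_score)
print(f"data={data_score:.3f} model={model_score:.3f} amplification={amplification:.3f}")
```

In this toy setup a positive amplification indicates that the generated captions reveal the protected attribute more strongly than the ground-truth captions do. In the paper's setting, the attribute predictor would additionally take the image into account instead of relying on the caption text alone.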

updated: Mon Jun 05 2023 22:06:07 GMT+0000 (UTC)

published: Mon Apr 10 2023 21:40:46 GMT+0000 (UTC)

arXiv
