Addressing Bias in Face Detectors using Decentralised Data collection with incentives

M. R. Ahan; Robin Lehmann; Richard Blythman

Recent developments in machine learning have shown that successful models rely not only on huge amounts of data but also on the right kind of data. In this paper, we show how this data-centric approach can be facilitated in a decentralized manner to enable efficient data collection for algorithms. Face detectors are a class of models that suffer heavily from bias because they must work on a large variety of data. We also propose a face detection and anonymization approach using a hybrid MultiTask Cascaded CNN with FaceNet Embeddings, and use it to benchmark multiple datasets, describing and evaluating the models' bias towards different ethnicities, genders, and age groups. Finally, we propose ways to improve fairness through a decentralized system of data labeling, correction, and verification by users, creating a robust pipeline for model retraining.

updated: Fri Oct 28 2022 09:54:40 GMT+0000 (UTC)

published: Fri Oct 28 2022 09:54:40 GMT+0000 (UTC)

arXiv
