Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

Yuan-Hong Liao; Amlan Kar; Sanja Fidler

Data is the engine of modern computer vision, which necessitates collecting large-scale datasets. This is expensive, and guaranteeing the quality of the labels is a major challenge. In this paper, we investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images. While methods that exploit learnt models for labeling exist, a surprisingly prevalent approach is to query humans for a fixed number of labels per datum and aggregate them, which is expensive. Building on prior work on online joint probabilistic modeling of human annotations and machine-generated beliefs, we propose modifications and best practices aimed at minimizing human labeling effort. Specifically, we make use of advances in self-supervised learning, view annotation as a semi-supervised learning problem, identify and mitigate pitfalls, and ablate several key design choices to propose effective guidelines for labeling. Our analysis is done in a more realistic simulation that involves querying human labelers, which uncovers issues with evaluation using existing worker simulation methods. Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average, a 2.7x and 6.7x improvement over prior work and manual annotation, respectively. Project page: https://fidler-lab.github.io/efficient-annotation-cookbook
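The idea of combining machine-generated beliefs with human annotations, instead of collecting a fixed number of labels per image, can be illustrated with a minimal sketch. This is not the authors' model: it assumes a simple symmetric worker noise model, and the names `update_belief`, `annotate`, `worker_accuracy`, `threshold`, and `max_queries` are hypothetical, chosen only for illustration. The model's class probabilities act as a prior, each worker label triggers a Bayes update, and querying stops once the belief is confident enough.

```python
import numpy as np


def update_belief(belief, label, worker_accuracy):
    """Bayes update of a class belief given one noisy worker label.

    Assumption (not from the paper): the worker reports the true class with
    probability `worker_accuracy` and every other class uniformly otherwise.
    """
    k = belief.shape[0]
    likelihood = np.full(k, (1.0 - worker_accuracy) / (k - 1))
    likelihood[label] = worker_accuracy
    posterior = belief * likelihood
    return posterior / posterior.sum()


def annotate(model_prior, query_worker, worker_accuracy=0.8,
             threshold=0.9, max_queries=5):
    """Query workers only until the belief is confident enough.

    `model_prior` is the machine-generated belief over classes;
    `query_worker` is a hypothetical callable returning one class index.
    Returns the chosen label and the number of human queries used.
    """
    belief = np.asarray(model_prior, dtype=float)
    belief = belief / belief.sum()
    n_queries = 0
    while belief.max() < threshold and n_queries < max_queries:
        belief = update_belief(belief, query_worker(), worker_accuracy)
        n_queries += 1
    return int(belief.argmax()), n_queries
```

Under these assumptions, an image on which the model is already confident needs no human label at all, which is how an average below one annotation per image becomes possible; an image with an uninformative prior needs a couple of agreeing labels before the threshold is reached.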

updated: Mon Apr 26 2021 16:29:32 GMT+0000 (UTC)

published: Mon Apr 26 2021 16:29:32 GMT+0000 (UTC)

arXiv
