A Benchmark of Long-tailed Instance Segmentation with Noisy Labels

Guanlin Li; Guowen Xu; Tianwei Zhang

In this paper, we consider the instance segmentation task on a long-tailed dataset that contains label noise, i.e., some of the annotations are incorrect. Two main reasons make this setting realistic. First, datasets collected from the real world usually follow a long-tailed distribution. Second, in instance segmentation datasets, a single image contains many instances, some of them tiny, so noise is easily introduced into the annotations. Specifically, we propose a new dataset, a large-vocabulary long-tailed dataset containing label noise for instance segmentation. Furthermore, we evaluate previously proposed instance segmentation algorithms on this dataset. The results indicate that noise in the training dataset hampers the model in learning rare categories and decreases overall performance, and they inspire us to explore more effective approaches to this practical challenge. The code and dataset are available at https://github.com/GuanlinLee/Noisy-LVIS.

updated: Sat Jul 15 2023 08:42:40 GMT+0000 (UTC)

published: Thu Nov 24 2022 06:34:29 GMT+0000 (UTC)

arXiv
