Learning with Free Object Segments for Long-Tailed Instance Segmentation

Cheng Zhang; Tai-Yu Pan; Tianle Chen; Jike Zhong; Wenjin Fu; Wei-Lun Chao

One fundamental challenge in building an instance segmentation model for a large number of classes in complex scenes is the lack of training examples, especially for rare objects. In this paper, we explore the possibility of increasing the number of training examples without laborious data collection and annotation. We find that an abundance of instance segments can potentially be obtained freely from object-centric images, according to two insights: (i) an object-centric image usually contains one salient object in a simple background; (ii) objects from the same class often share similar appearances or similar contrasts to the background. Motivated by these insights, we propose a simple and scalable framework, FreeSeg, for extracting and leveraging these "free" object foreground segments to facilitate model training in long-tailed instance segmentation. Concretely, we employ off-the-shelf object foreground extraction techniques (e.g., image co-segmentation) to generate instance mask candidates, followed by segment refinement and ranking. The resulting high-quality object segments can be used to augment the existing long-tailed dataset, e.g., by copying and pasting the segments onto the original training images. On the LVIS benchmark, we show that FreeSeg yields substantial improvements on top of strong baselines and achieves state-of-the-art accuracy for segmenting rare object categories.
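The copy-and-paste augmentation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `paste_segment`, its parameters, and the toy arrays are all hypothetical, and the abstract does not say how paste locations, scales, or blending are chosen. The sketch only shows the core step of writing masked foreground pixels into a training image and producing the matching instance mask.

```python
import numpy as np

def paste_segment(image, segment, mask, top, left):
    """Paste the masked pixels of `segment` onto a copy of `image`.

    image:   (H, W, 3) uint8 training image
    segment: (h, w, 3) uint8 crop containing the object
    mask:    (h, w) bool foreground mask for the crop
    Returns the augmented image and a full-size (H, W) bool instance mask.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    H, W = image.shape[:2]
    h, w = mask.shape
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("segment must fit inside the image")

    out = image.copy()
    # Basic slicing returns a view, so writing into `region` updates `out`.
    region = out[top:top + h, left:left + w]
    region[mask] = segment[mask]

    # Instance mask of the pasted object in image coordinates.
    full_mask = np.zeros((H, W), dtype=bool)
    full_mask[top:top + h, left:left + w] = mask
    return out, full_mask

# Toy example: a white L-shaped object pasted onto a black 4x4 image.
image = np.zeros((4, 4, 3), dtype=np.uint8)
segment = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[True, False],
                 [True, True]])
augmented, instance_mask = paste_segment(image, segment, mask, top=1, left=1)
```

In a full training pipeline, any existing instance masks that the pasted object overlaps would presumably also need to be updated (for example by clearing the pixels covered by `instance_mask`), and a new annotation with the pasted object's class would be added.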

updated: Tue Feb 22 2022 19:06:16 GMT+0000 (UTC)

published: Tue Feb 22 2022 19:06:16 GMT+0000 (UTC)

arXiv
