Semi-Supervised Semantic Image Segmentation with Self-correcting Networks

Mostafa S. Ibrahim; Arash Vahdat; Mani Ranjbar; William G. Macready

Building a large image dataset with high-quality object masks for semantic segmentation is costly and time-consuming. In this paper, we introduce a principled semi-supervised framework that uses only a small set of fully supervised images (with both semantic segmentation labels and box labels) and a set of images with only object bounding box labels (which we call the weak set). Our framework trains the primary segmentation model with the aid of an ancillary model, which generates initial segmentation labels for the weak set, and a self-correction module, which improves the generated labels during training using the increasingly accurate primary model. We introduce two variants of the self-correction module, using either linear or convolutional functions. Experiments on the PASCAL VOC 2012 and Cityscapes datasets show that our models trained with a small fully supervised set perform similarly to, or better than, models trained with a large fully supervised set while requiring ~7x less annotation effort.
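
The abstract does not spell out how the self-correction module combines the two models. The following is a minimal sketch of one plausible reading of the linear variant, assuming it blends the per-pixel class distributions of the primary and ancillary models as a weighted geometric mean. The function name `self_correct`, the weight `alpha`, and the idea of decaying `alpha` during training are illustrative assumptions, not details taken from the text above.

```python
import math


def self_correct(primary, ancillary, alpha):
    """Blend two class distributions for one pixel (hypothetical sketch).

    primary:   class probabilities from the primary segmentation model
    ancillary: class probabilities from the box-conditioned ancillary model
    alpha:     assumed non-negative weight on the ancillary model; a large
               value trusts the ancillary labels, 0 trusts the primary model
    """
    eps = 1e-12  # guards against log(0)
    # Weighted average in log space is a weighted geometric mean.
    scores = [
        (math.log(p + eps) + alpha * math.log(a + eps)) / (alpha + 1.0)
        for p, a in zip(primary, ancillary)
    ]
    # Renormalize with a numerically stable softmax.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    primary = [0.7, 0.3]
    ancillary = [0.2, 0.8]
    for alpha in (10.0, 1.0, 0.0):  # assumed schedule: alpha shrinks over training
        print(alpha, self_correct(primary, ancillary, alpha))
```

Under this assumption, the soft labels for the weak set start close to the ancillary model's output and drift toward the primary model's own predictions as `alpha` shrinks, which matches the abstract's description of labels being refined by an increasingly accurate primary model.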

updated: Wed Feb 26 2020 04:58:15 GMT+0000 (UTC)

published: Sat Nov 17 2018 01:20:03 GMT+0000 (UTC)

arXiv
