Robust Instance Segmentation through Reasoning about Multi-Object Occlusion

Xiaoding Yuan; Adam Kortylewski; Yihong Sun; Alan Yuille

Analyzing complex scenes with Deep Neural Networks is a challenging task, particularly when images contain multiple objects that partially occlude each other. Existing approaches to image analysis mostly process objects independently and do not take into account the relative occlusion of nearby objects. In this paper, we propose a deep network for multi-object instance segmentation that is robust to occlusion and can be trained from bounding box supervision only. Our work builds on Compositional Networks, which learn a generative model of neural feature activations to locate occluders and to classify objects based on their non-occluded parts. We extend their generative model to include multiple objects and introduce a framework for efficient inference in challenging occlusion scenarios. In particular, we obtain feed-forward predictions of the object classes and their instance and occluder segmentations. We introduce an Occlusion Reasoning Module (ORM) that locates erroneous segmentations and estimates the occlusion ordering to correct them. The improved segmentation masks are, in turn, integrated into the network in a top-down manner to improve the image classification. Our experiments on the KITTI INStance dataset (KINS) and a synthetic occlusion dataset demonstrate the effectiveness and robustness of our model at multi-object instance segmentation under occlusion.
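
To make the idea of occlusion ordering concrete, the following is a minimal toy sketch and not the paper's ORM. It assumes each object comes with a predicted mask (a set of pixel coordinates) and a per-pixel score, and it assumes that, wherever two masks overlap, the object with the higher mean score in the overlap is the one in front. The function name `resolve_overlaps`, the data layout, and the scoring rule are all hypothetical choices made for illustration.

```python
from itertools import combinations


def resolve_overlaps(masks, scores):
    """Toy occlusion ordering (illustrative assumption, not the paper's method).

    masks:  dict mapping object name -> set of (row, col) pixels
    scores: dict mapping object name -> dict of pixel -> float score
    Returns (visible_masks, order), where order is a list of
    (front, back) pairs for every overlapping pair of objects.
    """
    # Copy the masks so the inputs are left unmodified.
    visible = {name: set(pixels) for name, pixels in masks.items()}
    order = []
    for a, b in combinations(sorted(masks), 2):
        overlap = masks[a] & masks[b]
        if not overlap:
            continue
        # Mean score of each object inside the contested region.
        mean_a = sum(scores[a].get(p, 0.0) for p in overlap) / len(overlap)
        mean_b = sum(scores[b].get(p, 0.0) for p in overlap) / len(overlap)
        # Assumed rule: the higher-scoring object is treated as the occluder.
        front, back = (a, b) if mean_a >= mean_b else (b, a)
        # The occluded object loses the contested pixels.
        visible[back] -= overlap
        order.append((front, back))
    return visible, order


# Two objects that both claim pixel (0, 1).
masks = {"car": {(0, 0), (0, 1)}, "truck": {(0, 1), (0, 2)}}
scores = {"car": {(0, 1): 0.9}, "truck": {(0, 1): 0.2}}
visible, order = resolve_overlaps(masks, scores)
```

In this toy case the contested pixel goes to the object that scores higher there, and the other object keeps only its uncontested pixels. A real system would need a more careful ordering criterion than a pairwise mean score.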

updated: Mon Mar 01 2021 13:22:12 GMT+0000 (UTC)

published: Thu Dec 03 2020 17:41:55 GMT+0000 (UTC)

arXiv
