Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss

Jia Li; Jinming Su; Changqun Xia; Mingcan Ma; Yonghong Tian

With the aid of attention mechanisms that weight image features adaptively, recent deep learning-based models encourage the predicted results to approximate the ground-truth masks over as large a predictable area as possible, thus achieving state-of-the-art performance. However, these methods do not pay enough attention to small areas prone to misprediction. As a result, it is still difficult to accurately locate salient objects because of regions with indistinguishable foreground and background and regions with complex or fine structures. To address these problems, we propose a novel convolutional neural network with a purificatory mechanism and a structural similarity loss. Specifically, to better locate preliminary salient objects, we first introduce the promotion attention, which is based on spatial and channel attention mechanisms and promotes attention to salient regions. Subsequently, to restore the indistinguishable regions, which can be regarded as the error-prone regions of a model, we propose the rectification attention, which is learned from the areas of wrong prediction and guides the network to focus on error-prone regions, thus rectifying errors. Through these two attentions, we use the Purificatory Mechanism to impose strict weights on different regions of the whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects. In addition to paying different attention to these hard-to-distinguish regions, we also consider the structural constraints on complex regions and propose the Structural Similarity Loss. In experiments, the proposed approach outperforms 19 state-of-the-art methods on six datasets by a notable margin while running at over 27 FPS on a single NVIDIA 1080Ti GPU.
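The abstract does not give the formula for the Structural Similarity Loss, so the following is only a rough illustration of the general idea of penalizing structural disagreement between a predicted saliency map and a ground-truth mask. It assumes the standard SSIM index computed once over the whole map (a single window) with the usual constants for values in [0, 1]; the function names `ssim_global` and `ssim_loss` are hypothetical, and the paper's actual loss may differ.

```python
from statistics import fmean


def ssim_global(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """Standard SSIM index between two equally sized 2-D maps in [0, 1].

    Assumption: statistics are taken over the whole map (one window),
    which is a simplification and not necessarily the paper's formulation.
    """
    x = [v for row in pred for v in row]
    y = [v for row in target for v in row]
    if not x or len(x) != len(y):
        raise ValueError("maps must be non-empty and of the same size")

    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(pred, target):
    """Loss that is 0 for identical maps and grows as structure diverges."""
    return 1.0 - ssim_global(pred, target)


if __name__ == "__main__":
    mask = [[0.0, 1.0], [1.0, 0.0]]
    inverted = [[1.0, 0.0], [0.0, 1.0]]
    print(ssim_loss(mask, mask))      # identical maps
    print(ssim_loss(mask, inverted))  # structurally opposite maps
```

In a training setting, a term like this would typically be added to a pixel-wise loss such as binary cross-entropy, so that the structure-aware term complements rather than replaces the per-pixel supervision.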

updated: Mon Jul 19 2021 10:19:50 GMT+0000 (UTC)

published: Wed Dec 18 2019 05:49:37 GMT+0000 (UTC)

arXiv

