Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation

Zhaohui Zheng; Ping Wang; Dongwei Ren; Wei Liu; Rongguang Ye; Qinghua Hu; Wangmeng Zuo

Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains in average precision (AP) and average recall (AR) without sacrificing inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance, and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss to better distinguish difficult regression cases. Training deep models with CIoU loss results in consistent AP and AR improvements in comparison to the widely adopted ℓ_n-norm loss and IoU-based loss. Furthermore, we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires fewer iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT and BlendMask-RT) and object detection (e.g., YOLO v3, SSD, and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains of +1.7 AP and +6.2 AR_100 for object detection, and +0.9 AP and +3.5 AR_100 for instance segmentation, with 27.1 FPS on one NVIDIA GTX 1080Ti GPU. All the source code and trained models are available at https://github.com/Zzh-tju/CIoU.
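The abstract names the ingredients of both methods but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the commonly cited CIoU form, 1 - IoU + ρ²/c² + αv, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, v measures aspect-ratio mismatch, and α = v / ((1 - IoU) + v). It also assumes that Cluster-NMS can be written as a fixed-point iteration over an upper-triangular IoU matrix. The function names, the (x1, y1, x2, y2) box layout, and the strict `<` threshold comparison are illustrative choices. Plain Python lists stand in for the GPU tensors a real implementation would use.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def ciou_loss(pred, gt):
    """Sketch of CIoU loss (assumed form): overlap + center distance + aspect ratio.

    Assumes both boxes have positive width and height.
    """
    overlap = iou(pred, gt)
    # Squared distance between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    rho2 = dx * dx + dy * dy
    # Squared diagonal of the smallest box enclosing both (normalizes the distance).
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw * cw + ch * ch
    # Aspect-ratio consistency term and its trade-off weight.
    angle_gt = math.atan((gt[2] - gt[0]) / (gt[3] - gt[1]))
    angle_pred = math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    v = (4 / math.pi ** 2) * (angle_gt - angle_pred) ** 2
    denom = (1 - overlap) + v
    alpha = v / denom if denom > 0 else 0.0
    return 1 - overlap + rho2 / c2 + alpha * v

def cluster_nms(boxes, scores, thresh=0.5):
    """Sketch of matrix-style NMS; returns indices of kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    n = len(order)
    # x[i][j] holds IoU(box_i, box_j) for i < j in score order, else 0.
    x = [[iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
          for j in range(n)] for i in range(n)]
    keep = [1] * n
    for _ in range(n):
        # Zero the rows of currently suppressed boxes, then take each column's max:
        # a box survives if no surviving higher-scored box overlaps it too much.
        new_keep = [1 if max(keep[i] * x[i][j] for i in range(n)) < thresh else 0
                    for j in range(n)]
        if new_keep == keep:  # fixed point reached
            break
        keep = new_keep
    return [order[j] for j in range(n) if keep[j]]
```

Under these assumptions, two unit squares with no overlap, (0, 0, 1, 1) and (2, 0, 3, 1), give a loss of about 1.4, and moving the second box further away increases it. A plain IoU loss would stay at 1 in both cases, which is one sense in which the extra geometric terms help distinguish difficult cases. For boxes (0, 0, 10, 10), (1, 0, 11, 10), and (2, 0, 12, 10) with scores 0.9, 0.8, 0.7 and a threshold of 0.7, `cluster_nms` returns [0, 2]. The third box is suppressed in the first pass, then restored in the second once the box that suppressed it has itself been removed.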

updated: Mon Jul 05 2021 08:21:41 GMT+0000 (UTC)

published: Thu May 07 2020 16:00:27 GMT+0000 (UTC)

arXiv
