DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection

Steven Lang; Fabrizio Ventola; Kristian Kersting

Object detection is a fundamental task in computer vision. While approaches for axis-aligned bounding box detection have made substantial progress in recent years, they perform poorly on oriented objects, which are common in real-world scenarios such as aerial imagery and security camera footage. In these cases, a large part of a predicted bounding box undesirably covers areas that do not belong to the object. Oriented object detection has therefore emerged with the aim of generalizing object detection to arbitrary orientations. This enables a tighter fit to oriented objects and a better separation of bounding boxes, especially when objects are densely distributed. The vast majority of work in this area has focused on complex two-stage anchor-based approaches. Anchors act as priors on the bounding box shape; they require careful per-dataset hyper-parameter tuning, increase model size, and add computational overhead. In this work, we present DAFNe: A Dense one-stage Anchor-Free deep Network for oriented object detection. As a one-stage model, DAFNe performs predictions on a dense grid over the input image, which makes it architecturally simpler, faster, and easier to optimize than its two-stage counterparts. Furthermore, as an anchor-free model, DAFNe reduces prediction complexity by refraining from employing bounding box anchors. Moreover, we introduce an orientation-aware generalization of the center-ness function for arbitrarily oriented bounding boxes to down-weight low-quality predictions, and a center-to-corner bounding box prediction strategy that improves object localization performance. DAFNe improves the prediction accuracy over the previous best one-stage anchor-free model results on DOTA 1.0 by 4.65% mAP, setting a new state-of-the-art result of 76.95% mAP.
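The abstract's claim that axis-aligned boxes cover large non-object areas for oriented objects can be checked with simple geometry. The sketch below is an illustration only and is not taken from the paper: the function name `aabb_fill_ratio` and the example dimensions are hypothetical. It assumes the object is an ideal rectangle of size width × height rotated by a given angle, and computes what fraction of its tightest axis-aligned bounding box the object actually fills.

```python
import math


def aabb_fill_ratio(width: float, height: float, angle_deg: float) -> float:
    """Fraction of the tightest axis-aligned box filled by a rotated rectangle.

    Hypothetical helper for illustration; not part of DAFNe.
    """
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    # Extents of the rotated rectangle projected onto the x and y axes.
    aabb_w = width * c + height * s
    aabb_h = width * s + height * c
    return (width * height) / (aabb_w * aabb_h)


if __name__ == "__main__":
    # An elongated object (e.g. a ship or vehicle seen from above), assumed 10 x 2 units.
    for angle in (0, 15, 30, 45):
        ratio = aabb_fill_ratio(10.0, 2.0, angle)
        print(f"angle={angle:>2} deg  object fills {ratio:.1%} of the axis-aligned box")
```

For elongated objects the fill ratio drops quickly as the rotation angle grows, which is the gap an oriented bounding box closes: it fits the rectangle exactly at any angle.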

updated: Mon Sep 13 2021 17:37:20 GMT+0000 (UTC)

published: Mon Sep 13 2021 17:37:20 GMT+0000 (UTC)

arXiv
