Rank & Sort Loss for Object Detection and Instance Segmentation

Kemal Oksuz; Baris Can Cam; Emre Akbas; Sinan Kalkan

We propose Rank & Sort (RS) Loss, a ranking-based loss function for training deep object detection and instance segmentation methods (i.e. visual detectors). RS Loss supervises the classifier, a sub-network of these methods, to rank each positive above all negatives and to sort the positives among themselves with respect to their continuous localisation qualities (e.g. Intersection-over-Union, IoU). To tackle the non-differentiable nature of ranking and sorting, we reformulate the incorporation of error-driven update with backpropagation as Identity Update, which enables us to model our novel sorting error among positives. With RS Loss, we significantly simplify training: (i) thanks to our sorting objective, the classifier prioritizes positives without an additional auxiliary head (e.g. for centerness, IoU, mask-IoU); (ii) due to its ranking-based nature, RS Loss is robust to class imbalance, so no sampling heuristic is required; and (iii) we address the multi-task nature of visual detectors using tuning-free task-balancing coefficients. Using RS Loss, we train seven diverse visual detectors by tuning only the learning rate, and show that it consistently outperforms baselines: e.g. RS Loss improves (i) Faster R-CNN by ~3 box AP and aLRP Loss (a ranking-based baseline) by ~2 box AP on the COCO dataset, and (ii) Mask R-CNN with repeat factor sampling (RFS) by 3.5 mask AP (~7 AP for rare classes) on the LVIS dataset; it also outperforms all counterparts. Code is available at https://github.com/kemaloksuz/RankSortLoss
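The two objectives named in the abstract (rank every positive above every negative, and sort positives by localisation quality) can be illustrated with a small counting sketch. This is not the RS Loss formulation or the authors' implementation; it only counts pairwise violations of the two orderings. The function name `rank_and_sort_violations` and the label convention (0.0 for a negative, the IoU value for a positive) are assumptions made for this illustration.

```python
def rank_and_sort_violations(scores, labels):
    """Count pairwise violations of the two orderings described in the abstract.

    scores: classifier confidence for each candidate detection.
    labels: 0.0 for a negative, IoU in (0, 1] for a positive
            (an assumed convention for this sketch only).
    Returns (ranking_violations, sorting_violations).
    """
    positives = [i for i, y in enumerate(labels) if y > 0]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Ranking violation: a negative scores at least as high as a positive.
    ranking = sum(
        1 for i in positives for j in negatives if scores[j] >= scores[i]
    )

    # Sorting violation: a positive with lower IoU scores at least as high
    # as a positive with higher IoU.
    sorting = sum(
        1
        for i in positives
        for j in positives
        if labels[j] < labels[i] and scores[j] >= scores[i]
    )
    return ranking, sorting


# Toy example: three positives (IoU 0.5, 0.9, 0.7) and two negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
labels = [0.5, 0.0, 0.9, 0.7, 0.0]
print(rank_and_sort_violations(scores, labels))
```

A classifier that fully satisfied both objectives would give zero violations of either kind on such a set. The loss described in the paper works with continuous ranking and sorting errors rather than raw violation counts.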

updated: Sat Jul 24 2021 18:44:44 GMT+0000 (UTC)

published: Sat Jul 24 2021 18:44:44 GMT+0000 (UTC)

arXiv
