PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices

Guanghua Yu; Qinyao Chang; Wenyu Lv; Chang Xu; Cheng Cui; Wei Ji; Qingqing Dang; Kaipeng Deng; Guanzhong Wang; Yuning Du; Baohua Lai; Qiwen Liu; Xiaoguang Hu; Dianhai Yu; Yanjun Ma

Achieving a better trade-off between accuracy and efficiency has long been a challenging problem in object detection. In this work, we study key optimizations and neural network architecture choices for object detection that improve both accuracy and efficiency. We investigate the applicability of the anchor-free strategy to lightweight object detection models. We enhance the backbone structure and design a lightweight neck, which improves the network's feature extraction ability. We improve the label assignment strategy and loss function to make training more stable and efficient. Through these optimizations, we create a new family of real-time object detectors, named PP-PicoDet, which achieves superior performance on object detection for mobile devices. Our models achieve better trade-offs between accuracy and latency than other popular models. PicoDet-S, with only 0.99M parameters, achieves 30.6% mAP, an absolute improvement of 4.8% mAP over YOLOX-Nano while reducing mobile CPU inference latency by 55%, and an absolute improvement of 7.1% mAP over NanoDet. It reaches 123 FPS (150 FPS using Paddle Lite) on a mobile ARM CPU when the input size is 320. PicoDet-L, with only 3.3M parameters, achieves 40.9% mAP, an absolute improvement of 3.7% mAP over YOLOv5s while being 44% faster. As shown in Figure 1, our models far outperform the state-of-the-art results for lightweight object detection. Code and pre-trained models are available at https://github.com/PaddlePaddle/PaddleDetection.
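The figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the baseline mAP values are derived from the stated absolute improvements rather than reported here, and the FPS-to-latency conversion assumes one image per inference (batch size 1).

```python
# Figures taken from the abstract: mAP in percent, throughput in frames per second.
PICODET_S_MAP = 30.6
PICODET_L_MAP = 40.9


def implied_baseline_map(model_map: float, absolute_gain: float) -> float:
    """Baseline mAP implied by an absolute improvement (derived, not reported here)."""
    return round(model_map - absolute_gain, 1)


def fps_to_latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds, assuming one image per inference."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Baselines implied by the stated absolute improvements.
    print("YOLOX-Nano mAP:", implied_baseline_map(PICODET_S_MAP, 4.8))
    print("NanoDet mAP:   ", implied_baseline_map(PICODET_S_MAP, 7.1))
    print("YOLOv5s mAP:   ", implied_baseline_map(PICODET_L_MAP, 3.7))

    # Per-image latency implied by the stated throughput at input size 320.
    print(f"123 FPS -> {fps_to_latency_ms(123):.2f} ms per image")
    print(f"150 FPS -> {fps_to_latency_ms(150):.2f} ms per image")
```

The "55% lower latency" and "44% faster" claims are left out of the sketch because the abstract does not give the baseline latencies needed to reproduce them.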

updated: Mon Nov 01 2021 12:53:17 GMT+0000 (UTC)

published: Mon Nov 01 2021 12:53:17 GMT+0000 (UTC)

arXiv
