MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection

Zhenhong Sun; Ming Lin; Xiuyu Sun; Zhiyu Tan; Hao Li; Rong Jin

In object detection, the detection backbone consumes more than half of the overall inference cost. Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS). However, existing NAS methods for object detection require hundreds to thousands of GPU hours of searching, making them impractical in fast-paced research and development. In this work, we propose a novel zero-shot NAS method to address this issue. The proposed method, named MAE-DET, automatically designs efficient detection backbones via the Maximum Entropy Principle without training network parameters, reducing the architecture design cost to nearly zero while still delivering state-of-the-art (SOTA) performance. Under the hood, MAE-DET maximizes the differential entropy of detection backbones, leading to a better feature extractor for object detection under the same computational budget. After merely one GPU day of fully automatic design, MAE-DET produces SOTA detection backbones on multiple detection benchmark datasets with little human intervention. Compared to a ResNet-50 backbone, MAE-DET is +2.0% better in mAP when using the same amount of FLOPs/parameters, and is 1.54 times faster on an NVIDIA V100 at the same mAP. Code and pre-trained models are available at https://github.com/alibaba/lightweight-neuralarchitecture-search.
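To make the idea of scoring an untrained network by differential entropy more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the network output is approximately Gaussian, so that its differential entropy can be estimated from the output variance as 0.5 * log(2 * pi * e * variance). The toy linear MLP and the names `gaussian_entropy`, `random_mlp_output_variance`, and `entropy_score` are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_entropy(variance):
    """Differential entropy (in nats) of a 1-D Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def random_mlp_output_variance(widths, n_samples=200, seed=0):
    """Feed Gaussian noise through an untrained linear MLP with random weights
    and return the empirical variance of all output values.

    Assumption: a plain linear MLP stands in for a detection backbone.
    """
    rng = random.Random(seed)
    # One weight matrix per layer, drawn once from N(0, 1); nothing is trained.
    layers = [
        [[rng.gauss(0.0, 1.0) for _ in range(fan_in)] for _ in range(fan_out)]
        for fan_in, fan_out in zip(widths[:-1], widths[1:])
    ]
    outputs = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(widths[0])]
        for weights in layers:
            x = [sum(w * v for w, v in zip(row, x)) for row in weights]
        outputs.extend(x)
    mean = sum(outputs) / len(outputs)
    return sum((o - mean) ** 2 for o in outputs) / len(outputs)


def entropy_score(widths, seed=0):
    """Hypothetical zero-shot score: Gaussian entropy estimate of the outputs."""
    return gaussian_entropy(random_mlp_output_variance(widths, seed=seed))


if __name__ == "__main__":
    # Score two candidate layer-width configurations without any training.
    for candidate in ([8, 8, 8], [8, 16, 8]):
        print(candidate, entropy_score(candidate))
```

In a zero-shot search of this kind, such a score would be computed for many candidate architectures under a fixed FLOPs or parameter budget, and the highest-scoring candidate would be kept. The budget handling and the search procedure are omitted here.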

updated: Wed Jun 15 2022 07:42:44 GMT+0000 (UTC)

published: Fri Nov 26 2021 07:18:52 GMT+0000 (UTC)

arXiv
