AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

Dilin Wang; Meng Li; Chengyue Gong; Vikas Chandra

Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. Recently, two-stage NAS methods such as BigNAS have decoupled model training from the search process and achieved remarkable search efficiency and accuracy. Two-stage NAS requires sampling from the search space during training, which directly impacts the accuracy of the final searched models. While uniform sampling has been widely used for its simplicity, it is agnostic to the model performance Pareto front, which is the main focus of the search process, and thus misses opportunities to further improve model accuracy. In this work, we propose AttentiveNAS, which focuses on improving the sampling strategy to achieve a better performance Pareto front. We also propose algorithms to efficiently and effectively identify the networks on the Pareto front during training. Without extra re-training or post-processing, we can simultaneously obtain a large number of networks across a wide range of FLOPs. Our discovered model family, AttentiveNAS models, achieves top-1 accuracy from 77.3% to 80.7% on ImageNet and outperforms SOTA models, including BigNAS and Once-for-All networks. We also achieve ImageNet accuracy of 80.1% with only 491 MFLOPs. Our training code and pretrained models are available at https://github.com/facebookresearch/AttentiveNAS.
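
To make the notion of a performance Pareto front concrete, here is a minimal Python sketch that filters a set of candidate networks down to the non-dominated ones, where a network is dominated if another one is at least as accurate at no greater FLOPs cost. This is a generic illustration and not the paper's sampling algorithm: the function name `pareto_front`, the `(MFLOPs, accuracy)` tuple layout, and all numeric values are assumptions made up for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (MFLOPs, top-1 accuracy). Not taken from the paper.
Candidate = Tuple[float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one has FLOPs that are lower or equal
    and accuracy that is higher or equal, with at least one strict inequality.
    """
    front: List[Candidate] = []
    best_acc = float("-inf")
    # Visit cheapest networks first; among equal FLOPs, the most accurate first.
    for flops, acc in sorted(candidates, key=lambda c: (c[0], -c[1])):
        # Keep a network only if it beats every cheaper-or-equal network seen so far.
        if acc > best_acc:
            front.append((flops, acc))
            best_acc = acc
    return front


# Made-up illustrative values, not results from the paper.
sampled = [(200.0, 0.70), (300.0, 0.68), (300.0, 0.74), (450.0, 0.74), (500.0, 0.78)]
front = pareto_front(sampled)
```

In this toy set, `(300.0, 0.68)` and `(450.0, 0.74)` are dropped because `(300.0, 0.74)` matches or beats their accuracy at equal or lower cost. A sampling strategy that attends to the Pareto front would spend more training effort on networks near this boundary than uniform sampling does.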

updated: Tue Apr 13 2021 19:17:16 GMT+0000 (UTC)

published: Wed Nov 18 2020 00:15:23 GMT+0000 (UTC)

arXiv
