Multi-objective Neural Architecture Search with Almost No Training

Shengran Hu; Ran Cheng; Cheng He; Zhichao Lu

In recent years, neural architecture search (NAS) has attracted increasing attention from both academia and industry. Despite a steady stream of impressive empirical results, most existing NAS algorithms are computationally prohibitive to run because each candidate architecture must be trained with many costly iterations of stochastic gradient descent (SGD). In this work, we propose an effective alternative, dubbed Random-Weight Evaluation (RWE), to rapidly estimate the performance of network architectures. By training only the last linear classification layer, RWE reduces the computational cost of evaluating an architecture from hours to seconds. When integrated into an evolutionary multi-objective algorithm, RWE obtains a set of efficient architectures with state-of-the-art performance on CIFAR-10 in less than two hours of search on a single GPU card. Ablation studies on rank-order correlations and transfer learning experiments on ImageNet further validate the effectiveness of RWE.
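
The idea in the abstract can be illustrated with a minimal toy sketch in pure Python. It is not the authors' implementation. It assumes a one-hidden-layer "backbone" whose random weights stay frozen, a binary logistic-regression head as the only trained part, synthetic 2-D data in place of CIFAR-10, and hidden width as a stand-in cost objective. All function names (`make_frozen_backbone`, `rwe_score`, `pareto_front`, and so on) are hypothetical, and the simple non-dominated filter only stands in for the evolutionary multi-objective algorithm, which the abstract does not specify.

```python
import math
import random

def make_frozen_backbone(n_in, n_hidden, rng):
    # Randomly initialised hidden layer; it is never updated afterwards.
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]

def extract_features(backbone, x):
    # ReLU(W x) using the frozen random weights W.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in backbone]

def train_linear_head(feats, labels, epochs=50, lr=0.01):
    # Plain SGD on a logistic-regression head; only these weights are trained.
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))  # clipped sigmoid
            g = p - y
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def rwe_score(backbone, train, valid):
    # Proxy score: validation accuracy of the head on frozen random features.
    xs, ys = train
    w, b = train_linear_head([extract_features(backbone, x) for x in xs], ys)
    vx, vy = valid
    correct = 0
    for x, y in zip(vx, vy):
        f = extract_features(backbone, x)
        pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(vy)

def pareto_front(candidates):
    # Keep (error, cost) pairs that no other pair dominates; both are minimised.
    front = []
    for i, (e_i, c_i) in enumerate(candidates):
        dominated = any(
            e_j <= e_i and c_j <= c_i and (e_j < e_i or c_j < c_i)
            for j, (e_j, c_j) in enumerate(candidates) if j != i
        )
        if not dominated:
            front.append((e_i, c_i))
    return front

def toy_data(n, rng):
    # Two Gaussian clusters in 2-D, one per class (assumed toy task).
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 2.0 if y == 1 else -2.0
        xs.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        ys.append(y)
    return xs, ys

rng = random.Random(0)
train, valid = toy_data(200, rng), toy_data(100, rng)
candidates = []
for n_hidden in (2, 8, 32):  # the toy "architectures" differ only in width
    backbone = make_frozen_backbone(2, n_hidden, rng)
    error = 1.0 - rwe_score(backbone, train, valid)
    candidates.append((error, n_hidden))
print(pareto_front(candidates))
```

Because the backbone is never updated, each evaluation costs only a short fit of the linear head, which is where the reduction in evaluation time described in the abstract would come from. The resulting (error, cost) pairs can then be compared on several objectives at once rather than collapsed into a single score.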

updated: Fri Nov 27 2020 07:39:17 GMT+0000 (UTC)

published: Fri Nov 27 2020 07:39:17 GMT+0000 (UTC)

arXiv
