ZARTS: On Zero-order Optimization for Neural Architecture Search

Xiaoxing Wang; Wenxuan Guo; Junchi Yan; Jianlin Su; Xiaokang Yang

Differentiable architecture search (DARTS) is a popular one-shot paradigm for neural architecture search (NAS) because of its high efficiency. It introduces trainable architecture parameters that represent the importance of candidate operations and proposes first- and second-order approximations to estimate their gradients, which makes it possible to solve NAS with gradient descent. However, our in-depth empirical results show that these approximations often distort the loss landscape, leading to a biased optimization objective and, in turn, inaccurate gradient estimates for the architecture parameters. This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, that searches without enforcing the above approximations. Specifically, three representative zero-order optimization methods are introduced: RS, MGS, and GLD, among which MGS performs best by balancing accuracy and speed. Moreover, we explore the connections between RS/MGS and gradient descent and show that ZARTS can be seen as a robust gradient-free counterpart to DARTS. Extensive experiments on multiple datasets and search spaces show the remarkable performance of our method. In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS in settings where the performance of DARTS collapses due to its known instability issue. We also search on the DARTS search space to compare with peer methods; our discovered architecture achieves 97.54% accuracy on CIFAR-10 and 75.7% top-1 accuracy on ImageNet, which is state-of-the-art performance.
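To make the idea of zero-order optimization concrete, the following is a minimal sketch of a generic random-perturbation update that uses only function evaluations, never analytic gradients. It is not the RS, MGS, or GLD procedure from the paper, whose details are not given in the abstract. The toy quadratic `loss`, the function `zero_order_step`, and the settings `num_samples`, `sigma`, and `lr` are all illustrative assumptions standing in for a real validation loss over architecture parameters.

```python
import random


def loss(alpha):
    # Hypothetical stand-in for a validation loss over architecture parameters.
    target = [1.0, -2.0, 0.5]
    return sum((a - t) ** 2 for a, t in zip(alpha, target))


def zero_order_step(alpha, f, rng, num_samples=20, sigma=0.1, lr=0.05):
    """One gradient-free update: estimate a descent direction from loss values only."""
    base = f(alpha)
    grad_est = [0.0] * len(alpha)
    for _ in range(num_samples):
        # Sample a random Gaussian direction and evaluate the perturbed point.
        u = [rng.gauss(0.0, 1.0) for _ in alpha]
        perturbed = [a + sigma * ui for a, ui in zip(alpha, u)]
        # Forward finite difference along u approximates the directional derivative.
        slope = (f(perturbed) - base) / sigma
        for i, ui in enumerate(u):
            grad_est[i] += slope * ui / num_samples
    # Move against the estimated gradient.
    return [a - lr * g for a, g in zip(alpha, grad_est)]


def run(steps=300, seed=0):
    rng = random.Random(seed)
    alpha = [0.0, 0.0, 0.0]
    for _ in range(steps):
        alpha = zero_order_step(alpha, loss, rng)
    return alpha


if __name__ == "__main__":
    final = run()
    print("final parameters:", final)
    print("final loss:", loss(final))
```

Because the update only queries `f`, it does not depend on how the loss is approximated internally, which is the property the abstract appeals to when it describes searching without the first/second-order approximations.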

updated: Sun Oct 10 2021 09:35:15 GMT+0000 (UTC)

published: Sun Oct 10 2021 09:35:15 GMT+0000 (UTC)

arXiv
