FEAR: A Simple Lightweight Method to Rank Architectures

Debadeepta Dey; Shital Shah; Sebastien Bubeck

The fundamental problem in Neural Architecture Search (NAS) is to efficiently find high-performing architectures in a given search space. We propose a simple but powerful method, which we call FEAR, for ranking architectures in any search space. FEAR leverages the viewpoint that neural networks are powerful non-linear feature extractors. First, we train different architectures in the search space to the same training or validation error. Then, we compare the usefulness of the features extracted by each architecture. We do so with a brief round of training that keeps most of the architecture frozen. This gives fast estimates of relative performance. We validate FEAR on the Natsbench topology search space on three different datasets against competing baselines and show strong ranking correlation, especially compared to recently proposed zero-cost methods. FEAR particularly excels at ranking high-performance architectures in the search space. When used in the inner loop of discrete search algorithms like random search, FEAR can cut search time by approximately 2.4X without losing accuracy. We additionally study very recently proposed zero-cost measures for ranking empirically and find that their ranking performance breaks down as training proceeds, and that data-agnostic ranking scores, which ignore the dataset, do not generalize across dissimilar datasets.

updated: Mon Jun 07 2021 23:38:21 GMT+0000 (UTC)

published: Mon Jun 07 2021 23:38:21 GMT+0000 (UTC)

arXiv
