GENNAPE: Towards Generalized Neural Architecture Performance Estimators

Keith G. Mills; Fred X. Han; Jialin Zhang; Fabian Chudak; Ali Safari Mamaghani; Mohammad Salameh; Wei Lu; Shangling Jui; Di Niu

Predicting the performance of a neural architecture is a challenging task and is crucial to neural architecture design and search. Existing approaches either rely on neural performance predictors, which are limited to modeling architectures in a predefined design space with specific sets of operators and connection rules and therefore cannot generalize to unseen architectures, or resort to zero-cost proxies, which are not always accurate. In this paper, we propose GENNAPE, a Generalized Neural Architecture Performance Estimator, which is pretrained on open neural architecture benchmarks and aims to generalize to completely unseen architectures through combined innovations in network representation, contrastive pretraining, and a fuzzy clustering-based predictor ensemble. Specifically, GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations, which can model an arbitrary architecture. It first learns a graph encoder via Contrastive Learning to encourage network separation by topological features, and then trains multiple predictor heads, whose outputs are soft-aggregated according to the fuzzy membership of a neural network. Experiments show that GENNAPE pretrained on NAS-Bench-101 achieves superior transferability to 5 different public neural network benchmarks, including the NAS-Bench-201, NAS-Bench-301, MobileNet and ResNet families, with no or minimal fine-tuning. We further introduce 3 challenging, newly labelled neural network benchmarks: HiAML, Inception and Two-Path, whose architectures can concentrate in narrow accuracy ranges. Extensive experiments show that GENNAPE can correctly discern high-performance architectures in these families. Finally, when paired with a search algorithm, GENNAPE can find architectures that improve accuracy while reducing FLOPs on three families.
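The soft aggregation of predictor heads by fuzzy membership can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the textbook fuzzy C-means membership formula, Euclidean distance between a graph embedding and cluster centers, and one scalar prediction per head. The names `fuzzy_memberships`, `soft_aggregate`, the fuzzifier `m`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def fuzzy_memberships(embedding, centers, m=2.0, eps=1e-12):
    """Fuzzy C-means membership of one embedding to each cluster center.

    Assumption: Euclidean distance and fuzzifier m > 1 (textbook FCM),
    not necessarily the clustering setup used in the paper.
    """
    dists = [math.dist(embedding, c) for c in centers]
    # An embedding sitting exactly on a center belongs fully to it
    # (membership is shared equally if several centers coincide).
    on_center = [d <= eps for d in dists]
    if any(on_center):
        hits = sum(on_center)
        return [1.0 / hits if z else 0.0 for z in on_center]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]  # memberships sum to 1


def soft_aggregate(memberships, head_predictions):
    """Membership-weighted sum of the per-cluster head predictions."""
    return sum(u * p for u, p in zip(memberships, head_predictions))


# Hypothetical example: two clusters, one predictor head per cluster.
centers = [[0.0, 0.0], [2.0, 0.0]]
head_predictions = [0.9, 0.7]  # made-up accuracy estimates from each head
memberships = fuzzy_memberships([0.5, 0.0], centers)
estimate = soft_aggregate(memberships, head_predictions)
```

An embedding closer to the first center receives a larger membership for that cluster, so the first head dominates the final estimate, while the second head still contributes in proportion to its membership.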

updated: Mon Apr 24 2023 20:01:14 GMT+0000 (UTC)

published: Wed Nov 30 2022 18:27:41 GMT+0000 (UTC)

arXiv
