Self-supervised Representation Learning for Evolutionary Neural Architecture Search

Chen Wei; Yiping Tang; Chuang Niu; Haihong Hu; Yue Wang; Jimin Liang

Recently proposed neural architecture search (NAS) algorithms adopt neural predictors to accelerate the architecture search. The ability of a neural predictor to accurately predict the performance metrics of neural architectures is critical to NAS, but acquiring training data for neural predictors is time-consuming. How to obtain a neural predictor with high prediction accuracy from a small amount of training data is therefore a central problem in neural predictor-based NAS. Here, we first design a new architecture encoding scheme that overcomes the drawbacks of existing vector-based architecture encoding schemes and is used to calculate the graph edit distance between neural architectures. To enhance the predictive performance of neural predictors, we devise two self-supervised learning methods, from different perspectives, to pre-train the architecture embedding part of neural predictors so that it generates a meaningful representation of neural architectures. The first method trains a carefully designed two-branch graph neural network model to predict the graph edit distance between two input neural architectures. The second method is inspired by prevalent contrastive learning approaches: we present a new contrastive learning algorithm that uses a central feature vector as a proxy to contrast positive pairs against negative pairs. Experimental results show that the pre-trained neural predictors achieve comparable or superior performance relative to their supervised counterparts while using several times fewer training samples. We achieve state-of-the-art performance on the NASBench-101 and NASBench-201 benchmarks when integrating the pre-trained neural predictors with an evolutionary NAS algorithm.
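
To make the idea of a distance between two architectures concrete, the following is a minimal sketch and not the encoding scheme proposed in the paper, which the abstract does not specify. It assumes each cell is given as an adjacency matrix plus a list of per-node operations with a fixed node ordering, and it counts position-wise differences. A true graph edit distance minimizes over node correspondences, so this position-wise count is only a simple proxy. The function name `approx_edit_distance` and the operation labels are hypothetical.

```python
from typing import List, Sequence

Adjacency = List[List[int]]


def approx_edit_distance(
    adj_a: Adjacency,
    ops_a: Sequence[str],
    adj_b: Adjacency,
    ops_b: Sequence[str],
) -> int:
    """Count differing edges plus differing node operations.

    Assumption: both cells use the same number of nodes in the same order,
    so entries can be compared position by position.
    """
    if len(adj_a) != len(adj_b) or len(ops_a) != len(ops_b):
        raise ValueError("both architectures must use the same number of nodes")

    # Number of adjacency-matrix entries that differ (edge insertions/deletions).
    edge_diff = sum(
        a != b
        for row_a, row_b in zip(adj_a, adj_b)
        for a, b in zip(row_a, row_b)
    )
    # Number of nodes whose operation differs (operation substitutions).
    op_diff = sum(a != b for a, b in zip(ops_a, ops_b))
    return edge_diff + op_diff


# Two hypothetical 4-node cells (labels are illustrative only).
adj_1 = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
ops_1 = ["input", "conv3x3", "conv1x1", "output"]
adj_2 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
ops_2 = ["input", "conv3x3", "maxpool3x3", "output"]

distance = approx_edit_distance(adj_1, ops_1, adj_2, ops_2)
print(distance)
```

In a self-supervised setup of the kind the abstract describes, a quantity like this could serve as a regression target for pairs of architectures, since it requires no trained-network accuracy labels.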

updated: Sat Oct 31 2020 04:57:16 GMT+0000 (UTC)

published: Sat Oct 31 2020 04:57:16 GMT+0000 (UTC)

arXiv
