NAS-Bench-360: Benchmarking Diverse Tasks for Neural Architecture Search

Renbo Tu; Mikhail Khodak; Nicholas Roberts; Ameet Talwalkar

Most existing neural architecture search (NAS) benchmarks and algorithms prioritize performance on well-studied tasks, e.g., image classification on CIFAR and ImageNet. As a result, the applicability of NAS approaches in more diverse areas is inadequately understood. In this paper, we present NAS-Bench-360, a benchmark suite for evaluating state-of-the-art NAS methods for convolutional neural networks (CNNs). To construct it, we curate a collection of ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. By carefully selecting tasks that can interoperate with modern CNN-based search methods yet are far afield from their original development domain, we can use NAS-Bench-360 to investigate the following central question: do existing state-of-the-art NAS methods perform well on diverse tasks? Our experiments show that a modern NAS procedure designed for image classification can indeed find good architectures for tasks with other dimensionalities and learning objectives; however, the same method struggles against more task-specific methods and performs catastrophically poorly on classification in non-vision domains. The case for NAS robustness becomes even more dire in a resource-constrained setting, where a recent NAS method provides little to no benefit over much simpler baselines. These results demonstrate the need for a benchmark such as NAS-Bench-360 to help develop NAS approaches that work well on a variety of tasks, a crucial component of a truly robust and automated pipeline. We conclude with a demonstration of the kind of future research our suite of tasks will enable. All data and code are publicly available.

updated: Sat Oct 16 2021 00:52:02 GMT+0000 (UTC)

published: Tue Oct 12 2021 01:13:18 GMT+0000 (UTC)

arXiv
