Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces

Bert Moons; Parham Noorzad; Andrii Skliar; Giovanni Mariani; Dushyant Mehta; Chris Lott; Tijmen Blankevoort

Today, state-of-the-art Neural Architecture Search (NAS) methods cannot scale to many hardware platforms or scenarios at a low training cost and/or can only handle non-diverse, heavily constrained architectural search spaces. To solve these issues, we present DONNA (Distilling Optimal Neural Network Architectures), a novel pipeline for rapid and diverse NAS that scales to many user scenarios. In DONNA, a search consists of three phases. First, an accuracy predictor is built using blockwise knowledge distillation. This predictor enables searching across diverse networks with varying macro-architectural parameters, such as layer types and attention mechanisms, as well as across micro-architectural parameters, such as block repeats and expansion rates. Second, a rapid evolutionary search phase finds a set of Pareto-optimal architectures for any scenario using the accuracy predictor and on-device measurements. Third, optimal models are quickly finetuned to training-from-scratch accuracy. With this approach, DONNA is up to 100x faster than MNasNet in finding state-of-the-art architectures on-device. On ImageNet classification, DONNA architectures are 20% faster than EfficientNet-B0 and MobileNetV2 on an Nvidia V100 GPU, and 10% faster with 0.5% higher accuracy than MobileNetV2-1.4x on a Samsung S20 smartphone. In addition to NAS, DONNA is used for search-space extension and exploration, as well as hardware-aware model compression.
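
To make the notion of a Pareto-optimal set in the second phase concrete, here is a minimal Python sketch. It is not DONNA's evolutionary search; it only shows how non-dominated architectures could be selected once each candidate has a predicted accuracy and a measured latency. The candidate names, latencies, and accuracies are hypothetical values chosen for illustration, and `pareto_front` is an assumed helper name, not part of any published DONNA code.

```python
from typing import List, Tuple

# (name, latency_ms, predicted_accuracy); all values below are hypothetical.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both latency and accuracy."""
    # Sort by latency ascending; on equal latency, higher accuracy comes first.
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))
    front: List[Candidate] = []
    best_acc = float("-inf")
    for cand in ordered:
        # A slower candidate is only worth keeping if it is strictly more accurate
        # than every faster candidate seen so far.
        if cand[2] > best_acc:
            front.append(cand)
            best_acc = cand[2]
    return front


# Hypothetical candidates: latency would come from on-device measurements,
# accuracy from the accuracy predictor.
candidates = [
    ("a", 10.0, 0.70),
    ("b", 12.0, 0.69),  # slower and less accurate than "a": dominated
    ("c", 15.0, 0.75),
    ("d", 15.0, 0.74),  # same latency as "c" but less accurate: dominated
    ("e", 20.0, 0.75),  # slower than "c" with no accuracy gain: dominated
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

In a full search, a filter like this would be applied repeatedly to populations proposed by the evolutionary algorithm, so that only the accuracy predictor and on-device latency measurements are needed to rank candidates, with no training inside the search loop.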

updated: Fri May 14 2021 08:14:26 GMT+0000 (UTC)

published: Wed Dec 16 2020 11:00:19 GMT+0000 (UTC)

arXiv
