Discovering Parametric Activation Functions

Garrett Bingham; Risto Miikkulainen

Recent studies have shown that the choice of activation function can significantly affect the performance of deep learning networks. However, the benefits of novel activation functions have been inconsistent and task-dependent, so the rectified linear unit (ReLU) is still the most commonly used. This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Evolutionary search is used to discover the general form of the function, and gradient descent is used to optimize its parameters for different parts of the network and over the learning process. Experiments with four different neural network architectures on the CIFAR-10 and CIFAR-100 image classification datasets show that this approach is effective. It discovers both general activation functions and specialized functions for different architectures, consistently improving accuracy over ReLU and other activation functions by significant margins. The approach can therefore be used as an automated optimization step in applying deep learning to new tasks.
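
To make the second half of the idea concrete, the following is a minimal sketch of tuning the parameters of a fixed activation form by gradient descent. It is not the authors' implementation: the form `alpha * x * sigmoid(beta * x)`, the names `activation`, `fit`, and `mse`, the toy regression target, and the learning rate are all assumptions chosen for illustration, and the evolutionary search over functional forms is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def activation(x, alpha, beta):
    """Hypothetical parametric form (an assumption, not taken from the paper)."""
    return alpha * x * sigmoid(beta * x)


def activation_grads(x, alpha, beta):
    """Partial derivatives of the activation with respect to alpha and beta."""
    s = sigmoid(beta * x)
    d_alpha = x * s
    d_beta = alpha * x * x * s * (1.0 - s)
    return d_alpha, d_beta


def mse(xs, ys, alpha, beta):
    """Mean squared error between the activation output and the targets."""
    return sum((activation(x, alpha, beta) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, alpha, beta, lr=0.05, steps=2000):
    """Plain full-batch gradient descent on the two activation parameters."""
    n = len(xs)
    for _ in range(steps):
        g_alpha = 0.0
        g_beta = 0.0
        for x, y in zip(xs, ys):
            err = activation(x, alpha, beta) - y
            d_alpha, d_beta = activation_grads(x, alpha, beta)
            g_alpha += 2.0 * err * d_alpha / n
            g_beta += 2.0 * err * d_beta / n
        alpha -= lr * g_alpha
        beta -= lr * g_beta
    return alpha, beta


# Toy data: inputs in [-3, 3], targets generated by the same form with
# different (hypothetical) parameter values.
xs = [i / 4.0 for i in range(-12, 13)]
ys = [activation(x, 1.5, 2.0) for x in xs]

initial_loss = mse(xs, ys, 1.0, 1.0)
alpha, beta = fit(xs, ys, 1.0, 1.0)
final_loss = mse(xs, ys, alpha, beta)

print("loss before:", initial_loss)
print("loss after: ", final_loss)
print("learned alpha, beta:", alpha, beta)
```

In a real network the same parameters would be trained jointly with the weights by backpropagation, and separate copies could be kept for different layers, which is one way to read "different parts of the network" in the abstract.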

updated: Fri Jan 21 2022 19:39:36 GMT+0000 (UTC)

published: Fri Jun 05 2020 00:25:33 GMT+0000 (UTC)

arXiv
