Recurrent Parameter Generators

Jiayun Wang; Yubei Chen; Stella X. Yu; Brian Cheung; Yann LeCun

We present a generic method for recurrently reusing the same parameters across many different convolution layers to build a deep network. Specifically, for a given network, we create a recurrent parameter generator (RPG), from which the parameters of each convolution layer are generated. Although using recurrent models to build a deep convolutional neural network (CNN) is not entirely new, our method achieves a significant performance gain over existing work. We demonstrate how to build a one-layer neural network that achieves performance similar to that of traditional CNN models on various applications and datasets. The method allows us to build an arbitrarily complex neural network with any number of parameters. For example, we build a ResNet34 with more than 400 times fewer model parameters that still achieves 41.6% ImageNet top-1 accuracy. Furthermore, we demonstrate that the RPG can be applied at different scales, such as layers, blocks, or even sub-networks. Specifically, we use the RPG to build a ResNet18 network whose number of weights is equivalent to that of one convolutional layer of a conventional ResNet, and show that this model can achieve 67.2% ImageNet top-1 accuracy. The proposed method can be viewed as an inverse approach to model compression: rather than removing unused parameters from a large model, it aims to squeeze more information into a small number of parameters. Extensive experimental results are provided to demonstrate the power of the proposed recurrent parameter generator.
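To make the idea of generating every layer's weights from one shared pool concrete, the following is a minimal sketch in plain Python. The abstract does not say how the generator maps shared parameters to layers, so the cyclic read rule here is an assumption, and the names `ParameterBank`, `generate`, `conv_weight_count`, and the layer shapes are all hypothetical and chosen only for illustration.

```python
import random


class ParameterBank:
    """A fixed pool of shared weights (hypothetical stand-in for an RPG)."""

    def __init__(self, size, seed=0):
        rng = random.Random(seed)
        self.weights = [rng.gauss(0.0, 0.1) for _ in range(size)]
        self.cursor = 0  # position where the next layer starts reading

    def generate(self, count):
        """Return `count` weights, wrapping around when the bank runs out.

        Assumption: layers read consecutive entries cyclically. The abstract
        does not specify the actual generation rule.
        """
        size = len(self.weights)
        out = [self.weights[(self.cursor + i) % size] for i in range(count)]
        self.cursor = (self.cursor + count) % size
        return out


def conv_weight_count(in_ch, out_ch, kernel):
    """Number of weights in a square 2D convolution kernel, ignoring bias."""
    return in_ch * out_ch * kernel * kernel


# Hypothetical layer shapes: (input channels, output channels, kernel size).
layer_shapes = [(3, 16, 3), (16, 16, 3), (16, 32, 3), (32, 32, 3)]

# Budget the bank at the size of a single 16 -> 16 convolution layer.
bank = ParameterBank(size=conv_weight_count(16, 16, 3))
layers = [bank.generate(conv_weight_count(*shape)) for shape in layer_shapes]

conventional = sum(conv_weight_count(*shape) for shape in layer_shapes)
shared = len(bank.weights)
print(f"conventional: {conventional}, shared: {shared}, "
      f"reduction: {conventional / shared:.2f}x")
```

In this toy setup the four layers need 16,560 weights when each owns its own, but only 2,304 distinct values are stored, because layers larger than the bank reuse entries. In a trainable version, the bank would be the only learned tensor and gradients from every layer would accumulate on it; that detail is left out here to keep the sketch dependency-free.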

updated: Thu Jul 15 2021 04:23:59 GMT+0000 (UTC)

published: Thu Jul 15 2021 04:23:59 GMT+0000 (UTC)

arXiv
