Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning

Dongseok Shim; H. Jin Kim



Previous studies on image classification have focused mainly on network performance rather than on real-time operation or model compression. We propose the Gaussian Deep Recurrent visual Attention Model (GDRAM), a reinforcement-learning-based lightweight deep neural network for large-scale image classification that outperforms a conventional CNN (Convolutional Neural Network) that takes the entire image as input. Inspired by the biological visual recognition process, our model mimics the stochastic location of the retina with a Gaussian distribution. We evaluate the model on the Large cluttered MNIST, Large CIFAR-10, and Large CIFAR-100 datasets, which are resized to 128 pixels in both width and height.
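The abstract does not spell out how a stochastic glimpse location is drawn, so the following is only a minimal sketch of the general idea: sample a location from a Gaussian around a proposed mean and crop a small patch there instead of processing the whole image. The function name `sample_glimpse`, the normalized [-1, 1] coordinate convention, the zero padding, the patch size of 16, and the standard deviation of 0.2 are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def sample_glimpse(image, mean, std, size, rng):
    """Sample a glimpse center from a Gaussian and crop a square patch.

    Assumption: mean and the sampled location are (x, y) in normalized
    coordinates, where -1 and 1 correspond to opposite image borders.
    """
    h, w = image.shape[:2]
    # Draw a stochastic location around the proposed mean, then clip it
    # so the center always falls inside the image.
    loc = np.clip(rng.normal(mean, std), -1.0, 1.0)
    # Map normalized (x, y) to pixel coordinates of the patch center.
    cx = int(round((loc[0] + 1.0) / 2.0 * (w - 1)))
    cy = int(round((loc[1] + 1.0) / 2.0 * (h - 1)))
    half = size // 2
    # Zero-pad the spatial axes so patches near the border keep their size.
    pad = [(half, half), (half, half)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad)
    # In padded coordinates the center is (cy + half, cx + half).
    patch = padded[cy:cy + size, cx:cx + size]
    return loc, patch

rng = np.random.default_rng(0)
image = np.zeros((128, 128, 3), dtype=np.float32)  # a 128 x 128 input, as in the abstract
loc, patch = sample_glimpse(image, mean=np.array([0.0, 0.0]), std=0.2, size=16, rng=rng)
```

In this sketch the classifier would only ever see the 16 x 16 patch, which is where the computational saving over a full-image CNN would come from. A reinforcement learning setup would additionally need the log-probability of the sampled location to train the policy that proposes `mean`; that part is omitted here.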

updated: Thu Nov 12 2020 04:27:06 GMT+0000 (UTC)

published: Thu Nov 12 2020 04:27:06 GMT+0000 (UTC)

arXiv
