CNN with large memory layers

Rasul Karimov; Yury Malkov; Karim Iskakov; Victor Lempitsky

This work centres on the recently proposed product key memory structure [large_memory], which we implement for a number of computer vision applications. The memory structure can be regarded as a simple computational primitive that can be added to nearly any neural network architecture. The memory block implements sparse access to memory at a cost that scales as the square root of the memory capacity. This scaling is possible because the key space is decomposed into a Cartesian product of subspaces for the nearest neighbour search. We tested the memory layer on classification, image reconstruction and relocalization problems and found that for some of them the memory layers provide a significant speed/accuracy improvement with high utilization of the key-value elements, while others require more careful fine-tuning and suffer from dying keys. To tackle the latter problem, we introduce a simple memory re-initialization technique that removes unused key-value pairs from the memory and engages them in training again. We conducted various experiments and obtained improvements in speed and accuracy for classification and PoseNet relocalization models. We show that re-initialization has a large impact on a toy example of randomly labeled data and observe some performance gains on the image classification task. We also demonstrate that the large memory layers preserve their generalization property on the relocalization problem, and we observe spatial correlations between the images and the selected memory cells.
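The square-root scaling mentioned in the abstract can be illustrated with a minimal sketch of a product-key lookup. This is not the authors' implementation. It assumes dot-product scoring, two sub-key sets of equal size, and a softmax over the selected scores, and all function names (`product_key_search`, `brute_force_search`, `memory_read`) are hypothetical. With n sub-keys per half, the memory holds n² slots, yet the query is compared against only 2n sub-keys plus k² candidate pairs.

```python
import heapq
import math


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def top_k(scores, k):
    """Return the k largest (score, index) pairs, best first."""
    return heapq.nlargest(k, ((s, i) for i, s in enumerate(scores)))


def product_key_search(query, sub_keys_1, sub_keys_2, k):
    """Top-k search over the implicit Cartesian product of two sub-key sets.

    The full key for slot (i, j) is concat(sub_keys_1[i], sub_keys_2[j]),
    so its score is the sum of the two half scores. Only 2n sub-key
    comparisons and k*k candidate sums are needed (assumed setup).
    """
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]
    best_1 = top_k([dot(q1, c) for c in sub_keys_1], k)
    best_2 = top_k([dot(q2, c) for c in sub_keys_2], k)
    n2 = len(sub_keys_2)
    # Every true top-k pair must combine a top-k index from each half.
    candidates = [(s1 + s2, i * n2 + j) for s1, i in best_1 for s2, j in best_2]
    return heapq.nlargest(k, candidates)


def brute_force_search(query, sub_keys_1, sub_keys_2, k):
    """Reference search that scores all n1 * n2 slots explicitly."""
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]
    n2 = len(sub_keys_2)
    scores = [
        (dot(q1, c1) + dot(q2, c2), i * n2 + j)
        for i, c1 in enumerate(sub_keys_1)
        for j, c2 in enumerate(sub_keys_2)
    ]
    return heapq.nlargest(k, scores)


def memory_read(query, sub_keys_1, sub_keys_2, values, k):
    """Sparse read: softmax-weighted sum of the k selected value vectors."""
    selected = product_key_search(query, sub_keys_1, sub_keys_2, k)
    top = max(s for s, _ in selected)
    weights = [math.exp(s - top) for s, _ in selected]  # stabilised softmax
    total = sum(weights)
    dim = len(values[0])
    out = [0.0] * dim
    for w, (_, slot) in zip(weights, selected):
        for d in range(dim):
            out[d] += (w / total) * values[slot][d]
    return out
```

Here `values` is a list of n² value vectors indexed by the flattened slot number `i * n2 + j`. Only the k selected slots contribute to the output, which is what makes the access sparse.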

updated: Mon Apr 26 2021 09:42:58 GMT+0000 (UTC)

published: Wed Jan 27 2021 20:58:20 GMT+0000 (UTC)

arXiv
