Training DNNs in O(1) memory with MEM-DFA using Random Matrices

Tien Chu; Kamil Mykitiuk; Miron Szewczyk; Adam Wiktor; Zbigniew Wojna

This work presents a method that reduces memory consumption to constant complexity when training deep neural networks. The algorithm builds on two more biologically plausible alternatives to backpropagation (BP): direct feedback alignment (DFA) and feedback alignment (FA), both of which use random matrices to propagate the error. The proposed method, memory-efficient direct feedback alignment (MEM-DFA), exploits the greater independence of layers in DFA to avoid storing all activation vectors at once, unlike standard BP, FA, and DFA. As a result, the algorithm's memory usage is constant regardless of the number of layers in the network. The method increases the computational cost only by a constant factor: one extra forward pass. MEM-DFA, BP, FA, and DFA were evaluated, together with their memory profiles, on the MNIST and CIFAR-10 datasets using various neural network models. The experiments agree with the theoretical results and show a significant decrease in the memory cost of MEM-DFA compared to the other algorithms.
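The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is one reading of the abstract, not the authors' implementation: the function names (`init_network`, `mem_dfa_step`, `dfa_step_stored`), the tanh hidden layers, the linear output, the squared-error loss, and the learning rate are all assumptions made for illustration. The first forward pass keeps only the output error; the second forward pass recomputes each layer's activation and updates that layer immediately, using a fixed random feedback matrix. For comparison, a conventional DFA step that stores every activation is included.

```python
import numpy as np

def init_network(sizes, rng):
    """Hypothetical helper: forward weights plus fixed random feedback matrices."""
    weights = [rng.normal(0.0, 1.0 / np.sqrt(m), size=(n, m))
               for m, n in zip(sizes[:-1], sizes[1:])]
    # One feedback matrix per hidden layer, mapping the output error to that layer.
    feedback = [rng.normal(0.0, 1.0 / np.sqrt(sizes[-1]), size=(n, sizes[-1]))
                for n in sizes[1:-1]]
    return weights, feedback

def mem_dfa_step(weights, feedback, x, y, lr=0.05):
    """Two forward passes; only the current and previous activation are held."""
    # Pass 1: compute the output error without storing hidden activations.
    h = x
    for W in weights[:-1]:
        h = np.tanh(W @ h)
    e = weights[-1] @ h - y  # gradient of 0.5 * ||out - y||^2 w.r.t. the output

    # Pass 2: recompute each activation and update the layer right away.
    h_prev = x
    for W, B in zip(weights[:-1], feedback):
        h = np.tanh(W @ h_prev)            # computed with the pre-update weights
        delta = (B @ e) * (1.0 - h ** 2)   # error projected by the random matrix
        W -= lr * np.outer(delta, h_prev)  # in-place update of this layer
        h_prev = h                         # older activations are discarded
    weights[-1] -= lr * np.outer(e, h_prev)
    return 0.5 * float(e @ e)

def dfa_step_stored(weights, feedback, x, y, lr=0.05):
    """Reference DFA step that stores all activations (memory grows with depth)."""
    acts = [x]
    for W in weights[:-1]:
        acts.append(np.tanh(W @ acts[-1]))
    e = weights[-1] @ acts[-1] - y
    for i, (W, B) in enumerate(zip(weights[:-1], feedback)):
        delta = (B @ e) * (1.0 - acts[i + 1] ** 2)
        W -= lr * np.outer(delta, acts[i])
    weights[-1] -= lr * np.outer(e, acts[-1])
    return 0.5 * float(e @ e)
```

Because each hidden layer's update in DFA depends only on the output error and that layer's own input and activation, both functions apply the same weight update in this sketch; the difference is that `mem_dfa_step` trades one extra forward pass for not keeping the list `acts`.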

updated: Mon Dec 21 2020 23:27:40 GMT+0000 (UTC)

published: Mon Dec 21 2020 23:27:40 GMT+0000 (UTC)

arXiv
