DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training

Joya Chen; Kai Xu; Yuhui Wang; Yifei Cheng; Angela Yao

A standard hardware bottleneck when training deep neural networks is GPU memory. The bulk of memory is occupied by caching intermediate tensors for gradient computation in the backward pass. We propose a novel method to reduce this footprint: Dropping Intermediate Tensors (DropIT). DropIT drops the min-k elements of the intermediate tensors and approximates gradients from the sparsified tensors in the backward pass. Theoretically, DropIT reduces noise on estimated gradients and therefore has a higher rate of convergence than vanilla SGD. Experiments show that we can drop up to 90% of the intermediate tensor elements in fully-connected and convolutional layers while achieving higher testing accuracy for Visual Transformers and Convolutional Neural Networks on various tasks (e.g., classification, object detection, instance segmentation). Our code and models are available at https://github.com/chenjoya/dropit.
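To make the idea concrete, here is a minimal NumPy sketch of min-k dropping for a single fully-connected layer. It is an illustration under stated assumptions, not the authors' implementation (see the repository above for that): it assumes "min-k" means the elements with the smallest absolute value, and the names `drop_min_k`, `restore`, `linear_forward`, `linear_backward`, and the `drop_rate` parameter are all hypothetical.

```python
import numpy as np


def drop_min_k(x, drop_rate):
    """Keep only the largest-magnitude elements of x as (indices, values, shape).

    Assumption: "min-k" elements are those with the smallest absolute value.
    """
    flat = x.ravel()
    n_keep = max(1, int(round(flat.size * (1.0 - drop_rate))))
    split = flat.size - n_keep
    # argpartition places the n_keep largest magnitudes at positions >= split.
    keep_idx = np.argpartition(np.abs(flat), split)[split:]
    return keep_idx, flat[keep_idx], x.shape


def restore(keep_idx, values, shape):
    """Rebuild a dense tensor in which every dropped element is zero."""
    dense = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    dense[keep_idx] = values
    return dense.reshape(shape)


def linear_forward(x, w, drop_rate):
    """Compute y = x @ w.T exactly, but cache only a sparsified copy of x."""
    y = x @ w.T
    cache = drop_min_k(x, drop_rate)
    return y, cache


def linear_backward(grad_y, w, cache):
    """Input gradient is exact; weight gradient is approximated from the cache."""
    x_sparse = restore(*cache)
    grad_w = grad_y.T @ x_sparse  # approximate: uses the sparsified input
    grad_x = grad_y @ w           # exact: does not depend on the cached input
    return grad_x, grad_w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16))
    w = rng.standard_normal((4, 16))
    y, cache = linear_forward(x, w, drop_rate=0.9)
    grad_x, grad_w = linear_backward(np.ones_like(y), w, cache)
    print("cached elements:", cache[1].size, "of", x.size)
```

In this sketch only the weight gradient is affected by the dropping, because the gradient with respect to the layer input depends on the weights rather than on the cached activation. The memory saving comes from storing the kept values and their indices instead of the full tensor.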

updated: Thu Mar 02 2023 13:44:43 GMT+0000 (UTC)

published: Mon Feb 28 2022 14:12:00 GMT+0000 (UTC)

arXiv
