A Survey on Efficient Training of Transformers

Bohan Zhuang; Jing Liu; Zizheng Pan; Haoyu He; Yuetian Weng; Chunhua Shen

Recent advances in Transformers come with huge requirements for computing resources, highlighting the importance of developing efficient training techniques that make Transformer training faster, cheaper, and more accurate through the efficient use of computation and memory resources. This survey provides the first systematic overview of efficient Transformer training, covering recent progress in acceleration arithmetic and hardware, with a focus on the former. We analyze and compare methods that save computation and memory costs for intermediate tensors during training, together with techniques for hardware/algorithm co-design. Finally, we discuss challenges and promising areas for future research.

updated: Thu Feb 02 2023 13:58:18 GMT+0000 (UTC)

published: Thu Feb 02 2023 13:58:18 GMT+0000 (UTC)

arXiv
