Learn Faster and Forget Slower via Fast and Stable Task Adaptation

Farshid Varno; Lucas May Petry; Lisa Di Jorio; Stan Matwin

Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current fine-tuning techniques make pretrained models catastrophically forget the transferred knowledge even before anything about the new task is learned. Such rapid knowledge loss undermines the merits of transfer learning and may result in a much slower convergence rate than when the maximum amount of knowledge is exploited. We investigate the source of this problem from different perspectives and, to alleviate it, introduce Fast And Stable Task-adaptation (FAST), an easy-to-apply fine-tuning algorithm. The paper provides a novel geometric perspective on how the loss landscapes of the source and target tasks are linked under different transfer learning strategies. We empirically show that, compared to prevailing fine-tuning practices, FAST learns the target task faster and forgets the source task slower.

updated: Sun Nov 29 2020 16:01:50 GMT+0000 (UTC)

published: Thu Jul 02 2020 21:13:55 GMT+0000 (UTC)

arXiv
