Forget-free Continual Learning with Soft-Winning SubNetworks

Haeyong Kang; Jaehong Yoon; Sultan Rizky Madjid; Sung Ju Hwang; Chang D. Yoo

Inspired by the Regularized Lottery Ticket Hypothesis (RLTH), which states that competitive smooth (non-binary) subnetworks exist within a dense network in continual learning tasks, we investigate two proposed architecture-based continual learning methods that sequentially learn and select adaptive binary subnetworks (WSN) and non-binary soft subnetworks (SoftNet) for each task. WSN and SoftNet jointly learn the regularized model weights and the task-adaptive non-binary masks of the subnetworks associated with each task, while attempting to select a small set of weights to be activated (the winning ticket) by reusing weights of prior subnetworks. Our proposed WSN and SoftNet are inherently immune to catastrophic forgetting, as each selected subnetwork model does not infringe upon other subnetworks in Task Incremental Learning (TIL). In TIL, the binary masks spawned per winning ticket are encoded into one N-bit binary digit mask and then compressed using Huffman coding, so that network capacity grows sub-linearly with the number of tasks. Surprisingly, in the inference step, SoftNet, generated by injecting small noise into the background of the acquired WSN while keeping the WSN foreground, provides excellent forward transfer power for future tasks in TIL. SoftNet also proves more effective than WSN at regularizing parameters to tackle overfitting to a few examples in Few-shot Class Incremental Learning (FSCIL).
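The mask-compression step mentioned in the abstract (per-task binary masks packed into one N-bit mask, then Huffman coded) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pack_masks`, `unpack_mask`, and `huffman_code`, the flat mask layout, and the synthetic correlated masks are all assumptions made for illustration. The sketch stores bit t of each per-weight integer as task t's mask bit, then builds a Huffman code over the resulting integers.

```python
import heapq
from collections import Counter

import numpy as np


def pack_masks(masks):
    """Pack (n_tasks, n_weights) 0/1 masks into one integer per weight.

    Bit t of each integer holds task t's mask bit (hypothetical layout).
    """
    masks = np.asarray(masks, dtype=np.int64)
    n_tasks = masks.shape[0]
    bit_values = (1 << np.arange(n_tasks, dtype=np.int64))[:, None]
    return (masks * bit_values).sum(axis=0)


def unpack_mask(packed, task):
    """Recover the binary mask of a single task from the packed integers."""
    return (packed >> task) & 1


def huffman_code(symbols):
    """Build a Huffman code; returns a dict mapping symbol -> bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Synthetic example (assumption): tasks largely reuse the same weights,
# so the packed symbols are far from uniformly distributed.
rng = np.random.default_rng(0)
n_tasks, n_weights = 4, 1000
base = rng.random(n_weights) < 0.3
masks = np.stack(
    [base ^ (rng.random(n_weights) < 0.05) for _ in range(n_tasks)]
).astype(np.int64)

packed = pack_masks(masks)
codes = huffman_code(packed.tolist())
encoded_bits = sum(len(codes[s]) for s in packed.tolist())
raw_bits = n_tasks * n_weights  # one bit per task per weight, uncompressed
print(f"raw: {raw_bits} bits, Huffman payload: {encoded_bits} bits")
```

The payload count above excludes the code table itself. The saving in this sketch comes from the overlap between task masks: when later tasks reuse weights selected by earlier ones, a few packed symbols dominate and receive short codes.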

updated: Mon Mar 27 2023 07:53:23 GMT+0000 (UTC)

published: Mon Mar 27 2023 07:53:23 GMT+0000 (UTC)

arXiv
