Continual Learning with Dependency Preserving Hypernetworks

Dupati Srikar Chandra; Sakshi Varshney; P. K. Srijith; Sunil Gupta

Humans learn continually throughout their lifespan by accumulating diverse knowledge and fine-tuning it for future tasks. When presented with a similar goal, neural networks suffer from catastrophic forgetting if data distributions across sequential tasks are not stationary over the course of learning. An effective approach to address such continual learning (CL) problems is to use hypernetworks, which generate task-dependent weights for a target network. However, the continual learning performance of existing hypernetwork-based approaches is affected by the assumption that the weights are independent across layers, an assumption made to maintain parameter efficiency. To address this limitation, we propose a novel approach that uses a dependency-preserving hypernetwork to generate weights for the target network while also maintaining parameter efficiency. We propose to use a recurrent neural network (RNN) based hypernetwork that can generate layer weights efficiently while allowing for dependencies across them. In addition, we propose novel regularisation and network growth techniques for the RNN-based hypernetwork to further improve continual learning performance. To demonstrate the effectiveness of the proposed methods, we conducted experiments on several image classification continual learning tasks and settings. We found that the proposed methods based on RNN hypernetworks outperformed the baselines in all these CL settings and tasks.
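The core idea in the abstract, a recurrent hypernetwork that emits the target network's weights one layer at a time so that later layers can depend on earlier ones, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the plain tanh recurrent cell, the per-layer output heads, the layer sizes, and every name (`generate_weights`, `target_forward`, `task_embeddings`) are assumptions made only for illustration, and the regularisation and network growth techniques are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
TASK_DIM, HIDDEN_DIM = 8, 16
LAYER_SHAPES = [(4, 6), (6, 6), (6, 3)]  # weight shapes of a small target MLP

# Hypernetwork parameters (assumed design): one recurrent cell shared across
# all target layers, plus one linear output head per target layer.
W_in = rng.normal(0.0, 0.1, (HIDDEN_DIM, TASK_DIM))
W_rec = rng.normal(0.0, 0.1, (HIDDEN_DIM, HIDDEN_DIM))
heads = [rng.normal(0.0, 0.1, (r * c, HIDDEN_DIM)) for r, c in LAYER_SHAPES]


def generate_weights(task_embedding):
    """Run one recurrent step per target layer and emit that layer's weights."""
    h = np.zeros(HIDDEN_DIM)
    weights = []
    for (rows, cols), head in zip(LAYER_SHAPES, heads):
        # The hidden state h carries information from earlier layers forward,
        # so the weights of layer k depend on the steps taken for layers < k.
        h = np.tanh(W_in @ task_embedding + W_rec @ h)
        weights.append((head @ h).reshape(rows, cols))
    return weights


def target_forward(x, weights):
    """Tiny ReLU MLP target network that uses the generated weights."""
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)
    return x @ weights[-1]


# One embedding per task; in a trained system these would be learned.
task_embeddings = {t: rng.normal(size=TASK_DIM) for t in range(2)}

x = rng.normal(size=(5, 4))  # batch of 5 inputs with 4 features
logits = target_forward(x, generate_weights(task_embeddings[0]))
```

Switching tasks in this sketch only requires switching the task embedding passed to `generate_weights`; the target network itself stores no task-specific weights.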

updated: Fri Sep 16 2022 04:42:21 GMT+0000 (UTC)

published: Fri Sep 16 2022 04:42:21 GMT+0000 (UTC)

arXiv
