SPACE: Structured Compression and Sharing of Representational Space for Continual Learning

Gobinda Saha; Isha Garg; Aayush Ankit; Kaushik Roy

Humans learn adaptively and efficiently throughout their lives. However, when artificial neural networks learn tasks incrementally, they overwrite relevant information learned about older tasks, resulting in 'Catastrophic Forgetting'. Efforts to overcome this phenomenon often use resources poorly, for instance by growing the network architecture or by storing parametric importance scores, or they violate data privacy between tasks. To tackle this, we propose SPACE, an algorithm that enables a network to learn continually and efficiently by partitioning the learnt space into a Core space, which serves as the condensed knowledge base over previously learned tasks, and a Residual space, which is akin to a scratch space for learning the current task. After learning each task, the Residual is analyzed for redundancy, both within itself and with the learnt Core space. The minimal number of extra dimensions required to explain the current task is added to the Core space, and the remaining Residual is freed up for learning the next task. We evaluate our algorithm on P-MNIST, CIFAR and a sequence of 8 different datasets, and achieve accuracy comparable to state-of-the-art methods while overcoming catastrophic forgetting. Additionally, our algorithm is well suited for practical use. The partitioning algorithm analyzes all layers in one shot, ensuring scalability to deeper networks. Moreover, the analysis of dimensions translates to filter-level sparsity, and the structured nature of the resulting architecture gives us up to 5x improvement in energy efficiency during task inference over the current state-of-the-art.
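
The Core/Residual partitioning described above can be illustrated with a small sketch. The abstract does not spell out how redundancy is measured, so this example assumes an SVD-based (PCA-style, uncentered) analysis of layer activations with an explained-energy threshold. The function name `extend_core`, the parameter `var_threshold`, and the array layout are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def extend_core(core, acts, var_threshold=0.95):
    """Grow an orthonormal Core basis by the fewest extra directions.

    core: (d, k) orthonormal basis of the Core space (k may be 0).
    acts: (n, d) activations collected on the current task.
    var_threshold: assumed fraction of activation energy that Core plus
        the new directions must explain (hypothetical parameter).
    Returns a (d, k + m) orthonormal basis with m as small as possible.
    """
    total = np.sum(acts ** 2)
    # Part of the activations already explained by the Core space.
    in_core = acts @ core @ core.T
    explained = np.sum(in_core ** 2)
    if explained >= var_threshold * total:
        return core  # current task is already covered; Core is unchanged
    # What is left lives in the Residual space; rank its directions by energy.
    residual = acts - in_core
    _, s, vt = np.linalg.svd(residual, full_matrices=False)
    energy = explained + np.cumsum(s ** 2)
    # Smallest number of residual directions that reaches the threshold.
    m = int(np.searchsorted(energy, var_threshold * total)) + 1
    m = min(m, len(s))
    return np.hstack([core, vt[:m].T])

# Toy usage with 6-dimensional activations.
core = np.zeros((6, 0))                      # empty Core before any task
task1 = np.eye(6)[[0, 1]]                    # task 1 uses dimensions 0 and 1
core = extend_core(core, task1)              # Core grows to 2 dimensions
task2 = np.eye(6)[[1, 2]]                    # task 2 reuses dim 1, adds dim 2
core = extend_core(core, task2)              # only 1 new dimension is added
```

In this toy run the second task shares one direction with the Core, so only the single unexplained direction is appended and the remaining dimensions stay free for later tasks. A per-layer version for convolutional filters would need the activations reshaped to a 2-D matrix first, which is left out here.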

updated: Wed Feb 03 2021 06:23:33 GMT+0000 (UTC)

published: Thu Jan 23 2020 16:40:56 GMT+0000 (UTC)

arXiv
