Continual Learning with Optimal Transport based Mixture Model

Quyen Tran; Hoang Phan; Khoat Than; Dinh Phung; Trung Le

Online Class-Incremental Learning (CIL) is a challenging setting in Continual Learning (CL): data for new tasks arrive in streams, and an online learning model must process each incoming stream without revisiting previous ones. Existing works characterize each class with a single centroid that is adapted as data arrive. This approach can be limiting when the incoming data stream of a class is naturally multimodal. To address this issue, we first propose OT-MM, an online mixture model learning approach built on well-established properties of optimal transport theory. Specifically, the centroids and covariance matrices of the mixture model are adapted incrementally according to the incoming data streams. The advantages are two-fold: (i) complex data streams can be characterized more accurately, and (ii) by using the per-class centroids produced by OT-MM, the similarity of an unseen example to each class can be estimated more reasonably at inference time. Moreover, to combat catastrophic forgetting in the CIL scenario, we further propose Dynamic Preservation. After the dynamic preservation technique is applied across data streams, the latent representations of the classes in the old and new tasks become more compact within each class and better separated from one another. Together with a contraction feature extractor, this technique helps the model mitigate catastrophic forgetting. Experimental results on real-world datasets show that the proposed method can significantly outperform the current state-of-the-art baselines.
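To make the multimodality argument concrete, the following is a minimal sketch of a per-class mixture whose centroids and per-dimension variances are updated one sample at a time, with inference by distance to the nearest centroid of each class. It is not the authors' OT-MM: the optimal-transport-based assignment is replaced here by a plain hard nearest-centroid assignment, the covariance is assumed diagonal, and the names `OnlineClassMixture`, `classify`, and the toy streams are hypothetical.

```python
import math


class OnlineClassMixture:
    """Per-class centroids with diagonal variances, updated one sample at a time.

    Assumption: hard nearest-centroid assignment stands in for the
    optimal-transport-based assignment described in the abstract.
    """

    def __init__(self, num_components, dim):
        self.k = num_components
        self.dim = dim
        self.means = []   # one centroid (list of floats) per component
        self.m2 = []      # running sum of squared deviations per dimension
        self.counts = []  # number of samples assigned to each component

    @staticmethod
    def _sq_dist(x, mean):
        return sum((a - b) ** 2 for a, b in zip(x, mean))

    def update(self, x):
        # Seed the components with the first k samples of the stream.
        if len(self.means) < self.k:
            self.means.append(list(x))
            self.m2.append([0.0] * self.dim)
            self.counts.append(1)
            return
        # Assign the sample to its nearest centroid, then apply a
        # Welford-style running update of that centroid and its variance.
        j = min(range(self.k), key=lambda i: self._sq_dist(x, self.means[i]))
        self.counts[j] += 1
        n = self.counts[j]
        for d in range(self.dim):
            delta = x[d] - self.means[j][d]
            self.means[j][d] += delta / n
            self.m2[j][d] += delta * (x[d] - self.means[j][d])

    def variances(self):
        # Population variance per component and dimension.
        return [[s / c for s in m2] for m2, c in zip(self.m2, self.counts)]

    def distance(self, x):
        # Distance from x to the closest centroid of this class.
        return math.sqrt(min(self._sq_dist(x, m) for m in self.means))


def classify(x, mixtures):
    """Return the label whose mixture has the centroid nearest to x."""
    return min(mixtures, key=lambda label: mixtures[label].distance(x))


# Toy streams: class "a" has two modes, class "b" sits between them.
stream_a = [(0.0, 0.0), (10.0, 10.0), (0.2, -0.2),
            (9.8, 10.2), (-0.2, 0.2), (10.2, 9.8)]
stream_b = [(5.0, 5.0), (5.2, 4.8), (4.8, 5.2), (5.0, 5.1)]

mixtures = {"a": OnlineClassMixture(2, 2), "b": OnlineClassMixture(2, 2)}
for point in stream_a:
    mixtures["a"].update(point)
for point in stream_b:
    mixtures["b"].update(point)

print(classify((9.5, 10.0), mixtures))
print(classify((5.1, 5.0), mixtures))
```

In this toy setup a single centroid for class "a" would land near (5, 5), on top of class "b", whereas two centroids keep both modes of "a" distinguishable. Each sample is seen once and then discarded, which matches the online constraint of not revisiting earlier data.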

updated: Mon Dec 05 2022 16:59:34 GMT+0000 (UTC)

published: Wed Nov 30 2022 06:40:29 GMT+0000 (UTC)

arXiv
