HePCo: Data-Free Heterogeneous Prompt Consolidation for Continual Federated Learning

Shaunak Halbe; James Seale Smith; Junjiao Tian; Zsolt Kira

In this paper, we focus on the important yet understudied problem of Continual Federated Learning (CFL), in which a server communicates with a set of clients to incrementally learn new concepts over time without sharing or storing any data. The problem combines challenges from both continual learning and federated learning. Specifically, models trained in a CFL setup suffer from catastrophic forgetting, which is exacerbated by data heterogeneity across clients. Existing approaches tend either to impose large overheads on clients and communication channels or to require access to stored data, which makes them unsuitable for real-world use because of privacy concerns. We attempt to tackle forgetting and heterogeneity while minimizing overhead costs and without requiring access to any stored data. We achieve this by leveraging a prompting-based approach, in which only prompts and classifier heads have to be communicated, and by proposing a novel and lightweight generation and distillation scheme to consolidate client models at the server. We formulate this problem for image classification, establish strong baselines for comparison, and conduct experiments on CIFAR-100 as well as challenging, large-scale datasets such as ImageNet-R and DomainNet. Our approach outperforms both existing methods and our own baselines by as much as 7% while significantly reducing communication and client-level computation costs.

updated: Fri Jun 16 2023 17:02:12 GMT+0000 (UTC)

published: Fri Jun 16 2023 17:02:12 GMT+0000 (UTC)

arXiv
