Conquering the Communication Constraints to Enable Large Pre-Trained Models in Federated Learning

Guangyu Sun; Umar Khalid; Matias Mendieta; Taojiannan Yang; Chen Chen

Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices. In the typical FL paradigm (e.g., FedAvg), model weights are exchanged between the server and the participating clients each round. Recently, the use of small pre-trained models has been shown to be effective for federated learning optimization and for improving convergence. However, recent state-of-the-art pre-trained models are becoming more capable but also have more parameters. In conventional FL, sharing such enormous model weights can quickly place a massive communication burden on the system, especially if more capable models are employed. Can we find a solution that enables those strong and readily available pre-trained models in FL to achieve excellent performance while simultaneously reducing the communication burden? To this end, we investigate the use of parameter-efficient fine-tuning in federated learning and introduce a new framework: FedPEFT. Specifically, we systematically evaluate the performance of FedPEFT across a variety of client stability, data distribution, and differential privacy settings. By locally tuning and globally sharing only a small portion of the model weights, the total communication overhead can be reduced significantly while maintaining competitive or even better performance in a wide range of federated learning scenarios, providing insight into a new paradigm for practical and effective federated systems.
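The idea of tuning and sharing only a small portion of the weights can be pictured with a toy sketch. The Python below is not the FedPEFT implementation; it is a minimal illustration under stated assumptions: the parameter names (`backbone`, `bias`, `head`), their sizes, the `local_update` stand-in for client training, and the client dataset sizes are all hypothetical. It applies a FedAvg-style weighted average to the trainable subset only and compares the number of values a client would send per round against sending the whole model.

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]

def fedavg(client_updates: List[Params], client_sizes: List[int]) -> Params:
    """FedAvg-style aggregation: average each parameter, weighted by client data size."""
    total = sum(client_sizes)
    return {
        key: [
            sum(update[key][i] * n for update, n in zip(client_updates, client_sizes)) / total
            for i in range(len(client_updates[0][key]))
        ]
        for key in client_updates[0]
    }

def trainable_subset(model: Params, trainable_keys: Set[str]) -> Params:
    """Keep only the parameters that are tuned locally and shared globally."""
    return {k: v for k, v in model.items() if k in trainable_keys}

def num_values(params: Params) -> int:
    """Number of scalar values, used here as a proxy for communication cost."""
    return sum(len(v) for v in params.values())

# Hypothetical toy model: a large frozen pre-trained part plus small trainable parts.
global_model: Params = {
    "backbone": [0.0] * 1000,  # assumed frozen; never sent after initialization
    "bias": [0.0] * 10,        # assumed trainable
    "head": [0.0] * 5,         # assumed trainable
}
TRAINABLE = {"bias", "head"}

def local_update(model: Params, shift: float) -> Params:
    """Stand-in for local training: shifts every trainable value by a constant."""
    return {k: [x + shift for x in v] for k, v in trainable_subset(model, TRAINABLE).items()}

# Two hypothetical clients holding 100 and 300 samples.
updates = [local_update(global_model, 1.0), local_update(global_model, 3.0)]
sizes = [100, 300]

aggregated = fedavg(updates, sizes)
global_model.update(aggregated)  # the frozen backbone is left untouched

full_cost = num_values(global_model)  # values sent per client if the whole model were shared
peft_cost = num_values(aggregated)    # values sent per client when only the subset is shared
print(f"full: {full_cost} values, subset only: {peft_cost} values")
```

In this toy setting each client uploads 15 values instead of 1015 per round. The savings in a real system depend on which parameters are chosen as trainable, which the abstract does not specify.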

updated: Wed Apr 03 2024 15:01:26 GMT+0000 (UTC)

published: Tue Oct 04 2022 16:08:54 GMT+0000 (UTC)

arXiv
