Robust Federated Learning against both Data Heterogeneity and Poisoning Attack via Aggregation Optimization

Yueqi Xie; Weizhong Zhang; Renjie Pi; Fangzhao Wu; Qifeng Chen; Xing Xie; Sunghun Kim

Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning (FL) systems. Both have attracted great research interest, and specific strategies have been developed for each, but no known solution addresses them in a unified framework. To overcome both challenges, we propose SmartFL, a generic approach that optimizes the server-side aggregation process via a subspace training technique, using a small amount of proxy data collected by the service provider itself. Specifically, the aggregation weight of each participating client at each round is optimized on the server-collected proxy data, which essentially optimizes the global model within the convex hull spanned by the client models. Since the number of tunable parameters optimized on the server side at each round equals the number of participating clients (and is thus independent of the model size), we can train a global model with massive parameters using only a small amount of proxy data (e.g., around one hundred samples). With optimized aggregation, SmartFL ensures robustness against both heterogeneous and malicious clients, which is desirable in real-world FL, where either or both problems may occur. We provide theoretical analyses of the convergence and generalization capacity of SmartFL. Empirically, SmartFL achieves state-of-the-art performance both on FL with non-IID data distribution and on FL with malicious clients. The source code will be released.
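The idea of optimizing aggregation weights on proxy data can be illustrated with a toy sketch. This is not the authors' implementation: it assumes linear-regression client models, a squared-error proxy loss, and a softmax parameterization to keep the global model inside the convex hull of the client models. All names (`optimize_aggregation_weights`, `proxy_loss`, the synthetic clients) are hypothetical and chosen only for illustration. Note that the server tunes one scalar per client, regardless of the model dimension.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax; output is non-negative and sums to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

def proxy_loss(w, X, y):
    # Mean squared error of a linear model w on the proxy data.
    return float(np.mean((X @ w - y) ** 2))

def optimize_aggregation_weights(client_params, X_proxy, y_proxy,
                                 steps=300, lr=0.1):
    """Learn convex-combination weights over client models (assumed setup).

    client_params: (K, d) array, one linear-model weight vector per client.
    Returns alpha of shape (K,), with alpha >= 0 and sum(alpha) == 1.
    """
    K = client_params.shape[0]
    n = X_proxy.shape[0]
    logits = np.zeros(K)  # uniform start, i.e. plain averaging
    for _ in range(steps):
        alpha = softmax(logits)
        w = alpha @ client_params                 # global model in the convex hull
        resid = X_proxy @ w - y_proxy
        grad_w = (2.0 / n) * (X_proxy.T @ resid)  # dL/dw
        grad_alpha = client_params @ grad_w       # dL/dalpha (chain rule)
        # Backpropagate through softmax to the K tunable logits.
        grad_logits = alpha * (grad_alpha - alpha @ grad_alpha)
        logits -= lr * grad_logits
    return softmax(logits)

# Synthetic example: two slightly different honest clients, one poisoned client.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_proxy = rng.normal(size=(100, 3))               # about one hundred proxy samples
y_proxy = X_proxy @ w_true + 0.01 * rng.normal(size=100)

client_params = np.stack([
    w_true + np.array([0.1, 0.0, -0.1]),          # honest, heterogeneous
    w_true + np.array([-0.1, 0.1, 0.0]),          # honest, heterogeneous
    -5.0 * w_true,                                # malicious (assumed attack)
])

alpha = optimize_aggregation_weights(client_params, X_proxy, y_proxy)
uniform = np.full(3, 1.0 / 3.0)
loss_uniform = proxy_loss(uniform @ client_params, X_proxy, y_proxy)
loss_opt = proxy_loss(alpha @ client_params, X_proxy, y_proxy)
print("aggregation weights:", alpha)
print("proxy loss, uniform vs optimized:", loss_uniform, loss_opt)
```

In this toy setting the poisoned client should end up with a weight close to zero, while plain averaging is pulled far from the honest models. The softmax parameterization is one simple way to enforce the convex-hull constraint; the paper may use a different one.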

updated: Sun Nov 20 2022 05:55:12 GMT+0000 (UTC)

published: Thu Nov 10 2022 13:20:56 GMT+0000 (UTC)

arXiv

