Benchmarking FedAvg and FedCurv for Image Classification Tasks

Bruno Casella; Roberto Esposito; Carlo Cavazzoni; Marco Aldinucci


Classic Machine Learning techniques require training on data available in a single data lake. However, aggregating data from different owners is not always feasible, for reasons including security, privacy and secrecy. Data carry a value that might vanish when shared with others; the ability to avoid sharing the data enables industrial applications where security and privacy are of paramount importance, making it possible to train global models by implementing only local policies, which can be run independently and even on air-gapped data centres. Federated Learning (FL) is a distributed machine learning approach that has emerged as an effective way to address privacy concerns by sharing only local AI models while keeping the data decentralized. Two critical challenges of Federated Learning are managing the heterogeneous systems in the same federated network and dealing with real data, which are often not independently and identically distributed (non-IID) among the clients. In this paper, we focus on the second problem, i.e., the statistical heterogeneity of the data in the same federated network. In this setting, local models might stray far from the local optimum of the complete dataset, thus possibly hindering the convergence of the federated model. Several Federated Learning algorithms, such as FedAvg, FedProx and Federated Curvature (FedCurv), aiming at tackling the non-IID setting, have already been proposed. This work provides an empirical assessment of the behaviour of FedAvg and FedCurv in common non-IID scenarios. Results show that the number of epochs per round is an important hyper-parameter that, when tuned appropriately, can lead to significant performance gains while reducing the communication cost. As a side product of this work, we release the non-IID version of the datasets we used, so as to facilitate further comparisons from the FL community.
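To make the role of the epochs-per-round hyper-parameter concrete, the following is a minimal sketch of a FedAvg-style loop. It is not the authors' code: the scalar linear model, the function names (`local_train`, `fedavg_round`, `run_fedavg`), the learning rate and the two toy clients are all assumptions chosen for illustration. Each client trains locally for `epochs` passes on its own data, and the server then averages the resulting models weighted by client dataset size, so raw data never leaves the client.

```python
# Minimal FedAvg-style sketch (illustrative only; names and data are hypothetical).
# Model: scalar linear regression y = w * x, trained with per-sample gradient descent.

def local_train(w, data, epochs, lr=0.01):
    """Run `epochs` passes of gradient descent over one client's local data."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 with respect to w
            w -= lr * grad
    return w

def fedavg_round(w_global, clients, epochs):
    """One communication round: local training, then size-weighted averaging."""
    total = sum(len(d) for d in clients)
    local = [(local_train(w_global, d, epochs), len(d)) for d in clients]
    return sum(w * n for w, n in local) / total

def run_fedavg(clients, rounds, epochs, w0=0.0):
    """Repeat communication rounds starting from the initial model w0."""
    w = w0
    for _ in range(rounds):
        w = fedavg_round(w, clients, epochs)
    return w

# Two hypothetical clients whose inputs cover different ranges (a simple
# form of non-IID data); both follow the same underlying slope of 3.
clients = [
    [(x, 3.0 * x) for x in (0.1, 0.2, 0.3)],
    [(x, 3.0 * x) for x in (1.0, 1.5, 2.0, 2.5)],
]
w_final = run_fedavg(clients, rounds=20, epochs=5)
```

In this toy setting, raising `epochs` means more local computation per round and fewer rounds (and hence less communication) to reach a given accuracy; how far that trade-off carries over to real non-IID image data is what the paper evaluates empirically.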

updated: Fri Mar 31 2023 10:13:01 GMT+0000 (UTC)

published: Fri Mar 31 2023 10:13:01 GMT+0000 (UTC)

arXiv
