On the Robustness of Split Learning against Adversarial Attacks

Mingyuan Fan; Cen Chen; Chengyu Wang; Wenmeng Zhou; Jun Huang

Split learning enables collaborative training of deep learning models while preserving data privacy and model security by avoiding direct sharing of raw data and model details (i.e., the server and clients each hold only a sub-network and exchange intermediate computations). However, existing research has mainly examined its reliability for privacy protection, with little investigation into model security. Specifically, attackers who can explore the full model can launch adversarial attacks, and split learning can mitigate this severe threat by disclosing only part of the model to untrusted servers. This paper aims to evaluate the robustness of split learning against adversarial attacks, particularly in the most challenging setting, where untrusted servers have access only to the intermediate layers of the model. Existing adversarial attacks mostly focus on the centralized setting rather than the collaborative setting. To better evaluate the robustness of split learning, we therefore develop a tailored attack called SPADV, which comprises two stages: 1) shadow model training, which addresses the issue of lacking part of the model, and 2) local adversarial attack, which produces adversarial examples for the evaluation. The first stage requires only a small amount of unlabeled non-IID data, and in the second stage SPADV perturbs the intermediate output of natural samples to craft adversarial ones. The overall cost of the proposed attack is relatively low, yet its empirical effectiveness is high, demonstrating the surprising vulnerability of split learning to adversarial attacks.
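The data flow described in the abstract can be illustrated with a small sketch. It is not the SPADV algorithm: the three-way split, the layer sizes, the random weights, and all function names (`client_bottom`, `server_middle`, `client_top`, `bounded_perturbation`) are assumptions made for illustration, and a random sign perturbation stands in for the optimized perturbation an actual attack would compute. It only shows that the server sees intermediate activations rather than raw inputs, and where a bounded perturbation of an intermediate output would enter the pipeline.

```python
import random


def linear(weights, x):
    """Apply a dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def make_weights(rng, rows, cols):
    """Random weights; a real model would be trained (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
W_bottom = make_weights(rng, 4, 3)  # held by the client: input layers
W_middle = make_weights(rng, 4, 4)  # held by the server: intermediate layers
W_top = make_weights(rng, 2, 4)     # held by the client: output layers


def client_bottom(x):
    return relu(linear(W_bottom, x))


def server_middle(h):
    return relu(linear(W_middle, h))


def client_top(h):
    return linear(W_top, h)


def split_forward(x, perturb=None):
    """Forward pass in which only intermediate activations cross parties."""
    h = client_bottom(x)
    if perturb is not None:
        # Hypothetical stand-in for an attack: shift the intermediate output.
        h = [a + d for a, d in zip(h, perturb)]
    return client_top(server_middle(h))


def bounded_perturbation(rng, size, eps):
    """Random sign vector with magnitude eps per coordinate (not optimized)."""
    return [rng.choice((-eps, eps)) for _ in range(size)]


x = [0.5, -0.2, 0.9]
clean = split_forward(x)
delta = bounded_perturbation(rng, 4, eps=0.1)
perturbed = split_forward(x, perturb=delta)
```

Comparing `clean` with `perturbed` shows how a change to the intermediate output propagates to the final prediction. An actual attack would choose `delta` by optimization against a shadow model instead of sampling it at random.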

updated: Sun Jul 16 2023 01:45:00 GMT+0000 (UTC)

published: Sun Jul 16 2023 01:45:00 GMT+0000 (UTC)

arXiv
