Reinforcement Learning-Based Black-Box Model Inversion Attacks

Gyojin Han; Jaehyun Choi; Haeil Lee; Junmo Kim

Model inversion attacks are a type of privacy attack that reconstructs the private data used to train a machine learning model solely by accessing the model. Recently, white-box model inversion attacks that leverage Generative Adversarial Networks (GANs) to distill knowledge from public datasets have received great attention because of their excellent attack performance. Current black-box model inversion attacks that utilize GANs, on the other hand, have two limitations: they cannot guarantee that the attack process completes within a predetermined number of queries, and they do not reach the performance of white-box attacks. To overcome these limitations, we propose a reinforcement learning-based black-box model inversion attack. We formulate the latent space search as a Markov Decision Process (MDP) problem and solve it with reinforcement learning. Our method uses the confidence scores that the target model assigns to the generated images to provide rewards to an agent. Finally, the private data can be reconstructed using the latent vectors found by the agent trained in the MDP. Experimental results on various datasets and models demonstrate that our attack successfully recovers the private information of the target model and achieves state-of-the-art attack performance. By proposing a more advanced black-box model inversion attack, we emphasize the importance of research on privacy-preserving machine learning.
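The MDP formulation described above can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class name `LatentSearchEnv`, the linear stand-ins for the GAN generator and the target model, the blending update with `alpha`, the log-confidence reward, and the random placeholder policy, which a trained reinforcement learning agent would replace. The sketch shows only the interface: the state is a latent vector, each step issues one black-box query, and the reward is derived from the returned confidence score.

```python
import numpy as np


class LatentSearchEnv:
    """Toy MDP: the state is a latent vector, the reward comes from a confidence score."""

    def __init__(self, latent_dim=8, num_classes=5, target=2,
                 alpha=0.5, horizon=10, seed=0):
        self.rng = np.random.default_rng(seed)
        self.latent_dim = latent_dim
        self.target = target      # class whose private data is being reconstructed
        self.alpha = alpha        # assumed blending weight between state and action
        self.horizon = horizon    # steps per episode, which fixes the queries per episode
        # Hypothetical stand-ins for a GAN generator and a black-box target model.
        self._gen = self.rng.normal(size=(latent_dim, 16))
        self._clf = self.rng.normal(size=(16, num_classes))
        self.queries = 0

    def _confidence(self, z):
        """One black-box query: only the softmax confidence vector is observed."""
        self.queries += 1
        logits = np.tanh(z @ self._gen) @ self._clf
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

    def reset(self):
        self.t = 0
        self.z = self.rng.normal(size=self.latent_dim)
        return self.z.copy()

    def step(self, action):
        # Assumed transition: move the latent vector part of the way toward the action.
        self.z = self.alpha * self.z + (1 - self.alpha) * action
        self.t += 1
        # Assumed reward: log-confidence of the target class for the generated sample.
        reward = float(np.log(self._confidence(self.z)[self.target] + 1e-12))
        done = self.t >= self.horizon
        return self.z.copy(), reward, done


def run_episode(env, policy):
    """Roll out one episode and return the visited latent vector with the highest reward."""
    state = env.reset()
    best_z, best_r = state, -np.inf
    done = False
    while not done:
        state, reward, done = env.step(policy(state))
        if reward > best_r:
            best_z, best_r = state, reward
    return best_z, best_r


# Placeholder policy: random actions stand in for a trained RL agent.
policy_rng = np.random.default_rng(1)
env = LatentSearchEnv()
best_z, best_r = run_episode(env, lambda s: policy_rng.normal(size=s.shape))
```

Because each step issues exactly one query, the total query count equals the horizon times the number of episodes. This is one simple way to keep an attack within a predetermined query budget, the limitation of earlier black-box attacks that the abstract highlights.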

updated: Mon Apr 10 2023 14:41:16 GMT+0000 (UTC)

published: Mon Apr 10 2023 14:41:16 GMT+0000 (UTC)

arXiv
