Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks

Tribhuvanesh Orekondy; Bernt Schiele; Mario Fritz

High-performance Deep Neural Networks (DNNs) are increasingly deployed in many real-world applications, e.g., cloud prediction APIs. Recent advances in model functionality stealing attacks via black-box access (i.e., inputs in, predictions out) threaten the business model of such applications, which require a lot of time, money, and effort to develop. Existing defenses take a passive role against stealing attacks, such as by truncating predicted information. We find such passive defenses ineffective against DNN stealing attacks. In this paper, we propose the first defense that actively perturbs predictions with the aim of poisoning the attacker's training objective. We find that our defense is effective across a wide range of challenging datasets and DNN model stealing attacks, and that it outperforms existing defenses. Our defense is the first that can withstand highly accurate model stealing attacks for tens of thousands of queries, amplifying the attacker's error rate by up to 85× with minimal impact on utility for benign users.

updated: Tue Mar 03 2020 10:51:12 GMT+0000 (UTC)

published: Wed Jun 26 2019 08:32:37 GMT+0000 (UTC)

arXiv
