Universal Adversarial Perturbations and Image Spam Classifiers

Andy Phung; Mark Stamp

As the name suggests, image spam is spam email that has been embedded in an image. Image spam was developed in an effort to evade text-based filters. Modern deep learning-based classifiers perform well in detecting typical image spam that is seen in the wild. In this chapter, we evaluate numerous adversarial techniques for the purpose of attacking deep learning-based image spam classifiers. Of the techniques tested, we find that universal perturbation performs best. Using universal adversarial perturbations, we propose and analyze a new transformation-based adversarial attack that enables us to create tailored "natural perturbations" in image spam. The resulting spam images benefit from both the presence of concentrated natural features and a universal adversarial perturbation. We show that the proposed technique outperforms existing adversarial attacks in terms of accuracy reduction, computation time per example, and perturbation distance. We apply our technique to create a dataset of adversarial spam images, which can serve as a challenge dataset for future research in image spam detection.
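
To illustrate what makes a perturbation "universal", the following is a minimal sketch, not the attack proposed in the paper. It assumes images are flattened lists of pixel intensities in [0, 1] and that the perturbation is bounded in the L-infinity norm by a budget `eps`. The names `project_linf`, `apply_universal`, and `fooling_rate` are hypothetical helpers introduced here for illustration. The point is only that a single perturbation vector is shared by every input, rather than being computed per example.

```python
from typing import Callable, List, Sequence

Image = List[float]  # assumption: flattened pixel intensities in [0, 1]


def project_linf(delta: Sequence[float], eps: float) -> List[float]:
    """Clip each component of the perturbation to [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]


def apply_universal(image: Sequence[float], delta: Sequence[float]) -> Image:
    """Add the shared perturbation to one image, keeping pixels in [0, 1]."""
    if len(image) != len(delta):
        raise ValueError("image and perturbation must have the same size")
    return [max(0.0, min(1.0, p + d)) for p, d in zip(image, delta)]


def fooling_rate(
    classify: Callable[[Sequence[float]], str],
    images: Sequence[Sequence[float]],
    delta: Sequence[float],
) -> float:
    """Fraction of images whose predicted label changes under the same delta."""
    if not images:
        raise ValueError("need at least one image")
    flipped = sum(
        classify(apply_universal(x, delta)) != classify(x) for x in images
    )
    return flipped / len(images)
```

In this sketch, `classify` stands in for any image spam classifier that maps an image to a label. Because the perturbation does not depend on the input, applying it costs one addition and one clip per pixel, which is consistent with the low per-example computation time the abstract reports for universal perturbations.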

updated: Sun Mar 07 2021 14:36:02 GMT+0000 (UTC)

published: Sun Mar 07 2021 14:36:02 GMT+0000 (UTC)

arXiv
