Robust Contrastive Language-Image Pretraining against Adversarial Attacks

Wenhan Yang; Baharan Mirzasoleiman

Contrastive vision-language representation learning has achieved state-of-the-art performance for zero-shot classification by learning from millions of image-caption pairs crawled from the internet. However, the massive data that powers large multimodal models such as CLIP makes them extremely vulnerable to various types of adversarial attacks, including targeted and backdoor data poisoning attacks. Despite this vulnerability, robust contrastive vision-language pretraining against adversarial attacks has remained unaddressed. In this work, we propose RoCLIP, the first effective method for robust pretraining and fine-tuning of multimodal vision-language models. RoCLIP effectively breaks the association between poisoned image-caption pairs by considering a pool of random examples, and (1) matching every image with the text that is most similar to its caption in the pool, and (2) matching every caption with the image that is most similar to its image in the pool. Our extensive experiments show that our method renders state-of-the-art targeted data poisoning and backdoor attacks ineffective during pretraining or fine-tuning of CLIP. In particular, RoCLIP decreases the poison and backdoor attack success rates down to 0% during pretraining and 1%-4% during fine-tuning, and effectively improves the model's performance.
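
The two matching steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract does not specify the similarity measure, the loss, or how the pool is built, so the sketch assumes cosine similarity for the nearest-neighbor lookup and a standard InfoNCE-style contrastive loss. All function names (`nearest_in_pool`, `pool_matched_targets`, `info_nce`) and the temperature value are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products equal cosine similarity.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def nearest_in_pool(queries, pool):
    # Index of the most cosine-similar pool row for each query row.
    # Cosine similarity is an assumption; the abstract only says "most similar".
    sims = l2_normalize(queries) @ l2_normalize(pool).T
    return sims.argmax(axis=1)

def pool_matched_targets(img_emb, txt_emb, pool_img, pool_txt):
    # (1) For each image, take the pool text most similar to its own caption.
    txt_idx = nearest_in_pool(txt_emb, pool_txt)
    # (2) For each caption, take the pool image most similar to its own image.
    img_idx = nearest_in_pool(img_emb, pool_img)
    return pool_txt[txt_idx], pool_img[img_idx], txt_idx, img_idx

def info_nce(anchors, targets, temperature=0.07):
    # Assumed InfoNCE-style loss: row i of anchors is paired with row i of targets.
    logits = l2_normalize(anchors) @ l2_normalize(targets).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, batch, pool_size = 16, 4, 32
    # Random vectors stand in for encoder outputs.
    img_emb = rng.normal(size=(batch, dim))
    txt_emb = rng.normal(size=(batch, dim))
    pool_img = rng.normal(size=(pool_size, dim))
    pool_txt = rng.normal(size=(pool_size, dim))

    matched_txt, matched_img, _, _ = pool_matched_targets(
        img_emb, txt_emb, pool_img, pool_txt
    )
    # Images are contrasted with pool texts, captions with pool images,
    # rather than with their own (possibly poisoned) partners.
    loss = info_nce(img_emb, matched_txt) + info_nce(txt_emb, matched_img)
    print(float(loss))
```

In this sketch an image is never trained directly against its own caption, which is the sense in which a poisoned image-caption association would be broken. Details such as the pool size and how the pool is refreshed are left open because the abstract does not state them.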

updated: Mon Mar 13 2023 04:49:46 GMT+0000 (UTC)

published: Mon Mar 13 2023 04:49:46 GMT+0000 (UTC)

arXiv
