Prompt-aligned Gradient for Prompt Tuning

Beier Zhu; Yulei Niu; Yucheng Han; Yue Wu; Hanwang Zhang


Thanks to large pre-trained vision-language models (VLMs) like CLIP, we can craft a zero-shot classifier by "prompt": e.g., the confidence score of an image being "[CLASS]" can be obtained from the VLM-provided similarity measure between the image and the prompt sentence "a photo of a [CLASS]". Prompts therefore show great potential for fast adaptation of VLMs to downstream tasks if we fine-tune the prompt-based similarity measure. However, we find a common failure: improper fine-tuning may undermine the prompt's inherent prediction not only for the task-related classes but also for other classes in the VLM vocabulary. Existing methods still address this problem with traditional anti-overfitting techniques such as early stopping and data augmentation, and lack a principled solution specific to prompts. We present Prompt-aligned Gradient, dubbed ProGrad, to prevent prompt tuning from forgetting the general knowledge learned from VLMs. In particular, ProGrad updates the prompt only when its gradient is aligned (or non-conflicting) with the "general direction", which is represented as the gradient of the KL loss of the pre-defined prompt prediction. Extensive experiments demonstrate the stronger few-shot generalization ability of ProGrad over state-of-the-art prompt tuning methods. Code is available at https://github.com/BeierZhu/Prompt-align.
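The alignment test described in the abstract can be illustrated with a small sketch. It treats the two gradients as plain vectors and checks the sign of their inner product. The abstract does not say what happens to a conflicting gradient, so the sketch assumes that its component along the general direction is projected out; that choice, and the names `dot`, `aligned_gradient`, `task_grad`, and `general_grad`, are illustrative assumptions rather than the authors' code.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def aligned_gradient(task_grad: List[float], general_grad: List[float]) -> List[float]:
    """Return an update direction that does not conflict with general_grad.

    task_grad:    gradient of the downstream (few-shot) loss w.r.t. the prompt.
    general_grad: gradient of the KL loss to the pre-defined prompt prediction,
                  i.e. the "general direction" from the abstract.
    """
    inner = dot(task_grad, general_grad)
    if inner >= 0:
        # Aligned or non-conflicting: use the task gradient unchanged.
        return list(task_grad)
    # Conflicting (assumed handling): subtract the component of task_grad
    # that lies along general_grad, leaving only the orthogonal part.
    # inner < 0 implies general_grad is non-zero, so the division is safe.
    scale = inner / dot(general_grad, general_grad)
    return [t - scale * g for t, g in zip(task_grad, general_grad)]


if __name__ == "__main__":
    general = [1.0, 0.0]
    print(aligned_gradient([1.0, 1.0], general))   # aligned: [1.0, 1.0]
    print(aligned_gradient([-1.0, 1.0], general))  # conflicting: [0.0, 1.0]
```

In the second call the task gradient points against the general direction, so only the orthogonal part survives and the returned vector has a zero inner product with `general`.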

updated: Wed Jan 10 2024 06:24:46 GMT+0000 (UTC)

published: Mon May 30 2022 06:05:21 GMT+0000 (UTC)

arXiv

