Unsupervised Prompt Learning for Vision-Language Models

Tony Huang; Jack Chu; Fangyun Wei

Contrastive vision-language models like CLIP have shown great progress in transfer learning. In the inference stage, a proper text description, also known as a prompt, needs to be carefully designed to correctly classify the given images. To avoid laborious prompt engineering, recent works such as CoOp, CLIP-Adapter and Tip-Adapter propose to adapt vision-language models for downstream image recognition tasks on a small set of labeled data. Though promising improvements are achieved, requiring labeled data from the target datasets may restrict scalability. In this paper, we explore a different scenario, in which the labels of the target datasets are not provided, and we present an unsupervised prompt learning (UPL) approach to avoid prompt engineering while simultaneously improving the transfer performance of CLIP-like vision-language models. As far as we know, UPL is the first work to introduce unsupervised learning into prompt learning. Experimentally, our UPL outperforms the original CLIP with prompt engineering on ImageNet as well as ten other datasets. An enhanced version of UPL is even competitive with the 8-shot CoOp and the 8-shot Tip-Adapter on most datasets. Code and models are available at https://github.com/tonyhuang2022/UPL.
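To make the role of the prompt at inference time concrete, the following is a minimal sketch of prompt-based zero-shot classification in the style of CLIP. It is not the UPL method and not code from the linked repository. The embeddings are toy vectors standing in for encoder outputs, and the names `classify`, `PROMPT_EMBEDDINGS`, and the `temperature` value are assumptions made for illustration only.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image_embedding, prompt_embeddings, temperature=0.01):
    """Pick the prompt whose embedding is most similar to the image.

    `temperature` is an assumed value for this sketch; a real model
    uses its own learned scale.
    """
    prompts = list(prompt_embeddings)
    logits = [
        cosine(image_embedding, prompt_embeddings[p]) / temperature
        for p in prompts
    ]
    probs = softmax(logits)
    best = max(range(len(prompts)), key=probs.__getitem__)
    return prompts[best], probs[best]


# Toy stand-ins for text-encoder outputs, one hand-written prompt per class.
PROMPT_EMBEDDINGS = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
}

# Toy stand-in for an image-encoder output.
label, confidence = classify([0.8, 0.2, 0.1], PROMPT_EMBEDDINGS)
print(label, round(confidence, 3))
```

In this picture, changing the wording of a prompt changes its embedding and therefore the prediction, which is why hand-written prompts need careful design. Prompt learning approaches replace that manual wording with learned representations.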

updated: Mon Aug 22 2022 08:45:33 GMT+0000 (UTC)

published: Thu Apr 07 2022 17:59:57 GMT+0000 (UTC)

arXiv
