Self-Supervised Convolutional Visual Prompts

Yun-Yun Tsai; Chengzhi Mao; Yow-Kuan Lin; Junfeng Yang

Machine learning models often fail on out-of-distribution (OOD) samples. Visual prompts have emerged as a lightweight method for adapting large-scale vision models in input space. Existing visual prompts optimize a high-dimensional additive vector and require labeled data for training. However, we find that this paradigm fails at test-time adaptation when labeled data is unavailable, because the high-dimensional visual prompt overfits to the self-supervised objective. We present convolutional visual prompts for test-time adaptation without labels. Our convolutional prompt is structured and requires fewer trainable parameters (less than 1% of the parameters of standard visual prompts). Extensive experiments on a wide variety of OOD recognition tasks show that our approach is effective, improving robustness by up to 5.87% across a number of large-scale model architectures.
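
The parameter-count contrast in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the 3 x 224 x 224 input size, the 3 x 3 per-channel kernel, the residual form `image + lam * filtered`, and the names `additive_prompt`, `convolutional_prompt`, and `lam`. The self-supervised objective used to optimize the prompt is not shown.

```python
import numpy as np

def additive_prompt(image, delta):
    """Additive prompt: a trainable perturbation with one value per input element."""
    return np.clip(image + delta, 0.0, 1.0)

def convolutional_prompt(image, kernel, lam=1.0):
    """Hypothetical convolutional prompt: add a filtered copy of the image.

    image:  (C, H, W) array with values in [0, 1]
    kernel: (C, k, k) array, one small trainable filter per channel (k odd)
    lam:    assumed scalar that scales the filtered term
    """
    c, h, w = image.shape
    k = kernel.shape[-1]
    pad = k // 2
    # Edge-pad so the filtered output keeps the input's spatial size.
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    filtered = np.zeros_like(image)
    # Per-channel cross-correlation (the kernel is not flipped), written as shifted sums.
    for dy in range(k):
        for dx in range(k):
            filtered += kernel[:, dy, dx, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return np.clip(image + lam * filtered, 0.0, 1.0)

image = np.random.default_rng(0).random((3, 224, 224))
delta = np.zeros_like(image)   # additive prompt: 3 * 224 * 224 trainable values
kernel = np.zeros((3, 3, 3))   # convolutional prompt: 3 * 3 * 3 trainable values

ratio = kernel.size / delta.size
print(delta.size, kernel.size, f"{100 * ratio:.3f}%")
```

Under these assumed sizes the kernel has 27 trainable values against 150,528 for the additive prompt, which is consistent with the "less than 1%" figure in the abstract. With an all-zero kernel the convolutional prompt leaves the image unchanged, so adaptation would start from the original input.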

updated: Wed Mar 01 2023 03:06:29 GMT+0000 (UTC)

published: Wed Mar 01 2023 03:06:29 GMT+0000 (UTC)

arXiv
