LaCViT: A Label-aware Contrastive Training Framework for Vision Transformers

Zijun Long; Zaiqiao Meng; Gerardo Aragon Camarasa; Richard McCreadie

Vision transformers have proven highly effective on computer vision tasks because they can model long-range feature dependencies. Trained on large-scale data with various self-supervised signals (e.g., masked random patches), they achieve state-of-the-art performance on several benchmark datasets, such as ImageNet-1k and CIFAR-10. However, vision transformers pretrained on general large-scale image corpora can only produce an anisotropic representation space, which limits their generalizability and transferability to target downstream tasks. In this paper, we propose LaCViT, a simple and effective Label-aware Contrastive Training framework that improves the isotropy of the pretrained representation space of vision transformers, thereby enabling more effective transfer learning across a wide range of image classification tasks. In experiments on five standard image classification datasets, LaCViT-trained models outperform the original pretrained baselines by around 9% absolute Accuracy@1, and the improvement is consistent across the three vision transformers we evaluate.
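The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic label-aware (supervised) contrastive loss, not LaCViT's exact formulation. It assumes each image has already been encoded into a non-zero embedding vector; the function names, the `temperature` value, and the toy data are hypothetical and chosen purely for illustration. The loss pulls together embeddings that share a label and pushes apart the rest.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def label_aware_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative assumption).

    For every anchor, samples with the same label are positives and all other
    samples in the batch form the contrast set. Anchors with no positive in
    the batch are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp over the contrast set.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D (hypothetical data).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = label_aware_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = label_aware_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(loss_matching < loss_mismatched)
```

On the toy batch, labels that agree with the geometric clusters yield a smaller loss than labels that cut across them, which is the behavior a label-aware objective is meant to reward. In an actual fine-tuning setup the embeddings would come from the vision transformer and the loss would be minimized with an automatic-differentiation framework rather than evaluated in plain Python.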

updated: Fri Mar 31 2023 12:38:08 GMT+0000 (UTC)

published: Fri Mar 31 2023 12:38:08 GMT+0000 (UTC)

arXiv
