MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning

Sara Atito; Muhammad Awais; Ammarah Farooq; Zhenhua Feng; Josef Kittler

Self-supervised pretraining is the method of choice for natural language processing models and is rapidly gaining popularity in many vision tasks. Recently, self-supervised pretraining has been shown to outperform supervised pretraining for many downstream vision applications, marking a milestone in the area. This superiority is attributed to the negative impact of incomplete labelling of the training images, which convey multiple concepts but are annotated using a single dominant class label. Although Self-Supervised Learning (SSL) is, in principle, free of this limitation, the choice of pretext task facilitating SSL perpetuates this shortcoming by driving the learning process towards a single-concept output. This study aims to investigate the possibility of modelling all the concepts present in an image without using labels. In this respect, the proposed SSL framework MC-SSL0.0 is a step towards Multi-Concept Self-Supervised Learning (MC-SSL) that goes beyond modelling a single dominant label in an image to effectively utilise the information from all the concepts present in it. MC-SSL0.0 consists of two core design concepts: group masked model learning, and learning of pseudo-concepts for data tokens using a momentum encoder (teacher-student) framework. The experimental results on multi-label and multi-class image classification downstream tasks demonstrate that MC-SSL0.0 not only surpasses existing SSL methods but also outperforms supervised transfer learning. The source code will be made publicly available for the community to train on larger corpora.
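The two design concepts named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the momentum value of 0.996, the 2x2 block size, and the 50% masking ratio are all assumptions chosen for illustration, and the abstract does not specify the actual masking scheme or update schedule. The sketch shows (a) an exponential-moving-average update, the usual way a momentum teacher tracks a student, and (b) masking of contiguous groups of patch tokens rather than isolated tokens.

```python
import random


def ema_update(teacher, student, momentum=0.996):
    """Move each teacher weight a small step toward the matching student weight.

    Weights are plain lists of floats here; the momentum value is an assumption.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def group_mask(grid_h, grid_w, block=2, mask_ratio=0.5, seed=0):
    """Mask contiguous block x block groups of patch tokens.

    Blocks are drawn at random until at least mask_ratio of the grid is covered.
    Returns a grid of booleans where True marks a masked token.
    Block size and ratio are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    mask = [[False] * grid_w for _ in range(grid_h)]
    target = int(mask_ratio * grid_h * grid_w)
    masked = 0
    while masked < target:
        top = rng.randrange(0, grid_h - block + 1)
        left = rng.randrange(0, grid_w - block + 1)
        for r in range(top, top + block):
            for c in range(left, left + block):
                if not mask[r][c]:
                    mask[r][c] = True
                    masked += 1
    return mask


if __name__ == "__main__":
    # One EMA step on toy weights, then a mask over a 14 x 14 patch grid.
    print(ema_update([1.0, 0.0], [0.0, 1.0], momentum=0.9))
    for row in group_mask(14, 14):
        print("".join("#" if m else "." for m in row))
```

In a setup of this kind, the student would see the masked image while the teacher, updated only through `ema_update`, would supply per-token targets; how MC-SSL0.0 defines those targets is left to the paper itself.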

updated: Tue Nov 30 2021 12:36:38 GMT+0000 (UTC)

published: Tue Nov 30 2021 12:36:38 GMT+0000 (UTC)

arXiv
