Reasoning for Complex Data through Ensemble-based Self-Supervised Learning

Gabriel Bertocco; Antônio Theophilo; Fernanda Andaló; Anderson Rocha

Self-supervised learning deals with problems that have little or no available labeled data. Recent work has shown impressive results when the underlying classes have significant semantic differences. One important dataset on which this technique thrives is ImageNet, as intra-class distances are substantially lower than inter-class distances. However, this is not the case for several critical tasks, and general self-supervised learning methods fail to learn discriminative features when classes have closer semantics, thus requiring more robust strategies. We propose a strategy to tackle this problem and to enable learning from unlabeled data even when samples from different classes are not prominently diverse. We approach the problem by leveraging a novel ensemble-based clustering strategy where clusters derived from different configurations are combined to generate a better grouping for the data samples in a fully unsupervised way. This strategy allows clusters with different densities and higher variability to emerge, which in turn reduces intra-class discrepancies, without the burden of finding an optimal configuration per dataset. We also consider different Convolutional Neural Networks to compute distances between samples. We refine these distances by performing context analysis and group them to capture complementary information. We consider two applications to validate our pipeline: Person Re-Identification and Text Authorship Verification. These are challenging applications considering that classes are semantically close to each other and that training and test sets have disjoint identities. Our method is robust across different modalities and outperforms state-of-the-art results with a fully unsupervised solution without any labeling or human intervention.
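
The abstract does not specify the base clustering algorithm or the rule used to combine clusterings, so the following is only a minimal sketch of the general idea of combining clusters obtained under different configurations. It is not the authors' pipeline. It assumes a simple distance-threshold clusterer run with several thresholds and a majority-vote co-association rule; all function names and parameters (`threshold_clustering`, `ensemble_clustering`, `eps_values`, `min_agreement`) are hypothetical.

```python
from itertools import combinations
from math import dist


def connected_components(n, edges):
    """Label nodes 0..n-1 by connected component using union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in edges:
        parent[find(a)] = find(b)
    roots = {}
    # Components are numbered in order of first appearance.
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


def threshold_clustering(points, eps):
    """One base clustering (assumed stand-in): link points closer than eps."""
    pairs = combinations(range(len(points)), 2)
    edges = [(i, j) for i, j in pairs if dist(points[i], points[j]) < eps]
    return connected_components(len(points), edges)


def ensemble_clustering(points, eps_values, min_agreement=0.5):
    """Keep pairs co-clustered in more than min_agreement of the base runs."""
    n = len(points)
    votes = {}
    for eps in eps_values:
        labels = threshold_clustering(points, eps)
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                votes[(i, j)] = votes.get((i, j), 0) + 1
    needed = min_agreement * len(eps_values)
    edges = [pair for pair, count in votes.items() if count > needed]
    return connected_components(n, edges)


points = [(0, 0), (0, 1), (0, 2.5), (10, 10), (10, 11)]
labels = ensemble_clustering(points, eps_values=[1.2, 2.0, 20.0])
print(labels)
```

In this toy example no single threshold needs to be tuned: the tight threshold splits the first group, the loose one merges everything, and the vote across configurations recovers two groups.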

updated: Sat Feb 12 2022 20:47:45 GMT+0000 (UTC)

published: Mon Feb 07 2022 13:08:11 GMT+0000 (UTC)

arXiv
