Doubly Contrastive Deep Clustering

Zhiyuan Dang; Cheng Deng; Xu Yang; Heng Huang

Deep clustering yields more effective features than conventional methods and has therefore become an important technique in unsupervised learning. However, most deep clustering methods ignore the positive and negative pairs introduced by data augmentation, and with them the value of contrastive learning, which leads to suboptimal performance. In this paper, we present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive losses over both the sample view and the class view to obtain more discriminative features and competitive results. Specifically, for the sample view, we treat the class distributions of an original sample and its augmented version as a positive pair, and take one of the other augmented samples to form a negative pair. We then adopt a sample-wise contrastive loss to pull positive pairs together and push negative pairs apart. Similarly, for the class view, we build the positive and negative pairs from the sample distribution of each class. In this way, the two contrastive losses constrain the clustering results of mini-batch samples at both the sample level and the class level. Extensive experimental results on six benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art methods. In particular, on the challenging Tiny-ImageNet dataset, our method outperforms the latest compared method by 5.6%. Our code will be available at https://github.com/ZhiyuanDang/DCDC.
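
The two views described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the InfoNCE-style form of the loss, the cosine similarity, the temperature value, the use of every other augmented sample in the mini-batch as a negative, and all function names are assumptions made for illustration. Here `p` and `p_aug` are hypothetical N x K matrices of soft cluster assignments for the original and augmented samples; rows give the class distribution of each sample (sample view) and columns give the sample distribution of each class (class view).

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def contrastive_loss(anchors, views, temperature=0.5):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    anchors[i] and views[i] form the positive pair; views[j] with j != i
    are used as negatives, which is an assumption of this sketch.
    """
    a = [_normalize(v) for v in anchors]
    b = [_normalize(v) for v in views]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every view, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the positive view.
        total += log_denom - logits[i]
    return total / len(a)


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def doubly_contrastive_loss(p, p_aug, temperature=0.5):
    """Sum of a sample-view loss (rows) and a class-view loss (columns)."""
    sample_loss = contrastive_loss(p, p_aug, temperature)
    class_loss = contrastive_loss(transpose(p), transpose(p_aug), temperature)
    return sample_loss + class_loss


if __name__ == "__main__":
    # Toy assignments for 3 samples and 3 clusters (made-up numbers).
    p = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
    p_aug = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]
    print(doubly_contrastive_loss(p, p_aug))
```

In this sketch the loss is lower when each augmented sample receives the same cluster assignment as its original than when the assignments are mismatched, which is the behavior the two contrastive terms are meant to encourage.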

updated: Tue Mar 09 2021 15:15:32 GMT+0000 (UTC)

published: Tue Mar 09 2021 15:15:32 GMT+0000 (UTC)

arXiv
