Nearest Neighborhood-Based Deep Clustering for Source Data-absent Unsupervised Domain Adaptation

Song Tang; Yan Yang; Zhiyuan Ma; Norman Hendrich; Fanyu Zeng; Shuzhi Sam Ge; Changshui Zhang; Jianwei Zhang

In the classic setting of unsupervised domain adaptation (UDA), labeled source data are available during training. However, in many real-world scenarios, for reasons such as privacy protection and information security, the source data are inaccessible and only a model trained on the source domain is available. This paper proposes a novel deep clustering method for this challenging task. Aiming at dynamic clustering at the feature level, we introduce extra constraints hidden in the geometric structure of the data to assist the process. Concretely, we propose a geometry-based constraint, named semantic consistency on the nearest neighborhood (SCNNH), and use it to encourage robust clustering. To this end, we construct the nearest neighborhood of every target sample and take it as the fundamental clustering unit, building our objective on this geometry. We also develop a more SCNNH-compliant structure with an additional semantic credibility constraint, named semantic hyper-nearest neighborhood (SHNNH), and extend our method to this new geometry. Extensive experiments on three challenging UDA datasets indicate that our method achieves state-of-the-art results. The proposed method improves significantly on all datasets (adopting SHNNH increases the average accuracy by over 3.0% on the large-scale dataset). Code is available at https://github.com/tntek/N2DCX.
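The idea of pairing each target sample with its nearest neighbor and checking whether the pair shares a semantic label can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the similarity measure, the loss, or how SHNNH is built, so the cosine similarity, the toy features, the pseudo-labels, and the function names `nearest_neighbors` and `consistency_rate` are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def nearest_neighbors(features):
    """For each sample i, return the index of its most similar other sample.

    Assumption: similarity is cosine similarity in feature space.
    """
    nn = []
    for i, f in enumerate(features):
        best_j, best_sim = None, -math.inf
        for j, g in enumerate(features):
            if j == i:
                continue  # a sample is not its own neighbor
            s = cosine(f, g)
            if s > best_sim:
                best_j, best_sim = j, s
        nn.append(best_j)
    return nn


def consistency_rate(labels, nn):
    """Fraction of samples whose pseudo-label matches their nearest neighbor's."""
    agree = sum(1 for i, j in enumerate(nn) if labels[i] == labels[j])
    return agree / len(nn)


# Hypothetical target features and pseudo-labels (e.g. from a source-trained model).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 0]

nn = nearest_neighbors(features)
rate = consistency_rate(labels, nn)
print(nn, rate)
```

In this toy setup, a low consistency rate signals sample pairs that are close in feature space yet disagree semantically, which is the kind of disagreement a constraint in the spirit of SCNNH would discourage during clustering.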

updated: Tue Aug 03 2021 09:59:56 GMT+0000 (UTC)

published: Tue Jul 27 2021 04:13:59 GMT+0000 (UTC)

arXiv
