DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations

Wenhao Wang; Shengcai Liao; Fang Zhao; Cuicui Kang; Ling Shao

Existing person re-identification methods often generalize poorly, mainly because large-scale labeled training data are scarce, and labeling such data is expensive and time-consuming. To address this, this paper presents DomainMix, which, for the first time, learns a person re-identification model from both synthetic and real-world data completely without human annotations. The proposed method thus enjoys the cheap availability of large-scale training data, and, benefiting from the scalability and diversity of that data, the learned model generalizes well to unseen domains. Specifically, inspired by recent work that generates large-scale synthetic data for effective person re-identification training, in each epoch the proposed method first clusters the unlabeled real-world images and selects the reliable clusters according to three criteria: independence, compactness, and quantity. The classification layer is then initialized adaptively using the generated features of the real-world images. During training, to address the large gap between the synthetic and real-world domains, a domain-invariant feature learning method is proposed. It sets up adversarial learning between domain-invariant feature learning and domain discrimination while also learning a discriminative feature for person re-identification. In this way, the domain gap between synthetic and real-world data is greatly reduced, and the learned feature is generalizable thanks to the large-scale and diverse training data. Experimental results show that the proposed annotation-free method is roughly comparable to its counterpart trained with full human annotations, which is promising. In addition, it achieves the current state of the art on several person re-identification datasets under direct cross-dataset evaluation.
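
The per-epoch cluster selection and classifier initialization described in the abstract can be illustrated with a minimal sketch. The abstract names the three criteria (independence, compactness, quantity) but does not define them, so the proxies below are assumptions for illustration only: compactness is taken as the mean distance of members to their centroid, independence as the distance to the nearest other centroid, and quantity as the cluster size. The function names, the thresholds, and the use of L2-normalized mean features as initial classifier weights are likewise hypothetical and are not claimed to match the paper's definitions. Cluster labels are assumed to come from any clustering step, with -1 marking unclustered images.

```python
import numpy as np

def cluster_stats(features, labels):
    """Return {cluster_id: (size, compactness, independence)}; label -1 is ignored as noise."""
    ids = [int(c) for c in np.unique(labels) if c >= 0]
    centroids = {c: features[labels == c].mean(axis=0) for c in ids}
    stats = {}
    for c in ids:
        members = features[labels == c]
        # Assumed compactness proxy: mean member-to-centroid distance (lower is tighter).
        compactness = np.linalg.norm(members - centroids[c], axis=1).mean()
        # Assumed independence proxy: distance to the nearest other centroid (higher is better).
        others = [np.linalg.norm(centroids[c] - centroids[o]) for o in ids if o != c]
        independence = min(others) if others else np.inf
        stats[c] = (len(members), compactness, independence)
    return stats

def select_reliable_clusters(features, labels, min_size=4,
                             max_compactness=0.5, min_independence=2.0):
    """Keep clusters that pass all three (hypothetical) thresholds."""
    stats = cluster_stats(features, labels)
    return [c for c, (size, comp, ind) in stats.items()
            if size >= min_size and comp <= max_compactness and ind >= min_independence]

def init_classifier_weights(features, labels, kept):
    """One weight row per kept cluster: its L2-normalized mean feature (an assumption)."""
    means = np.stack([features[labels == c].mean(axis=0) for c in kept])
    return means / np.linalg.norm(means, axis=1, keepdims=True)

# Toy 2-D "features": two tight clusters, one tiny cluster, one loose cluster, one noise point.
offsets = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])
features = np.concatenate([
    np.array([4.0, 0.0]) + offsets,         # cluster 0: tight, large enough
    np.array([0.0, 4.0]) + offsets,         # cluster 1: tight, large enough
    np.array([4.0, 4.0]) + offsets[:2],     # cluster 2: only two images
    np.array([8.0, 8.0]) + 10.0 * offsets,  # cluster 3: too spread out
    np.array([[20.0, 20.0]]),               # unclustered image
])
labels = np.array([0] * 4 + [1] * 4 + [2] * 2 + [3] * 4 + [-1])

kept = select_reliable_clusters(features, labels)
weights = init_classifier_weights(features, labels, kept)
print(kept)
print(weights.round(3))
```

In this toy setup only clusters 0 and 1 pass all three checks: cluster 2 fails the quantity threshold and cluster 3 fails the compactness threshold. In a real pipeline the kept clusters would supply pseudo-identities for the real-world images, and the weight rows would seed the corresponding classifier outputs alongside the synthetic identities.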

updated: Sun Mar 21 2021 01:53:47 GMT+0000 (UTC)

published: Tue Nov 24 2020 08:15:53 GMT+0000 (UTC)

arXiv
