Learning Generalisable Omni-Scale Representations for Person Re-Identification

Kaiyang Zhou; Yongxin Yang; Andrea Cavallaro; Tao Xiang

An effective person re-identification (re-ID) model should learn feature representations that are both discriminative, for distinguishing similar-looking people, and generalisable, for deployment across datasets without any adaptation. In this paper, we develop novel CNN architectures to address both challenges. First, we present a re-ID CNN termed omni-scale network (OSNet) to learn features that not only capture different spatial scales but also encapsulate a synergistic combination of multiple scales, namely omni-scale features. The basic building block consists of multiple convolutional streams, each detecting features at a certain scale. For omni-scale feature learning, a unified aggregation gate is introduced to dynamically fuse multi-scale features with channel-wise weights. OSNet is lightweight as its building blocks comprise factorised convolutions. Second, to improve generalisable feature learning, we introduce instance normalisation (IN) layers into OSNet to cope with cross-dataset discrepancies. Further, to determine the optimal placements of these IN layers in the architecture, we formulate an efficient differentiable architecture search algorithm. Extensive experiments show that, in the conventional same-dataset setting, OSNet achieves state-of-the-art performance, despite being much smaller than existing re-ID models. In the more challenging yet practical cross-dataset setting, OSNet beats most recent unsupervised domain adaptation methods without using any target data. Our code and models are released at https://github.com/KaiyangZhou/deep-person-reid.
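The abstract names two mechanisms that are easy to picture in code: a unified aggregation gate that fuses multi-scale streams with channel-wise weights, and instance normalisation applied per sample. The following is a minimal sketch in plain Python, not the authors' implementation. The gate form (global average pooling followed by a single linear map and a sigmoid), the toy feature values, and all function names are assumptions made for illustration; the released code at the URL above is the reference.

```python
import math

def global_avg_pool(feat):
    """feat: list of channels, each a flat list of spatial activations."""
    return [sum(ch) / len(ch) for ch in feat]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate(feat, weight, bias):
    """Channel-wise weights in (0, 1), computed from the input itself.
    Assumed form: sigmoid(W @ pooled + b)."""
    pooled = global_avg_pool(feat)
    return [sigmoid(sum(w * p for w, p in zip(row, pooled)) + b)
            for row, b in zip(weight, bias)]

def fuse_streams(streams, weight, bias):
    """Sum the streams after scaling each channel by its gate value.
    The same (weight, bias) is shared by every stream, which is what
    makes the gate 'unified' in this sketch."""
    n_ch, n_pos = len(streams[0]), len(streams[0][0])
    fused = [[0.0] * n_pos for _ in range(n_ch)]
    for feat in streams:
        g = gate(feat, weight, bias)
        for c in range(n_ch):
            for i in range(n_pos):
                fused[c][i] += g[c] * feat[c][i]
    return fused

def instance_norm(feat, eps=1e-5):
    """Normalise each channel of a single sample over its spatial positions."""
    out = []
    for ch in feat:
        mean = sum(ch) / len(ch)
        var = sum((v - mean) ** 2 for v in ch) / len(ch)
        out.append([(v - mean) / math.sqrt(var + eps) for v in ch])
    return out

# Two toy streams (hypothetical values): 2 channels x 4 spatial positions each.
small_scale = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0]]
large_scale = [[4.0, 3.0, 2.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
weight = [[0.5, -0.5], [0.1, 0.3]]  # hypothetical gate parameters
bias = [0.0, 0.0]

fused = fuse_streams([small_scale, large_scale], weight, bias)
normed = instance_norm(fused)
```

Because the gate is computed from each stream's own pooled activations, the mixing weights change from image to image, which is the sense in which the fusion is dynamic. Instance normalisation removes per-sample, per-channel mean and variance, which is one plausible reading of how it helps with cross-dataset discrepancies such as lighting or camera style.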

updated: Thu Apr 29 2021 14:41:52 GMT+0000 (UTC)

published: Tue Oct 15 2019 14:44:16 GMT+0000 (UTC)

arXiv
