Deep convolutional neural networks (CNNs) have demonstrated dominant performance in person re-identification (Re-ID). Existing CNN-based methods use global average pooling (GAP) to aggregate intermediate convolutional features for Re-ID. However, this strategy considers only the first-order statistics of local features and treats local features at all locations as equally important, which leads to sub-optimal feature representations. To address these issues, we propose a novel weighted bilinear coding (WBC) framework for local feature aggregation in CNNs that yields more representative and discriminative feature representations and can be integrated into other state-of-the-art methods to improve their performance. Specifically, bilinear coding encodes channel-wise feature correlations to capture richer feature interactions. In addition, a weighting scheme is applied to the bilinear coding to adaptively adjust the weights of local features at different locations according to their importance for recognition, further improving the discriminability of the aggregated features. To handle spatial misalignment, we use a salient part net (a spatial attention module) to derive salient body parts and apply the WBC model to each part. The final representation, formed by concatenating the WBC-encoded features of each part, is both discriminative and resistant to spatial misalignment. Experiments on three benchmarks, Market-1501, DukeMTMC-reID and CUHK03, demonstrate the favorable performance of our method against other state-of-the-art methods.
updated: Wed Jan 08 2020 14:39:21 GMT+0000 (UTC)
published: Thu Mar 22 2018 20:51:26 GMT+0000 (UTC)
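
As a rough illustration of the aggregation step described in the abstract, the sketch below contrasts GAP (first-order, uniform weights) with a weighted sum of outer products over local features. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names are hypothetical, the location weights are supplied by hand rather than learned, the signed square root and L2 normalization are a common bilinear-pooling convention assumed here, and the salient part net is omitted. Plain Python is used to keep the example dependency-free.

```python
import math

def global_average_pooling(features):
    """First-order baseline: average the C-dim local features over all locations."""
    n = len(features)
    return [sum(channel) / n for channel in zip(*features)]

def weighted_bilinear_coding(features, weights):
    """Second-order aggregation: B = sum_i w_i * x_i x_i^T, flattened to C*C values.

    features: list of N local feature vectors (each of length C), i.e. a
              flattened H*W spatial map.
    weights:  list of N non-negative location weights (hand-set here; the
              abstract describes them as adaptively adjusted).
    """
    c = len(features[0])
    b = [[0.0] * c for _ in range(c)]
    for x, w in zip(features, weights):
        for p in range(c):
            for q in range(c):
                b[p][q] += w * x[p] * x[q]  # channel-wise correlation, weighted by location
    flat = [v for row in b for v in row]
    # Assumption: signed square root followed by L2 normalization.
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy input: N = 3 locations, C = 2 channels (illustrative values only).
features = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights = [0.5, 0.25, 0.25]

gap = global_average_pooling(features)             # length C
wbc = weighted_bilinear_coding(features, weights)  # length C * C
```

Setting every weight to the same value reduces this to ordinary (unweighted) bilinear pooling up to scale, which is one way to see what the weighting scheme adds on top of the second-order statistics.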