Open Access

Improving Semantic Part Features for Person Re-identification with Supervised Non-local Similarity

Yifan Sun, Zhaopeng Dou, Yali Li, Shengjin Wang
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.

Abstract

In the person re-identification (re-ID) task, learning part-level features benefits from fine-grained information. To facilitate part alignment, which is a prerequisite for learning part-level features, a popular approach is to detect semantic parts using human parsing or pose estimation. Such semantic partition methods do offer cues for good part alignment, but they are prone to noisy part detection, especially when employed in an off-the-shelf manner. In response, this paper proposes a novel part feature learning method for re-ID that suppresses the impact of noisy semantic part detection through Supervised Non-local Similarity (SNS) learning. Given several detected semantic parts, SNS first locates their center points on the convolutional feature maps to serve as a set of anchors, and then evaluates the similarity between each anchor and every pixel on the feature maps. The non-local similarity learning is supervised so that each anchor is similar to itself and dissimilar to every other anchor, which yields the SNS. Finally, each anchor absorbs features from all similar pixels on the convolutional feature maps to generate a corresponding part feature (the SNS feature). We evaluate our method with extensive experiments under both holistic and partial re-ID scenarios. Experimental results confirm that SNS consistently improves re-ID accuracy when using either human parsing or pose estimation, and that our results are on par with state-of-the-art methods.
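The pipeline described in the abstract (anchors, anchor-to-pixel similarity, supervised anchor separation, feature absorption) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the dot-product similarity, the softmax normalization over pixels, the cross-entropy form of the supervision, and the function names `sns_part_features` and `anchor_supervision_loss` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def log_softmax(x, axis=-1):
    """Numerically stable log-softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def gather_anchor_features(feature_map, anchors):
    """Read the feature vector at each anchor location.

    feature_map: array of shape (C, H, W).
    anchors: list of (row, col) part centers on the feature map.
    Returns an array of shape (K, C).
    """
    return np.stack([feature_map[:, row, col] for row, col in anchors])


def sns_part_features(feature_map, anchors):
    """Hypothetical sketch: each anchor absorbs features from similar pixels.

    Assumption: similarity is a dot product between the anchor feature and
    each pixel feature, normalized with a softmax over all pixels.
    """
    channels, height, width = feature_map.shape
    pixels = feature_map.reshape(channels, height * width).T    # (H*W, C)
    anchor_feats = gather_anchor_features(feature_map, anchors)  # (K, C)
    similarity = np.exp(log_softmax(anchor_feats @ pixels.T, axis=1))
    part_feats = similarity @ pixels                             # (K, C)
    return part_feats, similarity.reshape(len(anchors), height, width)


def anchor_supervision_loss(feature_map, anchors):
    """Hypothetical supervision: each anchor should match itself and no other.

    Assumption: cross-entropy over the anchor-to-anchor similarity matrix,
    with the identity matrix as the target.
    """
    anchor_feats = gather_anchor_features(feature_map, anchors)
    log_probs = log_softmax(anchor_feats @ anchor_feats.T, axis=1)  # (K, K)
    return float(-np.mean(np.diag(log_probs)))
```

In this sketch, calling `sns_part_features` on a `(C, H, W)` feature map with `K` anchors returns a `(K, C)` array of part features and a `(K, H, W)` array of similarity maps whose entries sum to 1 per anchor. In a real system the loss would be minimized jointly with the re-ID objective inside a deep learning framework, rather than evaluated on fixed features as here.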

Tsinghua Science and Technology
Pages 636-646
Cite this article:
Sun Y, Dou Z, Li Y, et al. Improving Semantic Part Features for Person Re-identification with Supervised Non-local Similarity. Tsinghua Science and Technology, 2020, 25(5): 636-646. https://doi.org/10.26599/TST.2019.9010024


Received: 22 January 2019
Accepted: 23 May 2019
Published: 16 March 2020
© The author(s) 2020

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
