Research Article | Open Access

SAM-driven MAE pre-training and background-aware meta-learning for unsupervised vehicle re-identification

Dong Wang 1, Qi Wang 2,3,4, Weidong Min 2,3,4 (corresponding author), Di Gai 2,3,4, Qing Han 2,3,4, Longfei Li 2, Yuhan Geng 5
1. School of Software, Nanchang University, Nanchang 330047, China
2. School of Mathematics and Computer Science, Nanchang University, Nanchang 330031, China
3. Institute of Metaverse, Nanchang University, Nanchang 330031, China
4. Jiangxi Key Laboratory of Smart City, Nanchang 330031, China
5. School of Public Health, University of Michigan, Ann Arbor 48109, USA

Abstract

Distinguishing identity-unrelated background information from discriminative identity information is a challenge in unsupervised vehicle re-identification (Re-ID): Re-ID models suffer from varying degrees of background interference caused by continuous scene variations. The recently proposed segment anything model (SAM) has demonstrated exceptional performance on zero-shot segmentation tasks, and combining SAM with a vehicle Re-ID model enables efficient separation of vehicle identity and background information. This paper proposes a method that combines SAM-driven masked autoencoder (MAE) pre-training with background-aware meta-learning for unsupervised vehicle Re-ID. The method consists of three sub-modules. First, the segmentation capability of SAM is used to separate the vehicle identity region from the background. Because SAM is not robust in exceptional situations, such as those with ambiguity or occlusion, a spatially constrained vehicle background segmentation method is presented for the vehicle Re-ID downstream task to obtain accurate background segmentation results. Second, SAM-driven MAE pre-training uses these segmentation results to select patches belonging to the vehicle and to mask the remaining patches, allowing the MAE to learn identity-sensitive features in a self-supervised manner. Finally, a background-aware meta-learning method is presented to fit varying degrees of background interference across scenarios by combining different background region ratios. Our experiments demonstrate that the proposed method achieves state-of-the-art performance in reducing background interference variations.
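As a rough illustration (not the authors' implementation), the patch-selection step of the SAM-driven MAE pre-training can be sketched in NumPy: given a binary vehicle/background mask, each ViT patch is kept as a visible token only if enough of its pixels belong to the vehicle. The function name, the 16-pixel patch size, and the 0.5 keep threshold are illustrative assumptions; in the actual pipeline the binary mask would come from the paper's spatially constrained SAM segmentation.

```python
import numpy as np

def sam_driven_patch_keep(vehicle_mask: np.ndarray, patch_size: int = 16,
                          keep_threshold: float = 0.5) -> np.ndarray:
    """Decide which ViT patches stay visible to the MAE encoder.

    vehicle_mask: binary (H, W) array, 1 = vehicle pixel, 0 = background.
    Returns a boolean (H//P, W//P) grid: True = patch is kept (vehicle),
    False = patch is masked out during MAE pre-training.
    """
    h, w = vehicle_mask.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image size must be divisible by patch size"
    # Average the binary mask over each P x P patch -> fraction of vehicle pixels.
    patch_ratio = vehicle_mask.reshape(h // p, p, w // p, p).mean(axis=(1, 3))
    return patch_ratio >= keep_threshold

# Toy example: a 64x64 mask whose left half is vehicle, right half background.
mask = np.zeros((64, 64), dtype=np.float32)
mask[:, :32] = 1.0
keep = sam_driven_patch_keep(mask, patch_size=16)
print(keep)  # 4x4 grid: left two columns True, right two columns False
```

The kept patches would then be fed to the MAE encoder while the background patches are excluded from reconstruction, so the self-supervised objective concentrates on identity-relevant vehicle regions.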


Computational Visual Media
Pages 771-789
Cite this article:
Wang D, Wang Q, Min W, et al. SAM-driven MAE pre-training and background-aware meta-learning for unsupervised vehicle re-identification. Computational Visual Media, 2024, 10(4): 771-789. https://doi.org/10.1007/s41095-024-0424-2


Received: 04 January 2024
Accepted: 03 March 2024
Published: 15 August 2024
© The Author(s) 2024.

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
