Research Article | Open Access

Scale variant vehicle object recognition by CNN module of multi-pooling-PCA process

Yuxiang Guo (1), Itsuo Kumazawa (1), Chuyo Kaku (2)
(1) Department of Information and Communications Engineering, Tokyo Institute of Technology, Tokyo 152-8550, Japan
(2) Research and Development Center, Jiangsu Chaoli Electric Manufacture Co., Ltd., Shanghai 212321, China

Abstract

Moving vehicles appear at different scales in an image because perspective makes their apparent size depend on the viewing distance. Advanced driver assistance systems (ADAS) for safety surveillance and safe driving depend on early identification of vehicle targets in front of the ego vehicle. Recognizing the same vehicle at different scales requires learning scale-invariant features. Unlike existing feature-vector methods, the proposed approach extracts scale-invariant features from the normalized PCA eigenvalues computed from feature maps. This study proposes a convolutional neural network (CNN) structure with an embedded multi-pooling-PCA module for scale-variant object recognition. The proposed network structure is validated on a scale-variant vehicle image dataset. Compared with the scale-invariant methods scale-invariant feature transform (SIFT) and FSAF, as well as miscellaneous other networks, the proposed network achieves the best recognition accuracy on the scale-variant vehicle dataset. To assess the practicality of the modified network, it is also tested on the public ImageNet dataset, and the comparable results indicate its effectiveness for general-purpose applications.
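
As a rough illustration of what "normalized PCA eigenvalues calculated from feature maps" could mean in practice, the NumPy sketch below pools a feature map at several window sizes and, for each pooled map, computes the eigenvalues of the channel covariance matrix normalized to sum to one. This is not the authors' implementation. The function names, the use of average pooling, the pool sizes (1, 2, 4), and the choice to treat each spatial location as one sample are all assumptions made here for illustration; the abstract does not specify them.

```python
import numpy as np

def avg_pool(fmap, k):
    """Non-overlapping k x k average pooling on a (C, H, W) array (assumed pooling type)."""
    c, h, w = fmap.shape
    h2, w2 = h // k, w // k
    cropped = fmap[:, :h2 * k, :w2 * k]          # drop rows/cols that do not fill a window
    return cropped.reshape(c, h2, k, w2, k).mean(axis=(2, 4))

def normalized_pca_eigenvalues(fmap, eps=1e-12):
    """Eigenvalues of the channel covariance, sorted descending and normalized to sum to ~1.

    Assumes C >= 2 channels; each spatial location is treated as one sample.
    """
    c = fmap.shape[0]
    samples = fmap.reshape(c, -1)                # (C, H*W): rows are variables
    cov = np.cov(samples, bias=True)             # (C, C) covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # eigvalsh returns ascending order
    eigvals = np.clip(eigvals, 0.0, None)        # remove tiny negative round-off values
    return eigvals / (eigvals.sum() + eps)

def multi_pool_pca_descriptor(fmap, pool_sizes=(1, 2, 4)):
    """Concatenate normalized eigenvalue spectra over several (hypothetical) pool sizes."""
    return np.concatenate(
        [normalized_pca_eigenvalues(avg_pool(fmap, k)) for k in pool_sizes]
    )

# Usage: a random 4-channel 8x8 feature map yields a 12-dimensional descriptor.
rng = np.random.default_rng(0)
fmap = rng.normal(size=(4, 8, 8))
descriptor = multi_pool_pca_descriptor(fmap)
```

Normalizing by the eigenvalue sum removes any dependence on the overall magnitude of the feature map, and the covariance itself is unchanged when every spatial sample is replicated (as in nearest-neighbour upsampling), which gives some intuition for why such a spectrum might be less sensitive to object scale than the raw feature vector.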

Journal of Intelligent and Connected Vehicles
Pages 227-236
Cite this article:
Guo Y, Kumazawa I, Kaku C. Scale variant vehicle object recognition by CNN module of multi-pooling-PCA process. Journal of Intelligent and Connected Vehicles, 2023, 6(4): 227-236. https://doi.org/10.26599/JICV.2023.9210017


Received: 25 June 2023
Revised: 18 July 2023
Accepted: 12 August 2023
Published: 30 December 2023
© The author(s) 2023.

This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
