



LAG-YOLO: Efficient road damage detector via lightweight attention ghost module

Junxin Chen (a), Xiaojie Yu (a), Qiankun Li (b), Wei Wang (c), Ben-Guo He (d)

(a) School of Software, Dalian University of Technology, Dalian 116621, China
(b) Department of Automation, University of Science and Technology of China, Hefei 230027, China
(c) Guangdong–Hong Kong–Macao Joint Laboratory for Emotion Intelligence and Pervasive Computing, Artificial Intelligence Research Institute, Shenzhen MSU-BIT University, Shenzhen 518038, China
(d) Key Laboratory of Ministry of Education on Safe Mining of Deep Metal Mines, Northeastern University, Shenyang 110819, China

Abstract

Road damage detection plays an important role in ensuring road safety and improving traffic flow. The rapid progress of artificial intelligence (AI) offers new opportunities for this field. In this paper, we introduce lightweight attention ghost-you only look once (LAG-YOLO), an efficient deep-learning network for road damage detection. LAG-YOLO streamlines the YOLO network structure, making it better suited for real-time processing and lightweight deployment while maintaining high accuracy. In addition, a novel module called attention ghost is designed, which reduces model parameters and improves performance via the parameter-free simple attention module (SimAM). LAG-YOLO cuts the parameter count to 4.19 million while delivering mean average precision (mAP) scores of 45.80% on the Hualu dataset and 52.35% on the RDD2020 dataset. In summary, the proposed network performs well on extensive road damage datasets with fewer parameters, making it well suited for practical deployment.
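The attention ghost module described above builds on SimAM, which assigns every activation a weight from a closed-form energy function and introduces no learnable parameters. The paper itself does not include code; the sketch below is a minimal, pure-Python illustration of SimAM's published weighting formula on a single-channel feature map (the function name and the list-of-lists layout are ours for illustration, not from the paper).

```python
import math

def simam(feature, lam=1e-4):
    """Rescale a single-channel H x W feature map with SimAM attention.

    Each activation x receives the energy-based score
        e = (x - mu)^2 / (4 * (var + lam)) + 0.5,
    where mu and var are the mean and variance over the channel, and is
    then scaled by sigmoid(e), so activations far from the mean are
    emphasized. No learnable parameters are involved.
    """
    flat = [v for row in feature for v in row]
    mu = sum(flat) / len(flat)
    var = sum((v - mu) ** 2 for v in flat) / len(flat)

    def weight(x):
        e = (x - mu) ** 2 / (4.0 * (var + lam)) + 0.5
        return 1.0 / (1.0 + math.exp(-e))  # sigmoid

    return [[v * weight(v) for v in row] for row in feature]
```

In LAG-YOLO this weighting would be applied inside the attention ghost module, per channel over the spatial dimensions; production code operates on 4-D tensors rather than nested lists.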

Keywords: deep learning, attention mechanism, road damage, lightweight detector
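The 4.19-million-parameter figure stems largely from replacing standard convolutions with ghost-style ones: a ghost module produces only a fraction of the output channels with a full convolution and derives the rest from cheap depthwise operations. A back-of-the-envelope weight count under GhostNet's published formulation illustrates the saving (bias terms omitted; the layer sizes below are illustrative, not LAG-YOLO's actual configuration):

```python
def conv_params(c_in, c_out, k):
    # Weight count of a standard k x k convolution (biases omitted).
    return c_out * c_in * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    # Ghost module: a primary conv yields c_out // s intrinsic channels,
    # then (s - 1) cheap d x d depthwise ops generate the remaining
    # "ghost" channels, which are concatenated with the intrinsic ones.
    intrinsic = c_out // s
    primary = intrinsic * c_in * k * k   # full convolution
    cheap = (s - 1) * intrinsic * d * d  # depthwise "cheap" operations
    return primary + cheap

print(conv_params(64, 128, 3))   # 73728
print(ghost_params(64, 128, 3))  # 37440, roughly a 2x reduction for s = 2
```

For s = 2 the weight count is nearly halved per layer, which is where most of the network-wide parameter reduction comes from.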


Publication history

Received: 23 October 2023
Revised: 11 November 2023
Accepted: 19 November 2023
Published: 20 February 2024
Issue date: March 2024

Copyright

© The Author(s) 2024. Published by Tsinghua University Press.

Acknowledgements


This work was funded by the National Natural Science Foundation of China (Nos. 52222810 and 62171114) and the Fundamental Research Funds for the Central Universities (No. DUT22RC(3)099).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
