Research Article | Open Access

LDTR: Transformer-based lane detection with anchor-chain representation

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Didi Chuxing, Beijing 100081, China


Abstract

Despite recent advances in lane detection methods, scenarios in which lanes offer limited or no visual cues, owing to factors such as lighting conditions and occlusion, remain challenging and crucial for automated driving. Moreover, current lane representations require complex post-processing and struggle with specific lane instances. Inspired by the DETR architecture, we propose LDTR, a transformer-based model that addresses these issues. Lanes are modeled with a novel anchor-chain representation that regards a lane as a whole from the outset, which enables LDTR to handle special lanes inherently. To enhance lane instance perception, LDTR incorporates a novel multi-referenced deformable attention module that distributes attention around the object. LDTR additionally adopts two line IoU algorithms to improve convergence efficiency and employs a Gaussian heatmap auxiliary branch to enhance its representation capability during training. To evaluate lane detection models, we rely on the Fréchet distance, the parameterized F1-score, and additional synthetic metrics. Experimental results demonstrate that LDTR achieves state-of-the-art performance on well-known datasets.
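The body of the paper is not reproduced on this page, but the discrete Fréchet distance named in the abstract as an evaluation metric is a standard polyline similarity measure (the Eiter–Mannila dynamic program). The sketch below shows how a predicted lane and a ground-truth lane, each given as an ordered point chain, might be compared; the function name, the sample coordinates, and the matching threshold `tau` are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def discrete_frechet(p: np.ndarray, q: np.ndarray) -> float:
    """Discrete Fréchet distance between two polylines.

    p, q: arrays of shape (n, 2) and (m, 2) holding ordered
    (x, y) points, e.g., lane curves sampled along their length.
    """
    n, m = len(p), len(q)
    # Pairwise Euclidean distances between every point of p and q.
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    # ca[i, j] = Fréchet distance between prefixes p[:i+1] and q[:j+1].
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):            # first column: advance along p only
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):            # first row: advance along q only
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            # Advance along p, q, or both; keep the cheapest history,
            # but never drop below the current point-pair distance.
            ca[i, j] = max(min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1]),
                           d[i, j])
    return float(ca[-1, -1])

# Hypothetical usage: match a prediction to a ground-truth lane.
pred = np.array([[10.0, 0.0], [11.0, 5.0], [13.0, 10.0]])
gt = np.array([[10.5, 0.0], [11.5, 5.0], [12.5, 10.0]])
tau = 1.0                            # assumed matching threshold (pixels)
is_match = discrete_frechet(pred, gt) < tau
```

A threshold of this kind is what turns the distance into the hit/miss decisions behind an F1-style score: a prediction counts as a true positive only if its Fréchet distance to some unmatched ground-truth lane falls below `tau`.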

Electronic Supplementary Material

Video
cvm-10-4-753_ESM.mp4

Cite this article:
Yang Z, Shen C, Shao W, et al. LDTR: Transformer-based lane detection with anchor-chain representation. Computational Visual Media, 2024, 10(4): 753–769. https://doi.org/10.1007/s41095-024-0421-5
Metrics & Citations  
Article History
Copyright
Rights and Permissions
Return