Research Article | Open Access

LDTR: Transformer-based lane detection with anchor-chain representation

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Didi Chuxing, Beijing 100081, China


Abstract

Despite recent advances in lane detection methods, scenarios in which lanes offer limited or no visual cues, owing to factors such as lighting conditions and occlusion, remain challenging and crucial for automated driving. Moreover, current lane representations require complex post-processing and struggle with specific lane instances. Inspired by the DETR architecture, we propose LDTR, a transformer-based model that addresses these issues. Lanes are modeled with a novel anchor-chain representation that regards a lane as a whole from the outset, which enables LDTR to handle special lanes inherently. To enhance lane instance perception, LDTR incorporates a novel multi-referenced deformable attention module that distributes attention around the object. LDTR additionally adopts two line IoU algorithms to improve convergence efficiency and employs a Gaussian heatmap auxiliary branch to enhance its representation capability during training. To evaluate lane detection models, we rely on the Fréchet distance, the parameterized F1-score, and additional synthetic metrics. Experimental results demonstrate that LDTR achieves state-of-the-art performance on well-known datasets.
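The body of the paper is not reproduced on this page, but the discrete Fréchet distance named in the abstract as an evaluation metric is a standard polyline similarity measure (the Eiter–Mannila dynamic program). The sketch below shows how a predicted lane and a ground-truth lane, each given as an ordered point chain, might be compared; the function name, the sample coordinates, and the matching threshold `tau` are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def discrete_frechet(p: np.ndarray, q: np.ndarray) -> float:
    """Discrete Fréchet distance between two polylines.

    p, q: arrays of shape (n, 2) and (m, 2) holding ordered
    (x, y) points, e.g., lane curves sampled along their length.
    """
    n, m = len(p), len(q)
    # Pairwise Euclidean distances between every point of p and q.
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    # ca[i, j] = Fréchet distance between prefixes p[:i+1] and q[:j+1].
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):            # first column: advance along p only
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):            # first row: advance along q only
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            # Advance along p, q, or both; keep the cheapest history,
            # but never drop below the current point-pair distance.
            ca[i, j] = max(min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1]),
                           d[i, j])
    return float(ca[-1, -1])

# Hypothetical usage: match a prediction to a ground-truth lane.
pred = np.array([[10.0, 0.0], [11.0, 5.0], [13.0, 10.0]])
gt = np.array([[10.5, 0.0], [11.5, 5.0], [12.5, 10.0]])
tau = 1.0                            # assumed matching threshold (pixels)
is_match = discrete_frechet(pred, gt) < tau
```

A threshold of this kind is what turns the distance into the hit/miss decisions behind an F1-style score: a prediction counts as a true positive only if its Fréchet distance to some unmatched ground-truth lane falls below `tau`.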

Electronic Supplementary Material

Video
cvm-10-4-753_ESM.mp4

Cite this article:
Yang Z, Shen C, Shao W, et al. LDTR: Transformer-based lane detection with anchor-chain representation. Computational Visual Media, 2024, 10(4): 753–769. https://doi.org/10.1007/s41095-024-0421-5
Metrics & Citations  
Article History
Copyright
Rights and Permissions
Return