Research Article | Open Access

A three-stage real-time detector for traffic signs in large panoramas

Department of Computer Science, Purdue University, 305 N. University Street, West Lafayette, IN 47907, USA.
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.
College of Information Sciences and Technology, Penn State University, University Park, PA 16802, USA.
Department of Radiology, Duke University, Durham, NC 27705, USA.
College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China.

Abstract

Traffic sign detection is one of the key components in autonomous driving. Advanced autonomous vehicles equipped with high-quality sensors capture high-definition images for further analysis. Detecting traffic signs, moving vehicles, and lanes is important for localization and decision making. Traffic signs, especially those far from the camera, are small, and thus challenging for traditional object detection methods. In this work, to reduce computational cost and improve detection performance, we split the large input images into small blocks and then recognize traffic signs in the blocks using a separate detection module. This paper therefore proposes a three-stage traffic sign detector, which connects a BlockNet with an RPN-RCNN detection network. BlockNet, composed of a set of CNN layers, performs block-level foreground detection, making inferences in under 1 ms. The RPN-RCNN two-stage detector then identifies traffic sign objects in each block; it is trained on a derived dataset named TT100KPatch. Experiments show that our framework achieves both state-of-the-art accuracy and recall; its fastest detection speed is 102 fps.
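The block-splitting pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the block size, stride, threshold, and the stand-in scoring function are all assumptions made for the sketch (the actual BlockNet is a small CNN, and stage three runs an RPN-RCNN detector on each retained block, which is omitted here).

```python
import numpy as np

def split_into_blocks(image, block_size=512, stride=512):
    """Tile a large panorama into fixed-size blocks.
    Block size and stride are illustrative, not the paper's settings."""
    h, w = image.shape[:2]
    blocks, coords = [], []
    for y in range(0, h - block_size + 1, stride):
        for x in range(0, w - block_size + 1, stride):
            blocks.append(image[y:y + block_size, x:x + block_size])
            coords.append((x, y))
    return blocks, coords

def blocknet_score(block):
    """Placeholder for BlockNet's foreground score: here just the mean
    intensity, standing in for a small CNN's per-block output."""
    return float(block.mean()) / 255.0

def detect_foreground_blocks(image, threshold=0.5):
    """Stages 1-2 of the pipeline: tile the image, then keep only block
    positions whose foreground score exceeds the threshold.  Stage 3
    (RPN-RCNN detection inside each kept block) is omitted."""
    blocks, coords = split_into_blocks(image)
    return [c for b, c in zip(blocks, coords) if blocknet_score(b) > threshold]
```

Because only the blocks that pass the cheap first-stage filter reach the expensive RPN-RCNN stage, most of a large panorama (typically background) is discarded early, which is what makes the reported real-time speeds plausible.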

Computational Visual Media
Pages 403-416
Cite this article:
Song Y, Fan R, Huang S, et al. A three-stage real-time detector for traffic signs in large panoramas. Computational Visual Media, 2019, 5(4): 403-416. https://doi.org/10.1007/s41095-019-0152-1

Revised: 12 February 2019
Accepted: 28 June 2019
Published: 04 September 2019
© The author(s) 2019

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
