Open Access

CPT: A Configurable Predictability Testbed for DNN Inference in AVs

Department of Computer and Information Science, University of Delaware, Newark, DE 19713, USA
Department of Electrical & Computer Engineering, Northeastern University, Boston, MA 02115, USA

Abstract

Predictability is an essential challenge for the safety of autonomous vehicles (AVs). Deep neural networks (DNNs) have been widely deployed in the AV perception pipeline. However, how to guarantee perception predictability for AVs remains an open question, because there are millions of possible DNN model combinations and system configurations when deploying DNNs in AVs. This paper proposes the Configurable Predictability Testbed (CPT), a configurable testbed for quantifying the predictability of the AV perception pipeline. CPT provides flexible configuration of the perception pipeline across data, DNN models, fusion policies, scheduling policies, and predictability metrics. On top of CPT, researchers can profile and optimize predictability issues caused by different application and system configurations. CPT has been open-sourced at: https://github.com/Torreskai0722/CPT.
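
To make the configuration space described above concrete, the sketch below shows how a CPT-style experiment might be specified and swept. This is a minimal illustration only: the names (PipelineConfig, run_experiment, the model and policy strings) are hypothetical stand-ins for the configuration dimensions the abstract lists (data, DNN models, fusion policy, scheduling policy, predictability metrics), not the actual CPT API; consult the repository for the real interface.

```python
# Hypothetical sketch of a CPT-style experiment configuration.
# All names here are illustrative assumptions; see the CPT repository
# (https://github.com/Torreskai0722/CPT) for the actual API.
from dataclasses import dataclass, field

@dataclass
class PipelineConfig:
    dataset: str = "kitti"            # input data stream to replay
    detection_model: str = "yolov3"   # DNN model under test
    fusion_policy: str = "sync"       # how multi-sensor results are combined
    scheduling_policy: str = "fifo"   # scheduling of pipeline stages
    metrics: list = field(default_factory=lambda: ["latency_p99", "jitter"])

def run_experiment(cfg: PipelineConfig) -> dict:
    """Profile one pipeline configuration and return predictability metrics."""
    # A real testbed would replay the dataset through the configured
    # pipeline and record per-frame timing; this stub only shows the
    # shape of the result.
    return {m: None for m in cfg.metrics}

if __name__ == "__main__":
    # Sweep one configuration dimension while holding the rest fixed,
    # mirroring how predictability can be profiled across configurations.
    for policy in ("fifo", "round_robin", "deadline"):
        results = run_experiment(PipelineConfig(scheduling_policy=policy))
        print(policy, results)
```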

Tsinghua Science and Technology, Pages 87-99
Cite this article:
Liu L, Wang Y, Shi W. CPT: A Configurable Predictability Testbed for DNN Inference in AVs. Tsinghua Science and Technology, 2025, 30(1): 87-99. https://doi.org/10.26599/TST.2024.9010037


Received: 29 October 2023
Revised: 09 January 2024
Accepted: 07 February 2024
Published: 11 September 2024
© The Author(s) 2025.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
