Regular Paper

Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads

National Engineering Research Center for Big Data Technology and System, Huazhong University of Science and Technology, Wuhan 430074, China
Services Computing Technology and System Laboratory, Huazhong University of Science and Technology, Wuhan 430074, China
Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan 430074, China

Abstract

Computer vision (CV) algorithms are now used extensively in a wide range of applications. Because multimedia data are generally well formatted and regular, it is beneficial to leverage the massively parallel processing power of the underlying platform to improve the performance of CV algorithms. Single Instruction Multiple Data (SIMD) instructions, which apply the same operation to multiple data items in a single instruction, are widely employed to improve the efficiency of CV algorithms. In this paper, we evaluate the power and effectiveness of the RISC-V vector extension (RV-V) on typical CV algorithms, such as Gray Scale, Mean Filter, and Edge Detection. Our experiments show that, compared with the baseline OpenCV implementation using scalar instructions, equivalent implementations using RV-V (version 0.8) can reduce the instruction count of the same CV algorithm by up to 24x when processing the same input images. However, the actual performance improvement, measured in cycle counts, depends heavily on the specific implementation of the underlying RV-V co-processor. In our evaluation, using the vector co-processor (with eight execution lanes) of the Xuantie C906, the vector versions of the CV algorithms exhibit average performance speedups of up to 2.98x over their scalar counterparts.
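
To make the instruction-count argument concrete, the following is a minimal sketch, not the paper's implementation, of a Gray Scale conversion written two ways: one pixel per loop iteration, and a strip-mined loop that handles up to `max_vl` pixels per iteration in the style of a vector-length-agnostic loop. The luma weights (77, 150, 29 over 256), the function names, and the `max_vl` parameter are all assumptions made for illustration; the abstract specifies none of them. The iteration count is only a rough stand-in for dynamic instruction count and does not model RV-V 0.8 instructions or cycle-level behavior.

```python
# Integer luma weights scaled by 256 (approximating 0.299, 0.587, 0.114).
# These weights are an assumption; the abstract does not state which are used.
W_R, W_G, W_B = 77, 150, 29


def gray_scalar(rgb):
    """Convert one pixel per loop iteration, as a scalar loop would."""
    out = []
    iterations = 0
    for r, g, b in rgb:
        out.append((W_R * r + W_G * g + W_B * b) >> 8)
        iterations += 1
    return out, iterations


def gray_strip_mined(rgb, max_vl):
    """Convert up to max_vl pixels per loop iteration (strip-mining).

    max_vl is a hypothetical limit on elements per vector operation.
    """
    if max_vl <= 0:
        raise ValueError("max_vl must be positive")
    out = []
    iterations = 0
    i = 0
    n = len(rgb)
    while i < n:
        # Plays the role of a "set vector length" step: the last strip may be short.
        vl = min(max_vl, n - i)
        chunk = rgb[i:i + vl]
        # Each comprehension stands in for one vector operation over vl elements.
        reds = [p[0] for p in chunk]
        greens = [p[1] for p in chunk]
        blues = [p[2] for p in chunk]
        out.extend(
            (W_R * r + W_G * g + W_B * b) >> 8
            for r, g, b in zip(reds, greens, blues)
        )
        i += vl
        iterations += 1
    return out, iterations


if __name__ == "__main__":
    pixels = [(i % 256, (2 * i) % 256, (3 * i) % 256) for i in range(37)]
    scalar_out, scalar_iters = gray_scalar(pixels)
    vector_out, vector_iters = gray_strip_mined(pixels, max_vl=8)
    print(scalar_out == vector_out, scalar_iters, vector_iters)
```

For 37 pixels and `max_vl = 8`, the strip-mined loop runs 5 times instead of 37 while producing the same output. Fewer loop iterations do not by themselves imply proportionally fewer cycles, which is consistent with the abstract's distinction between instruction-count reduction and measured speedup.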

Electronic Supplementary Material

JCST-2101-11266-Highlights.pdf (185.7 KB)


Journal of Computer Science and Technology
Pages 807-820
Cite this article:
Li R-S, Peng P, Shao Z-Y, et al. Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads. Journal of Computer Science and Technology, 2023, 38(4): 807-820. https://doi.org/10.1007/s11390-023-1266-6


Received: 05 January 2021
Accepted: 25 May 2023
Published: 06 December 2023
© Institute of Computing Technology, Chinese Academy of Sciences 2023