Abstract
Computer vision (CV) algorithms are now used extensively across a myriad of applications. Because multimedia data are generally well-formatted and regular, it is beneficial to leverage the massively parallel processing power of the underlying platform to improve the performance of CV algorithms. Single Instruction Multiple Data (SIMD) instructions, which perform the same operation on multiple data items in a single instruction, are widely employed to improve the efficiency of CV algorithms. In this paper, we evaluate the power and effectiveness of the RISC-V vector extension (RV-V) on typical CV algorithms, such as Gray Scale, Mean Filter, and Edge Detection. Our evaluation shows that, compared with the baseline OpenCV implementation using scalar instructions, the equivalent implementations using RV-V (version 0.8) reduce the instruction count of the same CV algorithm by up to 24x when processing the same input images. However, the actual performance improvement, measured in cycle counts, depends heavily on the specific implementation of the underlying RV-V co-processor. In our evaluation, using the vector co-processor (with eight execution lanes) of the Xuantie C906, the vector versions of the CV algorithms exhibit, on average, performance speedups of up to 2.98x over their scalar counterparts.
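To illustrate why vectorization lowers the instruction count, the following is a minimal sketch in plain Python, not the paper's implementation. It models a strip-mined Gray Scale loop in which each "vector operation" handles up to `vl_max` pixels at once, loosely mirroring how a vector-length-agnostic loop requests a vector length on every iteration. The function names, the `vl_max` parameter, the default of 8 elements per operation, and the integer luma weights (77, 150, 29, a common fixed-point approximation of 0.299, 0.587, 0.114) are all assumptions made for this example. The operation counter only counts loop iterations; it does not model real instruction counts or cycles.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def luma(r: int, g: int, b: int) -> int:
    # Fixed-point luma approximation (assumed weights; they sum to 256).
    return (77 * r + 150 * g + 29 * b) >> 8


def gray_scalar(pixels: List[Pixel]) -> Tuple[List[int], int]:
    """Scalar baseline: one loop iteration per pixel."""
    out: List[int] = []
    ops = 0
    for r, g, b in pixels:
        out.append(luma(r, g, b))
        ops += 1
    return out, ops


def gray_strip_mined(pixels: List[Pixel], vl_max: int = 8) -> Tuple[List[int], int]:
    """Strip-mined model: one loop iteration per chunk of up to vl_max pixels.

    vl_max is a hypothetical number of elements handled per vector operation.
    """
    out: List[int] = []
    ops = 0
    i = 0
    n = len(pixels)
    while i < n:
        # Request a vector length for the remaining elements; the last
        # chunk may be shorter, so no separate tail loop is needed.
        vl = min(vl_max, n - i)
        chunk = pixels[i:i + vl]
        # Conceptually a single vector operation applied to the whole chunk.
        out.extend(luma(r, g, b) for r, g, b in chunk)
        ops += 1
        i += vl
    return out, ops
```

For a 20-pixel input with `vl_max=8`, the scalar loop runs 20 iterations while the strip-mined loop runs 3 (chunks of 8, 8, and 4), and both produce identical outputs. As the abstract notes, a lower operation count of this kind does not by itself determine the cycle-level speedup, which depends on the vector hardware.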
Cite this article
Li, RS., Peng, P., Shao, ZY. et al. Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads. J. Comput. Sci. Technol. 38, 807–820 (2023). https://doi.org/10.1007/s11390-023-1266-6
DOI: https://doi.org/10.1007/s11390-023-1266-6