Optimization Method for High-Performance Libraries Targeting RISC-V Vector Extension

CLC Number: TP311

    Abstract:

    Vectorization that uses SIMD hardware efficiently can accelerate high-performance libraries on CPUs. Vectorized code, however, depends on programming methods specific to the target SIMD hardware, and the programming models of different SIMD extensions vary significantly. To avoid reimplementing the same algorithm optimizations for each platform and to improve maintainability, high-performance libraries often introduce a hardware abstraction layer (HAL). Because existing SIMD instruction set extensions use fixed-length vector registers, most HALs support only fixed-length vector types and operations. Such fixed-length representations cannot accommodate the variable vector register lengths introduced by the RISC-V vector extension, and treating RISC-V vectors as fixed-length within an existing HAL design introduces unnecessary overhead and degrades performance. To address this problem, this paper proposes a HAL design method compatible with both variable-length vector extension platforms and fixed-length SIMD extension platforms. Based on this method, we redesigned and optimized the OpenCV universal intrinsics to better support RISC-V vector extension devices while remaining compatible with existing SIMD platforms. We also designed experiments comparing the OpenCV library optimized with our approach against the original version. The results show that the redesigned universal intrinsics integrate the RISC-V vector extension efficiently into the HAL optimization framework. Our method achieves a 3.93x performance improvement in core modules, significantly improving the library's performance on RISC-V devices and validating the effectiveness of the proposed method. Our work has been open-sourced and merged into the OpenCV source code, which demonstrates its practicality and application value.
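
The contrast between fixed-length and variable-length vector abstractions can be illustrated with a small simulation. The sketch below is not the paper's design and does not use the OpenCV API; `ScalableBackend` and `hal_add` are hypothetical names introduced only for illustration. It assumes a HAL in which the lane count is queried at run time rather than fixed in the vector type, and it models the final partial iteration loosely on how RISC-V vector code shortens the active vector length instead of falling back to a scalar tail loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical HAL backend: the lane count is a run-time property of the
// device, not a compile-time constant baked into the vector type.
struct ScalableBackend {
    std::size_t vlen_bits;  // simulated vector register width in bits

    // Number of float lanes that one simulated register holds.
    std::size_t lanes() const {
        assert(vlen_bits >= 8 * sizeof(float));
        return vlen_bits / (8 * sizeof(float));
    }
};

// Element-wise add written once against the run-time lane count.
// The inner loop stands in for a single vector instruction operating on
// `active` lanes. The last iteration simply uses fewer active lanes, so
// no separate scalar tail loop is needed.
inline std::vector<float> hal_add(const ScalableBackend& hw,
                                  const std::vector<float>& a,
                                  const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    const std::size_t step = hw.lanes();
    for (std::size_t i = 0; i < a.size(); i += step) {
        const std::size_t active = std::min(step, a.size() - i);
        for (std::size_t k = 0; k < active; ++k) {
            out[i + k] = a[i + k] + b[i + k];
        }
    }
    return out;
}
```

Calling `hal_add` with `ScalableBackend{128}` and with `ScalableBackend{512}` on the same inputs yields the same result; only the number of loop iterations changes. That is the property a length-agnostic abstraction aims for: one source-level kernel that adapts to whatever register width the device reports.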

Citation:

韩柳彤, 张洪滨, 邢明杰, 武延军, 赵琛. Optimization Method for High-Performance Libraries Targeting RISC-V Vector Extension. 软件学报 (Journal of Software), 2025, 36(9):0

History
  • Received: August 26, 2024
  • Revised: November 20, 2024
  • Online: December 10, 2024