ParaC: A Domain Programming Framework of Image Processing on GPU Accelerators

Fund Project:

National Natural Science Foundation of China (61432018, 61402445, 61502452, 61602443); National Key R&D Program of China (2016YFB1000402); State Key Laboratory of Mathematical Engineering and Advanced Computing Open Foundation (2016A03); Beijing Municipal Science & Technology Commission Program (D161100001216002)

    Abstract:

    GPU accelerators are the main means of speeding up image processing algorithms. However, on the same GPU, a naïve implementation and a highly optimized one frequently differ in performance by an order of magnitude or more. GPGPU platforms have complex hardware characteristics: a large number of multi-dimensional, multi-level threads and a deep memory hierarchy whose levels differ in capacity, bandwidth, latency, and access permissions. Image processing algorithms, in turn, involve complex operations, border-handling rules, and memory access patterns. As a result, the parallel execution model of tasks, the organization of threads, and the mapping of parallel tasks to devices strongly affect not only scalability, scheduling, communication, and synchronization, but also memory access efficiency. In short, optimizing algorithms for GPGPU platforms by hand is difficult, complicated, and inefficient. This paper proposes ParaC, a domain-specific language that provides high-level program semantics through new language extensions. ParaC collects the software characteristics of an application, such as its operation information, the data reuse among parallel tasks, and its memory access patterns, and combines them with hardware platform information and an optimization mechanism driven by prior domain knowledge to generate high-performance GPGPU code automatically. A source-to-source compiler then outputs standard OpenCL programs. Experimental results on the test cases show that the optimized code generated automatically by ParaC achieves a speedup of up to 3.22x over the hand-tuned versions in the best case, while requiring only 1.2% to 39.68% as many lines of code.
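
    To make the border-handling and data-reuse concerns in the abstract concrete, the following is a minimal Python sketch of a 3x3 mean filter with a clamp-to-edge border rule. It is not ParaC or OpenCL code and is not taken from the paper; the names `clamp` and `box_filter_3x3` and the choice of clamp-to-edge are assumptions made purely for illustration.

```python
def clamp(i, lo, hi):
    """Restrict index i to the closed range [lo, hi]."""
    return max(lo, min(i, hi))


def box_filter_3x3(img):
    """Naive 3x3 mean filter over a list-of-lists image.

    Illustrative assumption: neighbours that fall outside the image
    are replaced by the nearest edge pixel (clamp-to-edge).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):        # on a GPU, each (y, x) would be one work-item
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = clamp(y + dy, 0, h - 1)  # border rule, rows
                    xx = clamp(x + dx, 0, w - 1)  # border rule, columns
                    acc += img[yy][xx]
            out[y][x] = acc / 9.0
    return out
```

    Each output pixel reads nine inputs, so neighbouring outputs share most of their reads. On a GPU, that overlap is the kind of data reuse among parallel tasks that an optimized version would exploit, for example by staging a tile of the image in faster on-chip memory.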

Get Citation

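Lu Xingjing, Liu Lei, Jia Haipeng, Feng Xiaobing, Wu Chenggang. ParaC: A Domain Programming Framework of Image Processing on GPU Accelerators. Journal of Software, 2017, 28(7): 1655-1675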

History
  • Received: September 5, 2016
  • Revised: October 14, 2016
  • Online: November 26, 2016