An Iterative Clustering Based Approach for Parallel Performance Analysis
Authors: Zhu Peng, Li Wei, Li Yunchun

    Abstract:

    As supercomputers grow to several hundred thousand CPU cores, the applications that run on them are becoming increasingly complex. To optimize their source code, developers of parallel applications need to measure application performance and analyze the results. However, the sharp increase in core counts means that performance measurement produces vast amounts of data, and handling this data is a critical problem for parallel performance analysis. This paper proposes a new approach, the Iterative based Clustering Approach for Parallel Performance Analysis (ICAPPA). In this approach, clustering, a data mining technique suited to processing massive data, is applied iteratively: when certain conditions hold, the result of the previous clustering is clustered again, in order to identify the functions and processes that dominate parallel performance. The Bayesian Information Criterion (BIC) is used to evaluate each clustering result, and the BIC score determines whether applying a further round of clustering to that result is reliable. Finally, the validity of the approach is verified by experimental analysis.
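
    The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' ICAPPA implementation. It assumes one scalar metric per process (for example, time spent in one function), uses plain k-means, scores each candidate clustering with a common spherical-Gaussian form of BIC (higher is better), and re-clusters the group with the largest mean until BIC no longer favors a split. The function names, the quantile seeding, the `k_max`, `min_size`, and `max_rounds` parameters, and the "largest mean is dominant" rule are all hypothetical choices made for illustration.

```python
import math


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars; quantile seeding keeps runs reproducible."""
    data = sorted(values)
    n = len(data)
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return [c for c in clusters if c]  # drop empty clusters


def bic_score(clusters):
    """BIC as log-likelihood minus a complexity penalty (assumed formulation)."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    if n <= k:
        return float("-inf")  # too few points to estimate a variance
    means = [sum(c) / len(c) for c in clusters]
    sse = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    variance = max(sse / (n - k), 1e-12)  # floor avoids log(0)
    log_lik = (sum(len(c) * math.log(len(c) / n) for c in clusters)
               - n / 2 * math.log(2 * math.pi * variance)
               - (n - k) / 2)
    n_params = (k - 1) + k + 1  # mixing weights, centers, shared variance
    return log_lik - n_params / 2 * math.log(n)


def best_clustering(values, k_max=4):
    """Try k = 1..k_max and keep the clustering with the highest BIC."""
    limit = min(k_max, len(values))
    candidates = [kmeans_1d(values, k) for k in range(1, limit + 1)]
    return max(candidates, key=bic_score)


def iterative_dominant(values, min_size=4, max_rounds=5):
    """Re-cluster the largest-mean group until BIC prefers a single cluster."""
    current = list(values)
    for _ in range(max_rounds):
        if len(current) < min_size:
            break
        clusters = best_clustering(current)
        if len(clusters) == 1:  # BIC does not support a further split
            break
        current = max(clusters, key=lambda c: sum(c) / len(c))
    return current


# Hypothetical per-process times (seconds) for one function.
times = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 9.8, 10.1, 10.0, 9.9]
dominant = iterative_dominant(times)
```

    On this toy input the BIC prefers two groups in the first round, and the second round stops because splitting the four slow processes further is not supported by the score. A real analysis would use multi-dimensional profiles and whatever conditions the paper actually defines for repeating the clustering.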

Get Citation

Zhu Peng, Li Wei, Li Yunchun. An iterative clustering based approach for parallel performance analysis. Journal of Software, 2010, 21(zk): 284-289

History
  • Received: June 15, 2010
  • Revised: December 10, 2010