A User-Level Simulator for Tiled Multi-Core Processors
Funding:

Supported by the National Natural Science Foundation of China under Grant No. 60673146; the National Natural Science Foundation of China for Distinguished Young Scholars under Grant No. 60325205; the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2006AA010201; the National Basic Research Program of China (973 Program) under Grant No. 2005CB321600; the Beijing Natural Science Foundation of China under Grant No. 4072024; and the Knowledge Innovation Program of the Institute of Computing Technology, Chinese Academy of Sciences, under Grant No. 20066012.

    Abstract:

    As on-chip transistor budgets grow and interconnect wire delay increases, tiled multi-core processors have become a new direction in multi-core processor design. To support in-depth architectural study and design-space exploration of this new class of processor, this paper designs and implements a user-level multi-core performance simulator for the tiled CMP (chip multiprocessor) architecture. Built on the Godson-2 single-processor core, the simulator fully models a directory-based cache coherence protocol and a store-and-forward network-on-chip, and captures in detail the timing behavior that results from the out-of-order handling of requests and responses and from conflicts among requests. By running sequential or parallel workloads, it can be used to evaluate the important performance metrics of a multi-core processor, providing a fast, flexible, and efficient research platform for multi-core architecture design.

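Cite this article

Huang Kun (黄琨), Ma Ke (马可), Zeng Hongbo (曾洪博), Zhang Ge (张戈), Zhang Longbing (章隆兵). A user-level simulator for tiled multi-core processors. Journal of Software (软件学报), 2008, 19(4): 1069-1080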

History
  • Received: 2007-02-05
  • Last revised: 2007-05-24