Processor Performance of Main Memory Databases under the TPC-H Workload
Fund Project:

Supported by the National Natural Science Foundation of China under Grant Nos. 60496325, 60473069, 60503038; and by a grant from HP Labs China (international cooperation project)


Main Memory Database TPC-H Workload Characterization on Modern Processor

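    Abstract (translated from the Chinese):

    In 1999, Ailamaki et al. studied the breakdown of the execution time of database management systems (DBMSs) on processors. Since then, related research has concentrated on analyzing DBMS bottlenecks on processors. However, all of this work was carried out on disk resident databases (DRDBs), and all of it analyzed TPC-C-style workloads. With advances in hardware, the memory hierarchy of modern computers is gradually "moving up": on-chip and off-chip caches keep growing, as do RAM and flash memory. Processor workload analysis should therefore "move up" as well. This paper studies the processor behavior of main memory resident databases (MMDBs) under compute-intensive workloads. The main performance bottleneck of a disk database is disk I/O, so it can be optimized with techniques such as indexing and compression; the bottleneck of a main memory database, by contrast, lies in the data exchange between processor and memory. To address this problem, the paper first analyzes how the processor performance bottlenecks of disk databases and main memory databases differ under the TPC-H workload, gives some optimization suggestions, and proposes an optimization method based on prefetching. Second, it compares experimentally how different storage architectures (row store and column store) differ in processor utilization, and explores architectural solutions for the next generation of main memory databases. It also studies the effect of index structures on the processor's multi-level caches and gives suggestions for index optimization. Finally, it proposes a micro-benchmark suite for evaluating the processor performance and behavior of main memory databases under DSS (decision support system) workloads. The results provide experimental evidence for the architecture design and performance optimization of main memory databases running on next-generation processors.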

    Abstract:

    In 1999, Ailamaki et al. analyzed the execution time breakdown of database systems on modern computer platforms. The primary motivation of such studies has been to improve the performance of disk resident databases (DRDBs), which have formed the mainstream of database systems until now. The typical benchmark used in those studies is TPC-C. However, continuing hardware advancements have "moved up" the memory hierarchy: larger and larger on-chip and off-chip caches, steadily increasing RAM capacity, and the commercial availability of large flash memory (solid-state disk) on top of regular disk. To reflect this trend, workload characterization research should also move up the memory hierarchy. This paper focuses on main memory databases (MMDBs) and the TPC-H benchmark. Unlike the performance of a DRDB, which is I/O bound and may be optimized by high-level mechanisms such as indexing, the performance of an MMDB is basically CPU and memory bound. The paper first compares the execution time breakdown of DRDBs and MMDBs and proposes a strategy for optimizing memory-resident aggregation. It then explores the difference between column-oriented and row-oriented storage models in CPU and cache utilization. Furthermore, it measures the performance of MMDBs on different generations of CPUs. In addition, it analyzes the influence of indexes and gives a strategy for main memory database index optimization. Finally, it analyzes each query in the full TPC-H benchmark in detail and obtains systematic results, which help in designing micro-benchmarks for further analysis of CPU cache stalls. The results of this study are expected to benefit the performance optimization of MMDBs and the architecture design of next-generation memory-oriented databases.
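
As a rough illustration of why the storage model can matter for cache utilization, the following Python sketch counts how many distinct cache lines an aggregate over a single field would touch under a row-oriented and a column-oriented layout. It is a simplified model, not the measurement method used in the paper: the 64-byte cache line, the function names, and the row and field sizes in the usage note are all assumptions chosen for illustration.

```python
CACHE_LINE = 64  # bytes; assumed cache-line size, not taken from the paper


def lines_touched_row_store(n_rows, row_bytes, field_offset, field_bytes):
    """Distinct cache lines read when one field is summed and each
    value sits inside its own row (hypothetical row-oriented layout)."""
    touched = set()
    for r in range(n_rows):
        start = r * row_bytes + field_offset
        end = start + field_bytes - 1
        # A field may straddle a cache-line boundary, so record every line it covers.
        touched.update(range(start // CACHE_LINE, end // CACHE_LINE + 1))
    return len(touched)


def lines_touched_column_store(n_rows, field_bytes):
    """Distinct cache lines read when the field's values are stored
    contiguously (hypothetical column-oriented layout)."""
    total_bytes = n_rows * field_bytes
    return (total_bytes + CACHE_LINE - 1) // CACHE_LINE  # ceiling division


row_lines = lines_touched_row_store(1000, 128, 0, 8)
col_lines = lines_touched_column_store(1000, 8)
```

With these assumed sizes (1000 rows of 128 bytes, one 8-byte field), the row layout touches one cache line per row while the column layout packs eight values into each line, so the scan reads far fewer lines. Real behavior also depends on hardware prefetching and on how many fields a query reads, which this model ignores.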

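Cite this article

Liu Dawei, Luan Hua, Wang Shan, Qin Biao. Processor performance of main memory databases under the TPC-H workload. Journal of Software, 2008, 19(10): 2573-2584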

History
  • Received: 2007-07-20
  • Last revised: 2008-01-29