[关键词]
[摘要]
副本复制是数据网格中提高数据访问效率的有效方法,如何提高副本复制的效率是一个关键性问题.现有的复制策略大多基于文件访问历史选择高价值副本进行复制,但其针对的都是节点已经访问过的文件.通过对虚拟组织文件访问特性进行深入分析,引入隐性高价值文件概念,提出虚拟组织副本协作预取机制(cooperative replica prefetching mechanism,简称CoRPM),使得本地节点通过与虚拟组织中其他节点进行协作来获取隐性高价值文件副本.该机制首先给出了副本协作预取架构,各个虚拟组织节点上的文件预取模块以协作的方式为虚拟组织内节点提供文件预取服务;然后,在副本协作预取架构的基础上设计了副本协作预取流程,其核心算法包括以作业类型为中心的本地文件预取算法和预取文件选择算法.模拟实验结果表明,CoRPM 与已有的基于文件访问历史的副本复制策略相结合,可以更加有效地降低数据访问延迟.
[Key word]
[Abstract]
Replication is an effective method for improving data access efficiency in Data Grids, and improving the efficiency of replication itself is a crucial problem. Most existing replication strategies use file access history to select high-value replicas to replicate, but they consider only files that the node has already accessed. This paper starts with an analysis of file access characteristics in the virtual organization (VO). After introducing the concept of the implicit high-value file (IHVF), it proposes the cooperative replica prefetching mechanism (CoRPM) for virtual organizations, through which a local grid node can obtain replicas of IHVFs by cooperating with other grid nodes in the same VO. The architecture of CoRPM is presented first, in which prefetching elements running on every grid node work cooperatively to provide a file prefetching service for all the grid nodes in the VO. Then, on the basis of this architecture, the CoRPM process is described, whose core algorithms include a job-type-centric local file prefetching algorithm and a prefetching file selection algorithm. Finally, the paper evaluates the performance of CoRPM through simulations, and the results show that CoRPM, combined with existing replication strategies based on file access history, reduces file access latency more effectively.
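[CLC number]
[Funding]
National Natural Science Foundation of China (60903161, 60903162, 61070161, 61003257, 61103229); National Basic Research Program of China (973 Program) (2010CB328104); Specialized Research Fund for the Doctoral Program of Higher Education (200802860031); Natural Science Foundation of Jiangsu Province (BK2008030); Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201); Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education (93K-9)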