Abstract:By doing a lot of experiments, if two users have more cross-project then they will own more duplication data at a virtual desktop instrument system. So, according to this finding, this paper proposes a user-aware de-duplication algorithm. This algorithm breaks the rule of data locality and can work at the new rule of user locality. According to the new rule, it just need load one user's finger print data into memory for each user group. So it can reduce 5x~10x memory requirements than other algorithm and it can control the searching scope in a limited number for each checking besides. So this algorithm can avoid a lot of read I/O operations. Meanwhile, this algorithm can adjust the searching scope dynamically according to the current workload of VDI system. Because it always tries to get the best de-duplication rate but not affect the response time of VDI system. The prototype experimental results show that it can improve the performance of de-duplication algorithm, especially when it used in a massive data storage system. Compared with OpenDedup, the algorithm can reduce more than 200% read I/O operations and can accelerate the response time more than 3x fast when the finger print data is bigger than available memory.