Abstract: This paper proposes a multi-query optimization algorithm for pipeline-based distributed similarity query processing (pGMSQ) in a grid environment. First, when multiple query requests are submitted simultaneously, a cost-based dynamic query clustering (DQC) step is invoked to quickly and effectively identify correlations among the query spheres (requests). Next, index-supported vector set reduction is performed in parallel at the data node level. Finally, the candidate vectors are refined at the execution node level to obtain the answer set. Experiments show that, by adopting a pipeline-based technique, the algorithm is efficient and effective in minimizing response time: it decreases network transfer cost and increases throughput.
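
The three stages named in the abstract (clustering, reduction, refinement) can be illustrated with a minimal single-machine sketch. It is not the paper's method: the cost-based DQC is replaced here by a plain overlap test between query spheres, the index-supported reduction by a linear scan against one covering sphere per cluster, and the pipelined, distributed execution by sequential calls. All function names (`cluster_queries`, `covering_sphere`, `reduce_candidates`, `refine`, `multi_query`) are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Query = Tuple[Vector, float]  # a query sphere: (center, radius)


def cluster_queries(queries: List[Query]) -> List[List[int]]:
    """Group query indices whose spheres intersect (union-find).

    Assumption: overlap is used as a stand-in for the paper's
    cost-based correlation measure, which the abstract does not define.
    """
    parent = list(range(len(queries)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(queries)):
        for j in range(i + 1, len(queries)):
            (ci, ri), (cj, rj) = queries[i], queries[j]
            if math.dist(ci, cj) <= ri + rj:  # spheres touch or overlap
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(queries)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def covering_sphere(queries: List[Query], group: List[int]) -> Query:
    """One sphere that contains every query sphere in the group."""
    center = queries[group[0]][0]
    radius = max(math.dist(center, queries[q][0]) + queries[q][1] for q in group)
    return center, radius


def reduce_candidates(data: List[Vector], queries: List[Query],
                      group: List[int]) -> List[Vector]:
    """Data-node step: one coarse filter per cluster instead of one per query.

    A linear scan stands in for the index-supported reduction.
    """
    center, radius = covering_sphere(queries, group)
    return [v for v in data if math.dist(v, center) <= radius]


def refine(candidates: List[Vector], queries: List[Query],
           group: List[int]) -> Dict[int, List[Vector]]:
    """Execution-node step: exact distance check for each query in the cluster."""
    return {
        q: [v for v in candidates if math.dist(v, queries[q][0]) <= queries[q][1]]
        for q in group
    }


def multi_query(data: List[Vector], queries: List[Query]) -> Dict[int, List[Vector]]:
    """Run clustering, reduction, and refinement for all queries."""
    answers: Dict[int, List[Vector]] = {}
    for group in cluster_queries(queries):
        candidates = reduce_candidates(data, queries, group)
        answers.update(refine(candidates, queries, group))
    return answers
```

In this sketch the saving comes from sharing one candidate set among correlated queries: only the reduced set, rather than one result per query, would need to travel from a data node to an execution node.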