Abstract:Top-K query has been widely used in lots of modern applications such as search engine and e-commerce. Top-K query returns the most relative results for user from massive data, and its main purpose is to eliminate the negative effect of information overload. Top-K query on big data has brought new challenges to data management and analysis. In light of features of MapReduce, this paper presents an in-depth study of Top-K query on big data from the perspective of data partitioning and data filtering. Experimental results show that the proposed approaches have better performance and scalability.