Distance Metric Based Diversified Ranking on Large Graphs
Author:
Affiliation:

Clc Number:

TP311

Fund Project:

National Natural Science Foundation of China (61562091, 61472345);Program for the Second Batch of Yunling Scholar of Yunnan Province (C6153001);Natural Science Foundation of Yunnan Province (2014FA023, 2016FB110);Foundation of Backbone Teacher Development of Yunnan University (WX173602);Program for Excellent Young Talents of Yunnan University (XT412003);Project of Data Driven Software Engineeringinnovation Team of Yunnan University, Yunnan Province (2017HC012)

  • Article
  • | |
  • Metrics
  • |
  • Reference [20]
  • |
  • Related [20]
  • | | |
  • Comments
    Abstract:

    Expansion relevance which combines both relevance and diversity into a single function is resorted to a submodular optimization objective that can be solved by applying the classic cardinality constrained monotone submodular maximization. However, expansion relevance do not directly capture the dis-similarity over a pair of nodes. Existing submodular algorithms are sequential and not easy to take full advantage of the power of distributed cluster computing platform, such as Spark, to significantly improve the efficiency of algorithm. To tackle this issue, in this paper, a distance metric, which is defined by a sum function of personalized PageRank scores over the symmetry difference of neighbors of a pair of nodes, is first introduced to capture the pairwise dis-similarity over pairs of nodes. Then, the problem of diversified ranking on graphs is formulated as a max-sum k-dispersion problem with metrical edge weight. A polynomial time 2-approximate algorithm is proposed to solve the problem. Considering the computational independence of different pairs of nodes, a MapReduce algorithm is further developed to boost the efficiency of the process. Finally, extensive experiments are conducted on real network datasets to verify the effectiveness and efficiency of the proposed algorithm.

    Reference
    [1] Cheng XQ, Sun BJ, Shen HW, Yu ZH. Research status and trends of diversified graph ranking. Bulletin of Chinese Academy of Science, 2015,30(2):248-256(in Chinese with English abstract).[doi:10.16418/j.issn.1000-3045.2015.02.012]
    [2] Page L, Brin S, Motwani R, et al. The PageRank citation ranking:Bringing order to the Web. Stanford InfoLab, 1999.
    [3] Haveliwala TH. Topic-Sensitive pagerank. In:Proc. of the 11th Int'l Conf. on World Wide Web. ACM Press, 2002. 517-526.
    [4] Mei Q, Guo J, Radev D. DivRank:The interplay of prestige and diversity in information networks. In:Proc. of the 16th Int'l Conf. on Knowledge Discovery and Data Mining. ACM Press, 2010. 1009-1018.
    [5] Zhu X, Goldberg AB, Van Gael J, Andrzejewski D. Improving diversity in ranking using absorbing random walks. In:Proc. of the Human Language Technology Conf. of the North American Chapter of the Association of Computational Linguistics. the Association for Computational Linguistics, 2007. 97-104.
    [6] Zheng K, Wang H, Qi Z, Li JZ, Gao H. A survey of query result diversification. Knowledge & Information Systems, 2017,51:1-36.[doi:10.1007/s10115-016-0990-4]
    [7] Tong H, He J, Wen Z, Konuru R, Lin CY. Diversified ranking on large graphs:An optimization viewpoint. In:Proc. of the 17th Int'l Conf. on Knowledge Discovery and Data Mining. ACM Press, 2011. 1028-1036.
    [8] Li RH, Yu JX. Scalable diversified ranking on large graphs. IEEE Trans. on Knowledge and Data Engineering, 2013,25(9):2133-2146.[doi:10.1109/TKDE.2012.170]
    [9] Küçüktunç O, Saule E, Kaya K, Çatalyürek ÜV. Diversified recommendation on graphs:Pitfalls, measures, and algorithms. In:Proc. of the 22nd Int'l Conf. on World Wide Web. ACM Press, 2013. 715-726.
    [10] Küçüktunç O, Saule E, Kaya K, Çatalyürek ÜV. Diversifying citation recommendations. ACM Trans. on Intelligent Systems and Technology (TIST), 2015,5(4):55.[doi:10.1145/2668106]
    [11] Buchbinder N, Feldman M, Naor J, Schwartz R. Submodular maximization with cardinality constraints. In:Proc. of the 25th Annual Symp. on Discrete Algorithms. SIAM, 2014. 1433-1452.[doi:10.1137/1.9781611973402.106]
    [12] Apache Spark-Lightning-fast cluster computing. http://spark.apache.org/
    [13] Ravi SS, Rosenkrantzt DJ, Tayi GK. Approximation algorithms for facility dispersion. Gonzalez TF, ed. Handbook of Approximation Algorithms and Metaheuristics. Chapman & Hall/CRC, 2007. 38.1-38.17.[doi:10.1201/9781420010749]
    [14] Hassin R, Rubinstein S, Tamir A. Approximation algorithms for maximum dispersion. Operations Research Letters, 1997,21(3):133-137.[doi:10.1016/S0167-6377(97)00034-5]
    [15] Gollapudi S, Sharma A. An axiomatic approach for result diversification. In:Proc. of the 18th Int'l Conf. on World Wide Web. ACM Press, 2009. 381-390.
    [16] Dean J, Ghemawat S. MapReduce:Simplified data processing on large clusters. In:Proc. of the 6th Symp. on Operating System Design and Implementation. USENIX Association, 2004. 137-150.
    [17] Leskovec J, Krause A, Guestrin C, Faloutsos C, Van Briesen J, Glance N. Cost-Effective outbreak detection in networks. In:Proc. of the 13th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. ACM Press, 2007. 420-429.[doi:10.1145/1281192.1281239]
    [18] Muller E, Sanchez PI, Mulle Y, Böhm K. Ranking outlier nodes in subspaces of attributed graphs. In:Proc. of the 29th Int'l Conf. on Data Engineering, IEEE. 2013. 216-222.[doi:10.1109/ICDEW.2013.6547453]
    附中文参考文献:
    [1] 程学旗,孙冰杰,沈华伟,余智华.多样性图排序的研究现状及展望.中国科学院院刊,2015,30(2):248-256.[doi:10.16418/j.issn. 1000-3045.2015.02.012]
    Cited by
    Comments
    Comments
    分享到微博
    Submit
Get Citation

李劲,岳昆,蔡娇,张志坚,刘惟一.基于距离度量的多样性图排序方法.软件学报,2018,29(3):599-613

Copy
Share
Article Metrics
  • Abstract:3972
  • PDF: 6200
  • HTML: 3164
  • Cited by: 0
History
  • Received:August 02,2017
  • Revised:September 05,2017
  • Online: December 05,2017
You are the first2041659Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063