Efficient Algorithm on Anonymizing Social Networks with Reachability Preservation
Author:
Affiliation:

Fund Project:

National Natural Science Foundation of China (61502316, 61502317); Doctor Startup Fund of Shenyang Aerospace University of China (15YB36)

  • Article
  • | |
  • Metrics
  • |
  • Reference [33]
  • |
  • Related
  • | | |
  • Comments
    Abstract:

    As a proven effective solution to privacy preservation, graph anonymization has been studied extensively. The goal of graph anonymization is to avoid disclosure of privacy in social networks through graph modifications while at the same time preserving data utility of the anonymized graph for social network analysis and graph queries. Reachability is an important graph data utility as reachable queries are not only common on graph databases but also serving as fundamental operations for many other graph queries. However, the reachability of each vertex in the anonymized graph is severely distorted after the anonymization due to neglecting that the reachability is highly sensitive to edge modifications. This work solves the problem by designing a reachability preserving anonymization (RPA) algorithm. The main idea of RPA is to organize vertices into groups and greedily anonymizes each vertex with low impact on reachability. A number of techniques are designed to make RPA efficient. Firstly, reachable interval is proposed to efficiently measure the anonymization cost incurred by an edge addition. Secondly, an index structure, CN-index is adopted to accelerate anonymizing each vertex. Extensive experiments on real datasets demonstrate that RPA performs with high efficiency and the generated anonymized social networks preserve high data utility on reachable queries.

    Reference
    [1] Liu K, Terzi E. Towards identity anonymization on graphs. In: Proc. of the 2008 ACM SIGMOD Int'l Conf. on Management of Data. 2008. 93-106. [doi: 10.1145/1376616.1376629]
    [2] Zhou B, Pei J. Preserving privacy in social networks against neighborhood attacks. In: Proc. of the 24th IEEE Int'l Conf. on Data Engineering. 2008. 506-515. [doi: 10.1109/ICDE.2008.4497459]
    [3] Hay M, Miklau G, Jensen D, Towsley D. Resisting structural identification in anonymized social networks. In: Proc. of the 34th Int'l Conf. on Very Large Databases. 2008. 102-114. [doi: 10.14778/1453856.1453873]
    [4] Zou L, Chen L, Ozsu MT. K-Automorphism: A general framework for privacy preserving network publication. In: Proc. of the 35th Int'l Conf. on Very Large Databases. 2009. 946-957. [doi: 10.14778/1687627.1687734]
    [5] Cormode G, Srivastava D, Yu T, Zhang Q. Anonymizing bipartite graph data using safe groupings. In: Proc. of the 34th Int'l Conf. on Very Large Databases. 2008. 833-844. [doi: 10.14778/1453856.1453947]
    [6] Bhagat S, Cormode G, Krishnamurthy B, Srivastava D. Class-Based graph anonymization for social network data. In: Proc. of the 35th Int'l Conf. on Very Large Databases. 2009. 766-777. [doi: 10.14778/1687627.1687714]
    [7] Fard AM, Wang K, Yu PS. Limiting link disclosure in social network analysis through subgraph-wise perturbation. In: Proc. of the 15th Int'l Conf. on Extending Database Technology. 2012. 109-119. [doi: 10.1145/2247596.2247610]
    [8] Gao J, Xu JY, Jin R, Zhou J, Wang T, Yang D. Neighborhood-Privacy protected shortest distance computing in cloud. In: Proc. of the 2011 ACM SIGMOD Int'l Conf. on Management of Data. 2011. 409-420. [doi: 10.1145/1989323.1989367]
    [9] Wang Y, Zheng B. Preserving privacy in social networks against connection fingerprint attacks. In: Proc. of the 2015 IEEE Int'l Conf. on Data Engineering. 2015. 54-65. [doi: 10.1109/ICDE.2015.7113272]
    [10] Cheng J, Fu AWC, Liu J. K-Isomorphism: Privacy preserving network publication against structural attacks. In: Proc. of the 2010 ACM SIGMOD Int'l Conf. on Management of Data. 2010. 459-470. [doi: 10.1145/1807167.1807218]
    [11] Yuan M, Chen L, Yu PS. Personalized privacy protection in social networks. In: Proc. of the 36th Int'l Conf. on Very Large Databases. 2010. 141-150. [doi: 10.14778/1921071.1921080]
    [12] Wang L, Meng XF. Location privacy preservation in big data era: A survey. Ruan Jian Xue Bao/Journal of Software, 2014,25(4): 693-712 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4551.htm [doi: 10.13328/j.cnki.jos.004551]
    [13] Huo Z, Meng X, Zhang R. Feel free to check-in: Privacy alert against hidden location inference attacks in GeoSNs. In: Proc. of the 18th Int'l Conf. on Database Systems for Advanced Applications. 2013. 377-391. [doi: 10.1007/978-3-642-37487-6_29]
    [14] Liu X, Yang X. Protecting sensitive relationships against inference attacks in social networks. In: Proc. of the 17th Int'l Conf. on Database Systems for Advanced Applications. 2012. 335-350. [doi: 10.1007/978-3-642-29038-1_25]
    [15] Yuan M, Chen L, Yu PS, Yu T. Protecting sensitive labels in social network data anonymization. IEEE Trans. on Knowledge and Data Engineering, 2013,25(3):633-647. [doi: 10.1109/TKDE.2011.259]
    [16] Wang Y, Xie L, Zheng B, Lee KCK. High utility K-anonymization for social network publishing. Knowledge and Information Systems, 2014,41(3):697-725. [doi: 10.1007/s10115-013-0674-2]
    [17] Agrawal R, Borgida A, Jagadish HV. Efficient management of transitive relationships in large data and knowledge bases. In: Proc. of the 1989 ACM SIGMOD Int'l Conf. on Management of Data. 1989. 253-262. [doi: 10.1145/67544.66950]
    [18] Anyanwu K, Sheth A. r-Queries: Enabling querying for semantic associations on the semantic Web. In: Proc. of the 12th Int'l Conf. on World Wide Web. 2003. 690-699. [doi: 10.1145/775152.775249]
    [19] Wang H, He H, Yang J, Yu PS, Yu JX. Dual labeling: Answering graph reachability queries in constant time. In: Proc. of the 22nd Int'l Conf. on Data Engineering. 2006. 75-75. [doi: 10.1109/ICDE.2006.53]
    [20] Chen Y, Chen Y. An efficient algorithm for answering graph reachability queries. In: Proc. of the 20th Int'l Conf. on Data Engineering. 2008. 893-902. [doi: 10.1109/ICDE.2008.4497498]
    [21] Cheng J, Yu JX, Lin X, Wang H, Yu PS. Fast computing reachability labelings for large graphs with high compression rate. In: Proc. of the 11th Int'l Conf. on Extending Database Technology. 2008. 193-204. [doi: 10.1145/1353343.1353370]
    [22] Seufert S, Anand A, Bedathur S, Weikum G. Ferrari: Flexible and efficient reachability range assignment for graph indexing. In: Proc. of the 2013 IEEE Int'l Conf. on Data Engineering. 2013. 1009-1020. [doi: 10.1109/ICDE.2013.6544893]
    [23] Xie N, Shen D, Feng S, Kou Y, Nie T, Yu G. RIAIL: An index method for reachability query in large scale graphs. Ruan Jian Xue Bao/Journal of Software, 2014,25:213-224 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/14039.htm
    [24] Cheng J, Shang Z, Cheng H, Wang H, Yu JX. K-Reach: Who is in your small world? In: Proc. of the 2012 VLDB Endowment. 2012. 1292-1303. [doi: 10.14778/2350229.2350247]
    [25] Li M, Gao H, Zou Z. K-Reach query processing based on graph compression. Ruan Jian Xue Bao/Journal of Software, 2014,25(4): 797-812 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4567.htm [doi: 10.13328/j.cnki.jos.004567]
    [26] Cohen E, Halperin E, Kaplan H, Zwick U. Reachability and distance queries via 2-hop labels. In: Proc. of the 13th Annual ACM-SIAM Symp. on Discrete Algorithms. 2002. 937-946.
    [27] Bramandia R, Choi B, Ng WK. On incremental maintenance of 2-hop labeling of graphs. In: Proc. of the 17th Int'l Conf. on World Wide Web. 2008. 845-854. [doi: 10.1145/1367497.1367611]
    [28] Wei F. TEDI: Efficient shortest path query answering on graphs. In: Proc. of the 2010 Int'l Conf. on Management of Data. 2010. 99-110. [doi: 10.1145/1807167.1807181]
    [29] Paige R, Tarjan RE. Three partition refinement algorithms. SIAM Journal on Computing, 1987,16(6):973-989. [doi: 10.1137/02160 62]
    附中文参考文献:
    [12] 王璐,孟小峰.位置大数据隐私保护研究综述.软件学报,2014,25(4):693-712. http://www.jos.org.cn/1000-9825/4551.htm [doi: 10. 13328/j.cnki.jos.004551]
    [23] 解宁,申德荣,冯朔,寇月,聂铁铮,于戈.RIAIL:大规模图上的可达性查询索引方法.软件学报,2014,25:213-224. http://www.jos. org.cn/1000-9825/14039.htm
    [25] 李鸣鹏,高宏,邹兆年.基于图压缩的k可达查询处理.软件学报,2014,25(4):797-812. http://www.jos.org.cn/1000-9825/4567.htm [doi: 10.13328/j.cnki.jos.004567]
    Related
    Cited by
    Comments
    Comments
    分享到微博
    Submit
Get Citation

刘向宇,李佳佳,安云哲,周大海,夏秀峰.一种保持结点可达性的高效社会网络图匿名算法.软件学报,2016,27(8):1904-1921

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:December 16,2015
  • Revised:April 14,2016
  • Online: August 08,2016
You are the first2033305Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063