一种保持结点可达性的高效社会网络图匿名算法
作者:
基金项目:

国家自然科学基金(61502316,61502317);沈阳航空航天大学校博士启动金(15YB36)


Efficient Algorithm on Anonymizing Social Networks with Reachability Preservation
Author:
Fund Project:

National Natural Science Foundation of China (61502316, 61502317); Doctor Startup Fund of Shenyang Aerospace University of China (15YB36)

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [33]
  • |
  • 相似文献
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    为了保护社会网络隐私信息,提出了多种社会网络图匿名化技术.图匿名化目的在于通过图修改操作来防止隐私泄露,同时保证匿名图在社会网络分析和图查询方面的数据可用性.可达性查询是一种基本图查询操作,可达性查询精度是衡量图数据可用性的一项重要指标.然而,当前研究忽略了图匿名对结点可达性的影响,导致较大的可达性信息损失.为了保持匿名图中结点的可达性,提出了可达性保持图匿名化(reachability preserving anonymization,简称RPA)算法,其基本思想是将结点进行分组并采取贪心策略进行匿名,从而减少匿名过程中的可达性信息损失.为了保证RPA算法的实用性,针对其执行效率进行优化,首先提出采用可达区间来高效地评估边添加操作所导致的匿名损失;其次,通过采用候选邻居索引,进一步加速RPA算法对每个结点的匿名过程.基于真实社会网络数据的实验结果表明了RPA算法的高执行效率,同时验证了生成匿名图在可达性查询方面的高精度.

    Abstract:

    As a proven effective solution to privacy preservation, graph anonymization has been studied extensively. The goal of graph anonymization is to avoid disclosure of privacy in social networks through graph modifications while at the same time preserving data utility of the anonymized graph for social network analysis and graph queries. Reachability is an important graph data utility as reachable queries are not only common on graph databases but also serving as fundamental operations for many other graph queries. However, the reachability of each vertex in the anonymized graph is severely distorted after the anonymization due to neglecting that the reachability is highly sensitive to edge modifications. This work solves the problem by designing a reachability preserving anonymization (RPA) algorithm. The main idea of RPA is to organize vertices into groups and greedily anonymizes each vertex with low impact on reachability. A number of techniques are designed to make RPA efficient. Firstly, reachable interval is proposed to efficiently measure the anonymization cost incurred by an edge addition. Secondly, an index structure, CN-index is adopted to accelerate anonymizing each vertex. Extensive experiments on real datasets demonstrate that RPA performs with high efficiency and the generated anonymized social networks preserve high data utility on reachable queries.

    参考文献
    [1] Liu K, Terzi E. Towards identity anonymization on graphs. In: Proc. of the 2008 ACM SIGMOD Int'l Conf. on Management of Data. 2008. 93-106. [doi: 10.1145/1376616.1376629]
    [2] Zhou B, Pei J. Preserving privacy in social networks against neighborhood attacks. In: Proc. of the 24th IEEE Int'l Conf. on Data Engineering. 2008. 506-515. [doi: 10.1109/ICDE.2008.4497459]
    [3] Hay M, Miklau G, Jensen D, Towsley D. Resisting structural identification in anonymized social networks. In: Proc. of the 34th Int'l Conf. on Very Large Databases. 2008. 102-114. [doi: 10.14778/1453856.1453873]
    [4] Zou L, Chen L, Ozsu MT. K-Automorphism: A general framework for privacy preserving network publication. In: Proc. of the 35th Int'l Conf. on Very Large Databases. 2009. 946-957. [doi: 10.14778/1687627.1687734]
    [5] Cormode G, Srivastava D, Yu T, Zhang Q. Anonymizing bipartite graph data using safe groupings. In: Proc. of the 34th Int'l Conf. on Very Large Databases. 2008. 833-844. [doi: 10.14778/1453856.1453947]
    [6] Bhagat S, Cormode G, Krishnamurthy B, Srivastava D. Class-Based graph anonymization for social network data. In: Proc. of the 35th Int'l Conf. on Very Large Databases. 2009. 766-777. [doi: 10.14778/1687627.1687714]
    [7] Fard AM, Wang K, Yu PS. Limiting link disclosure in social network analysis through subgraph-wise perturbation. In: Proc. of the 15th Int'l Conf. on Extending Database Technology. 2012. 109-119. [doi: 10.1145/2247596.2247610]
    [8] Gao J, Xu JY, Jin R, Zhou J, Wang T, Yang D. Neighborhood-Privacy protected shortest distance computing in cloud. In: Proc. of the 2011 ACM SIGMOD Int'l Conf. on Management of Data. 2011. 409-420. [doi: 10.1145/1989323.1989367]
    [9] Wang Y, Zheng B. Preserving privacy in social networks against connection fingerprint attacks. In: Proc. of the 2015 IEEE Int'l Conf. on Data Engineering. 2015. 54-65. [doi: 10.1109/ICDE.2015.7113272]
    [10] Cheng J, Fu AWC, Liu J. K-Isomorphism: Privacy preserving network publication against structural attacks. In: Proc. of the 2010 ACM SIGMOD Int'l Conf. on Management of Data. 2010. 459-470. [doi: 10.1145/1807167.1807218]
    [11] Yuan M, Chen L, Yu PS. Personalized privacy protection in social networks. In: Proc. of the 36th Int'l Conf. on Very Large Databases. 2010. 141-150. [doi: 10.14778/1921071.1921080]
    [12] Wang L, Meng XF. Location privacy preservation in big data era: A survey. Ruan Jian Xue Bao/Journal of Software, 2014,25(4): 693-712 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4551.htm [doi: 10.13328/j.cnki.jos.004551]
    [13] Huo Z, Meng X, Zhang R. Feel free to check-in: Privacy alert against hidden location inference attacks in GeoSNs. In: Proc. of the 18th Int'l Conf. on Database Systems for Advanced Applications. 2013. 377-391. [doi: 10.1007/978-3-642-37487-6_29]
    [14] Liu X, Yang X. Protecting sensitive relationships against inference attacks in social networks. In: Proc. of the 17th Int'l Conf. on Database Systems for Advanced Applications. 2012. 335-350. [doi: 10.1007/978-3-642-29038-1_25]
    [15] Yuan M, Chen L, Yu PS, Yu T. Protecting sensitive labels in social network data anonymization. IEEE Trans. on Knowledge and Data Engineering, 2013,25(3):633-647. [doi: 10.1109/TKDE.2011.259]
    [16] Wang Y, Xie L, Zheng B, Lee KCK. High utility K-anonymization for social network publishing. Knowledge and Information Systems, 2014,41(3):697-725. [doi: 10.1007/s10115-013-0674-2]
    [17] Agrawal R, Borgida A, Jagadish HV. Efficient management of transitive relationships in large data and knowledge bases. In: Proc. of the 1989 ACM SIGMOD Int'l Conf. on Management of Data. 1989. 253-262. [doi: 10.1145/67544.66950]
    [18] Anyanwu K, Sheth A. r-Queries: Enabling querying for semantic associations on the semantic Web. In: Proc. of the 12th Int'l Conf. on World Wide Web. 2003. 690-699. [doi: 10.1145/775152.775249]
    [19] Wang H, He H, Yang J, Yu PS, Yu JX. Dual labeling: Answering graph reachability queries in constant time. In: Proc. of the 22nd Int'l Conf. on Data Engineering. 2006. 75-75. [doi: 10.1109/ICDE.2006.53]
    [20] Chen Y, Chen Y. An efficient algorithm for answering graph reachability queries. In: Proc. of the 20th Int'l Conf. on Data Engineering. 2008. 893-902. [doi: 10.1109/ICDE.2008.4497498]
    [21] Cheng J, Yu JX, Lin X, Wang H, Yu PS. Fast computing reachability labelings for large graphs with high compression rate. In: Proc. of the 11th Int'l Conf. on Extending Database Technology. 2008. 193-204. [doi: 10.1145/1353343.1353370]
    [22] Seufert S, Anand A, Bedathur S, Weikum G. Ferrari: Flexible and efficient reachability range assignment for graph indexing. In: Proc. of the 2013 IEEE Int'l Conf. on Data Engineering. 2013. 1009-1020. [doi: 10.1109/ICDE.2013.6544893]
    [23] Xie N, Shen D, Feng S, Kou Y, Nie T, Yu G. RIAIL: An index method for reachability query in large scale graphs. Ruan Jian Xue Bao/Journal of Software, 2014,25:213-224 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/14039.htm
    [24] Cheng J, Shang Z, Cheng H, Wang H, Yu JX. K-Reach: Who is in your small world? In: Proc. of the 2012 VLDB Endowment. 2012. 1292-1303. [doi: 10.14778/2350229.2350247]
    [25] Li M, Gao H, Zou Z. K-Reach query processing based on graph compression. Ruan Jian Xue Bao/Journal of Software, 2014,25(4): 797-812 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4567.htm [doi: 10.13328/j.cnki.jos.004567]
    [26] Cohen E, Halperin E, Kaplan H, Zwick U. Reachability and distance queries via 2-hop labels. In: Proc. of the 13th Annual ACM-SIAM Symp. on Discrete Algorithms. 2002. 937-946.
    [27] Bramandia R, Choi B, Ng WK. On incremental maintenance of 2-hop labeling of graphs. In: Proc. of the 17th Int'l Conf. on World Wide Web. 2008. 845-854. [doi: 10.1145/1367497.1367611]
    [28] Wei F. TEDI: Efficient shortest path query answering on graphs. In: Proc. of the 2010 Int'l Conf. on Management of Data. 2010. 99-110. [doi: 10.1145/1807167.1807181]
    [29] Paige R, Tarjan RE. Three partition refinement algorithms. SIAM Journal on Computing, 1987,16(6):973-989. [doi: 10.1137/02160 62]
    附中文参考文献:
    [12] 王璐,孟小峰.位置大数据隐私保护研究综述.软件学报,2014,25(4):693-712. http://www.jos.org.cn/1000-9825/4551.htm [doi: 10. 13328/j.cnki.jos.004551]
    [23] 解宁,申德荣,冯朔,寇月,聂铁铮,于戈.RIAIL:大规模图上的可达性查询索引方法.软件学报,2014,25:213-224. http://www.jos. org.cn/1000-9825/14039.htm
    [25] 李鸣鹏,高宏,邹兆年.基于图压缩的k可达查询处理.软件学报,2014,25(4):797-812. http://www.jos.org.cn/1000-9825/4567.htm [doi: 10.13328/j.cnki.jos.004567]
    相似文献
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

刘向宇,李佳佳,安云哲,周大海,夏秀峰.一种保持结点可达性的高效社会网络图匿名算法.软件学报,2016,27(8):1904-1921

复制
分享
文章指标
  • 点击次数:5000
  • 下载次数: 6416
  • HTML阅读次数: 2766
  • 引用次数: 0
历史
  • 收稿日期:2015-12-16
  • 最后修改日期:2016-04-14
  • 在线发布日期: 2016-08-08
文章二维码
您是第19727728位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号