Data Anonymization Approach for Microdata with Relational and Transaction Attributes
Author: Gong Qiyuan, Yang Ming, Luo Junzhou
Affiliation:

Fund Project:

National Natural Science Foundation of China (61272054, 61572130, 61632008, 61320106007, 61502100, 61402104); Jiangsu Provincial Natural Science Foundation (BK20150628, BK20140648, BK20150637); Fundamental Research Funds for the Central Universities (2242014R30010); Jiangsu Provincial Key Technology R&D Program (BE2014603); Qinglan Project of Jiangsu Province; Program of Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201); Program of Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (93K-9)

    Abstract:

    When publishing datasets that contain both relational and transaction attributes (RT-data for brevity), either type of attribute may be exposed to linking attacks, so both must be anonymized. However, previous approaches incur heavy information loss when anonymizing RT-data and thus fail to preserve the utility of the datasets. To address this problem, an anonymization model, (k,l)-diversity, is proposed that ensures privacy by guaranteeing l-diversity on each equivalence class and k-anonymity on transaction data. In addition, two heuristic algorithms, APA and PAA, which anonymize RT-data in different orders, are provided to achieve (k,l)-diversity. Extensive experiments on real-world data show that APA and PAA outperform existing approaches in terms of execution time and information loss.
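
    The privacy condition named in the abstract can be illustrated with a small checker. This is only a sketch of one plausible reading, not the paper's formal definition: it assumes each record carries an already generalized relational quasi-identifier tuple, a set of transaction items, and a single sensitive value; that l-diversity means at least l distinct sensitive values per equivalence class; and that k-anonymity on transaction data means every published itemset occurs at least k times in the whole dataset. The function name `is_kl_diverse` and the record layout are hypothetical.

```python
from collections import Counter, defaultdict


def is_kl_diverse(records, k, l):
    """Check a simplified (k,l)-diversity condition (assumed reading).

    records: iterable of (qi, items, sensitive) where
      qi        - tuple of generalized relational quasi-identifier values
      items     - iterable of (generalized) transaction items
      sensitive - the record's sensitive value
    """
    records = list(records)

    # Group sensitive values by equivalence class (identical QI tuple).
    classes = defaultdict(set)
    for qi, _items, sensitive in records:
        classes[qi].add(sensitive)

    # Assumed l-diversity: at least l distinct sensitive values per class.
    l_ok = all(len(values) >= l for values in classes.values())

    # Assumed k-anonymity on transactions: each itemset appears >= k times.
    itemset_counts = Counter(frozenset(items) for _qi, items, _s in records)
    k_ok = all(count >= k for count in itemset_counts.values())

    return l_ok and k_ok


# Toy data, invented purely for illustration.
toy = [
    (("20-30", "F"), {"a", "b"}, "flu"),
    (("20-30", "F"), {"a", "b"}, "cold"),
    (("30-40", "M"), {"c"}, "flu"),
    (("30-40", "M"), {"c"}, "asthma"),
]
print(is_kl_diverse(toy, k=2, l=2))
```

    On this toy input both conditions hold for k = 2 and l = 2; raising either parameter to 3 makes the check fail, because every equivalence class and every itemset group contains exactly two records.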

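Citation

Gong QY, Yang M, Luo JZ. Data anonymization approach for microdata with relational and transaction attributes. Ruan Jian Xue Bao/Journal of Software, 2016,27(11):2828-2842 (in Chinese with English abstract).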
History
  • Received: November 09, 2015
  • Revised: February 23, 2016
  • Online: May 05, 2016