Multi-Party Privacy-Preserving Record Linkage Approach
Author:
Affiliation:

Fund Project:

National Natural Science Foundation of China (61472070, 61672142); National Grand Fundamental Research Program of China (973) (2012CB316201)

    Abstract:

    Multi-party privacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about those entities. With growing data volumes and real-world data quality issues (such as spelling errors and wrong order), scalability and fault tolerance have become the main challenges for PPRL. Most existing multi-party PPRL methods use exact matching and offer no fault tolerance. A few approximate multi-party PPRL methods do tolerate errors, but because of their low fault tolerance and high time cost, they cannot effectively find the entities common to the databases when such data quality issues are present. To tackle this problem, this paper proposes an approximate multi-party PPRL approach that combines Bloom filters, secure summation, a dynamic threshold, a check mechanism, and an improved Dice similarity function. First, a Bloom filter is used to convert each record in the databases into an array of 1s and 0s. Then, the ratio of 1 bits is calculated at each corresponding position, and the dynamic threshold and check mechanism are used to determine the matched positions. Finally, the similarity between records is calculated with the improved Dice similarity function to judge whether the records match. Experimental results show that the proposed method has good scalability and higher fault tolerance than the existing approximate multi-party PPRL method, while maintaining good precision.
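The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the dynamic threshold, the check mechanism, or the improved Dice function, so the sketch substitutes a fixed threshold and the standard multi-party Dice coefficient (number of parties times the count of positions set in every filter, divided by the total number of 1 bits). It also computes the per-position ratios in the clear, whereas the paper relies on secure summation. All function names and parameters (`bloom_encode`, `size`, `num_hashes`, and so on) are hypothetical.

```python
import hashlib


def bigrams(value):
    """Split a string into its set of overlapping 2-character q-grams."""
    s = value.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def bloom_encode(value, size=64, num_hashes=2):
    """Encode a string as a list of 0/1 bits.

    size and num_hashes are illustrative values, not taken from the paper.
    """
    bits = [0] * size
    for gram in bigrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).hexdigest()
            bits[int(digest, 16) % size] = 1
    return bits


def position_ratios(filters):
    """Fraction of parties whose bit is 1 at each position.

    Assumption: computed in the clear; the paper uses secure summation.
    """
    parties = len(filters)
    return [sum(column) / parties for column in zip(*filters)]


def matched_positions(filters, threshold=1.0):
    """Positions whose 1-bit ratio reaches the threshold.

    Assumption: a fixed threshold stands in for the paper's dynamic
    threshold and check mechanism, which the abstract does not detail.
    """
    ratios = position_ratios(filters)
    return [i for i, ratio in enumerate(ratios) if ratio >= threshold]


def dice_similarity(filters):
    """Standard multi-party Dice coefficient (not the paper's improved one)."""
    total_ones = sum(sum(f) for f in filters)
    if total_ones == 0:
        return 0.0
    common = len(matched_positions(filters, threshold=1.0))
    return len(filters) * common / total_ones


if __name__ == "__main__":
    # Three parties hold slightly different spellings of the same name.
    encoded = [bloom_encode(name) for name in ("smith", "smyth", "smith")]
    print(dice_similarity(encoded))
```

With the strict threshold of 1.0, a single misspelling at one party removes bit positions from the common set, which lowers the score for all parties. Relaxing the threshold in `matched_positions` shows how a ratio-based rule could tolerate such errors.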

Citation:

Han Shumin, Shen Derong, Nie Tiezheng, Kou Yue, Yu Ge. Multi-party privacy-preserving record linkage approach. Journal of Software, 2017,28(9):2281-2292
History
  • Received: July 11, 2016
  • Revised: November 10, 2016
  • Online: September 02, 2017