Abstract: Multi-party privacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about those entities. As data volumes grow and real-world data quality issues (such as spelling errors and wrong ordering) persist, scalability and fault tolerance have become the main challenges for PPRL. Most existing multi-party PPRL methods use exact matching and offer no fault tolerance. A few approximate PPRL methods do tolerate errors, but their low fault tolerance and high time cost prevent them from effectively finding the common entities across databases when such data quality issues are present. To address this, this paper proposes an approximate multi-party PPRL approach that combines Bloom filters, secure summation, a dynamic threshold, a check mechanism, and an improved Dice similarity function. First, a Bloom filter converts each record in the databases into an array of 1s and 0s. Then, the ratio of 1-bits is calculated at each corresponding position, and the dynamic threshold and check mechanism are used to determine the matched positions. Finally, the similarity between records is calculated with the improved Dice similarity function to decide whether the records match. Experimental results show that the proposed method scales well and achieves higher fault tolerance than the existing approximate multi-party PPRL method while maintaining good precision.
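The three steps summarized above can be illustrated with a minimal Python sketch. It is not the paper's protocol: the abstract does not specify the dynamic threshold, the check mechanism, or the improved Dice function, so the sketch substitutes a fixed `ratio_threshold` and a Dice-style score that reduces to the standard multi-party Dice coefficient when the threshold is 1.0. The per-position counts are computed in the clear here, whereas the proposed approach would obtain them through secure summation. All function names, the filter size, and the hashing scheme are assumptions made for illustration.

```python
import hashlib


def bigrams(value):
    """Split a string into overlapping, lower-cased character 2-grams."""
    text = value.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bloom_encode(value, size=64, num_hashes=2):
    """Encode a string as a list of 0/1 bits (Bloom filter over its bigrams).

    size and num_hashes are illustrative values, not taken from the paper.
    """
    bits = [0] * size
    for gram in bigrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).hexdigest()
            bits[int(digest, 16) % size] = 1
    return bits


def one_bit_ratios(filters):
    """Fraction of parties whose filter holds a 1 at each position.

    In the actual protocol these column sums would come from secure
    summation; here they are computed directly for clarity.
    """
    parties = len(filters)
    return [sum(column) / parties for column in zip(*filters)]


def approximate_dice(filters, ratio_threshold=1.0):
    """Dice-style similarity over one filter per party.

    A position counts as matched when its ratio of 1-bits reaches
    ratio_threshold (a fixed stand-in for the paper's dynamic threshold).
    The score is the share of all 1-bits that lie on matched positions;
    with ratio_threshold=1.0 this equals the standard multi-party Dice
    coefficient (parties * common 1-bits / total 1-bits).
    """
    parties = len(filters)
    total_ones = sum(sum(f) for f in filters)
    if total_ones == 0:
        return 0.0
    counts = [sum(column) for column in zip(*filters)]
    matched_ones = sum(c for c in counts if c / parties >= ratio_threshold)
    return matched_ones / total_ones


if __name__ == "__main__":
    # Three parties hold slightly different spellings of the same name.
    records = ["jonathan", "jonathon", "jonathan"]
    filters = [bloom_encode(r) for r in records]
    print("strict score: ", approximate_dice(filters, ratio_threshold=1.0))
    print("relaxed score:", approximate_dice(filters, ratio_threshold=0.6))
```

Lowering `ratio_threshold` lets a position count as matched even when one party's filter disagrees, which is the intuition behind tolerating spelling errors. A pair of records would then be declared a match when the score exceeds a chosen similarity cutoff.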