Abstract: In recent years, many countries and regions have enacted data security policies, such as the EU's General Data Protection Regulation (GDPR). These laws and regulations have aggravated the problem of data silos, making it difficult for data owners to share data. Data federation is a possible solution to this problem. It refers to multiple data owners jointly computing query tasks without disclosing their original data, by combining privacy-preserving computation techniques such as secure multi-party computation. The concept has become a research trend in recent years, and a series of representative systems have been proposed, such as SMCQL and Conclave. However, for join queries, a core operation of relational database systems, existing data federation systems still have the following problems. First, they support only a single type of join query, so it is difficult to meet query requirements under complex join conditions. Second, there is substantial room to improve algorithm performance, because existing systems often call secure computation libraries directly, which incurs high running time and communication overhead. We therefore propose a data federation join algorithm to address these issues. The main contributions of this paper are as follows. First, we design and implement multiparty-oriented federated secure operators that support a variety of operations. Second, we propose a federated theta-join algorithm and an optimization strategy that significantly reduce the cost of secure computation. Finally, we evaluate the proposed algorithm on the TPC-H benchmark dataset. The experimental results show that it reduces runtime and communication overhead by 61.33% and 95.26%, respectively, compared with the existing data federation systems SMCQL and Conclave.
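To make the notion of a "complex join condition" concrete, the following is a minimal plaintext sketch of theta-join semantics, in which rows are paired by an arbitrary predicate rather than by equality alone. It is not the paper's algorithm and performs no secure computation: the function name `theta_join`, the toy tables, and the column names are all hypothetical, chosen only for illustration. In a federated setting, the predicate evaluation would presumably be carried out under a secure protocol, which this sketch does not model.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def theta_join(
    left: List[Row],
    right: List[Row],
    theta: Callable[[Row, Row], bool],
) -> List[Row]:
    """Plaintext nested-loop theta-join: keep every row pair satisfying theta."""
    result: List[Row] = []
    for l_row in left:
        for r_row in right:
            # theta may be any comparison (<, >, !=, ...), not only equality.
            if theta(l_row, r_row):
                result.append({**l_row, **r_row})
    return result


# Hypothetical toy tables, imagined as held by two different data owners.
orders = [{"o_id": 1, "o_total": 50}, {"o_id": 2, "o_total": 200}]
limits = [{"c_id": 7, "c_limit": 100}, {"c_id": 8, "c_limit": 300}]

# A non-equality join condition: pair each order with every limit it exceeds.
over_limit = theta_join(orders, limits, lambda o, c: o["o_total"] > c["c_limit"])
print(over_limit)
```

The nested loop compares every pair of rows, so the number of predicate evaluations grows with the product of the table sizes. When each evaluation is a secure computation, this is where runtime and communication cost would tend to accumulate.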