Abstract: Similarity join is one of the most active research topics in data management and has been widely applied in many fields. However, existing similarity join methods cannot meet the growing demands of real-world applications. This paper defines the generalized bisimilarity join, a new kind of similarity join that extends the applicability of similarity join research by introducing a satisfaction operator over various data types, each with an individual threshold. Two efficient methods, SJS (sub-join set) and MFV (mapping-filtering-verification), are proposed to solve this problem. Extensive experiments on both real-world and synthetic datasets demonstrate the correctness and effectiveness of the proposed methods.