Abstract:Star-JOIN queries based on local differential privacy (LDP) have attracted a lot of attention from researchers in recent years. Existing Star-JOIN queries based on the OLH mechanism and hierarchical tree structures face issues such as privacy leakage risks at the root node and the lack of guidance on selecting an appropriate τ value for the τ-truncation mechanism. To remedy the shortcomings of the existing algorithms, this study proposes an effective Star-JOIN query algorithm, longitudinal path random response for join (LPRR-JOIN), to satisfy the requirements of LDP. In the LPRR-JOIN algorithm, full advantage is taken of the longitudinal path structure of the hierarchical tree and the GRR mechanism to propose an algorithm called LPRR to perturb users’ tuples. This algorithm utilizes the combinations of nodes along the longitudinal paths of all attributes as the perturbation domain. In the LPRR-JOIN algorithm, tuples are mapped by each user to corresponding node combinations, followed by local perturbation of the mapped tuples using the GRR mechanism. To guard against frequency attacks on the fact table, the algorithm permits users to locally truncate the count of their tuples based on a threshold τ, where tuples are deleted if their count exceeds τ and supplemented if it falls below τ. Two solutions are proposed within LPRR-JOIN to compute the optimal τ value. The first is to solve the optimization equation over bias caused by τ-truncation and perturbation variance due to LPRR. The second is to obtain the distribution of τ under the constraints of LDP and compute the median value from the distribution. The LPRR-JOIN algorithm employs an overall error function constructed from the bias and perturbation variance resulting from τ-truncation to derive an optimal τ value through the optimization of the error objective function. Additionally, by integrating a user grouping strategy, the algorithm ascertains the overall distribution of τ values and identifies a suitable τ value using the median. When compared with current algorithms across three diverse multi-relational datasets, experimental outcomes demonstrate the superiority of the LPRR-JOIN algorithm in query response performance.