Abstract: Differential privacy, with its strong privacy guarantees, has been applied to random forest algorithms to address the problem of privacy leakage. However, directly applying differential privacy to a random forest severely reduces the classification accuracy of the model. To ease this tension between privacy protection and model accuracy, this paper proposes a novel differentially private random forest training algorithm, called eDPRF. Specifically, we design a decision tree construction method based on the permute-and-flip mechanism, exploiting its efficient query output to design corresponding utility functions that accurately select split features and leaf labels. In addition, we design a privacy budget allocation strategy based on the composition theorem, which improves the budget utilization of each node by drawing training subsets via sampling without replacement and adjusting the internal budgets in a differentiated manner. Finally, the privacy analysis and experimental results show that the proposed algorithm outperforms comparable algorithms in classification accuracy under the same privacy budget.
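To make the selection step concrete, the sketch below shows a generic permute-and-flip selection routine (McKenna and Sheldon, 2020) of the kind the abstract refers to for choosing split features or leaf labels. The utility function, sensitivity value, and candidate names here are illustrative assumptions, not the exact formulation used by eDPRF; the exponent uses the standard 2Δ scaling for non-monotonic utilities.

```python
import math
import random

def permute_and_flip(candidates, utility, epsilon, sensitivity):
    """Differentially private selection via permute-and-flip:
    visit candidates in uniformly random order and accept each
    with probability exp(eps * (q(r) - q_max) / (2 * sensitivity))."""
    scores = {c: utility(c) for c in candidates}
    q_max = max(scores.values())
    order = list(candidates)
    random.shuffle(order)
    for c in order:
        accept_prob = math.exp(epsilon * (scores[c] - q_max) / (2.0 * sensitivity))
        if random.random() <= accept_prob:
            return c
    # Unreachable in theory: a maximizer is accepted with probability 1.
    return order[-1]

# Illustrative usage: pick a split feature by (hypothetical) information gain
# scores, spending a per-node budget of eps_node with sensitivity delta_q.
gain = {"age": 0.31, "income": 0.27, "zipcode": 0.05}
split = permute_and_flip(list(gain), lambda f: gain[f], epsilon=0.1, sensitivity=1.0)
print("chosen split feature:", split)
```

Compared with the exponential mechanism, permute-and-flip never assigns a candidate a higher selection probability for the same budget to a worse candidate, which is the "efficient query output" advantage the abstract alludes to.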