Abstract: With the broader adoption of machine learning (ML) in security-critical fields, the demand for explainability of ML models is also increasing. Explainability aims to help people understand a model's internal working principles and decision basis, thereby increasing its reliability. However, research on understanding ML models such as random forests (RF) is still at an early stage. Given the rigor and standardization of formal methods and their wide application to ML in recent years, this work leverages formal methods and logical reasoning to develop an interpretability method for explaining the predictions of RF. Specifically, the decision-making process of RF is encoded into first-order logic formulas, and the proposed approach, centered on minimal unsatisfiable cores (MUC), provides both a local interpretation of feature importance and a counterfactual sample generation method. Experimental results on several public datasets demonstrate the high quality of the proposed feature importance measure, and the counterfactual sample generation method outperforms the state-of-the-art. Moreover, from the perspective of user friendliness, a user report can be generated from the analysis of counterfactual samples, offering suggestions that help users improve their own situation in real-life applications.
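For illustration only, the minimal sketch below shows one way a MUC-style explanation of a forest prediction could be computed; it is not the authors' implementation. It encodes a hypothetical two-tree "forest" and one instance as constraints for the Z3 SMT solver, asserts the negation of the model's prediction, and reads off an unsatisfiable core over the instance's feature assignments as the explanation.

```python
# Illustrative sketch, assuming a hypothetical two-tree forest; NOT the
# paper's implementation. Requires the z3-solver package.
from z3 import Solver, Real, Bool, Implies, And, Not, unsat

x1, x2 = Real('x1'), Real('x2')              # feature variables
vote1, vote2 = Bool('vote1'), Bool('vote2')  # per-tree votes for the "positive" class

# Hypothetical trees encoded as first-order constraints over the features.
tree1 = vote1 == And(x1 > 3, x2 <= 5)
tree2 = vote2 == (x1 > 2)
forest_positive = And(vote1, vote2)          # here: both trees must vote positive

s = Solver()
s.add(tree1, tree2)
s.add(Not(forest_positive))                  # assert the prediction does NOT hold

# Track each feature assignment of the instance so it can appear in the core.
instance = {'x1 = 4': x1 == 4, 'x2 = 1': x2 == 1}
for name, literal in instance.items():
    s.assert_and_track(literal, Bool(name))

# The instance forces the positive prediction, so the constraints are unsatisfiable.
assert s.check() == unsat
# Z3's core is not guaranteed minimal; further minimization would yield a MUC,
# i.e., the feature assignments responsible for the prediction.
print([str(c) for c in s.unsat_core()])
```

In this toy setting, the core points to the feature assignments without which the prediction could be flipped, which is the intuition behind using unsatisfiable cores both for feature importance and for guiding counterfactual sample generation.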