Abstract: As emerging technologies develop rapidly, domain software places new demands on development efficiency. Datalog, a declarative programming language with concise syntax and well-defined semantics, helps developers solve complex problems quickly and has therefore attracted wide attention in recent years. However, when applied to real-world problems, existing single-machine Datalog engines are often limited by memory capacity and do not scale. To address these problems, this study designs and implements a Datalog engine based on out-of-core computing. First, a series of operators supporting out-of-core computing is designed to evaluate Datalog programs, and each Datalog program is translated into a C++ program built from these operators. Next, the study designs a hash-based partition strategy and a minimum-replacement scheduling strategy based on search tree pruning. The corresponding partition files are then scheduled and computed to produce the final results. Based on this method, the study implements the prototype tool DDL (disk-based Datalog engine) and evaluates it with widely used real-world Datalog programs on both synthetic and real-world datasets. The experimental results show that DDL achieves good performance and high scalability.
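
To make the idea of hash-based partitioning concrete, the following is a minimal sketch and not DDL's actual implementation. It assumes binary tuples and partitions a relation on its first column, using in-memory vectors as stand-ins for on-disk partition files. The names `Tuple`, `partition_of`, and `hash_partition` are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// A binary tuple such as edge(src, dst); an illustrative assumption.
using Tuple = std::pair<std::uint32_t, std::uint32_t>;

// Hypothetical helper: maps a key to a partition index in [0, num_partitions).
std::size_t partition_of(std::uint32_t key, std::size_t num_partitions) {
    assert(num_partitions > 0);
    return std::hash<std::uint32_t>{}(key) % num_partitions;
}

// Splits a relation by the hash of its first column. Tuples that share a
// key always land in the same partition, so work keyed on that column can
// proceed one partition at a time. Each inner vector stands in for a
// partition file that an out-of-core engine would keep on disk.
std::vector<std::vector<Tuple>> hash_partition(
        const std::vector<Tuple>& relation, std::size_t num_partitions) {
    std::vector<std::vector<Tuple>> parts(num_partitions);
    for (const Tuple& t : relation) {
        parts[partition_of(t.first, num_partitions)].push_back(t);
    }
    return parts;
}
```

Under these assumptions, calling `hash_partition(edges, 3)` on a vector of edge tuples yields three buckets whose sizes sum to the input size. Which bucket a given key maps to depends on the standard library's `std::hash` implementation.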