[关键词]
[摘要]
SQL是一种被广泛应用于操作关系数据库的编程语言, 很多用户(如数据分析人员和初级程序员等)由于缺少编程经验和SQL语法知识, 导致在编写SQL查询程序时会碰到各种困难. 当前, 使用程序合成方法根据<输入-输出>样例表自动生成相应的SQL查询程序, 吸引了越来越多人的关注. 所提ISST (正反例归纳合成)方法, 能够根据用户编辑的含有少量元组的<输入-输出>示例表自动合成满足用户期望的SQL查询程序. ISST方法包括5个主要阶段: 构建SQL查询程序草图、扩展工作表数据、划分正反例集合、归纳谓词和验证排序. 在PostgreSQL在线数据库上验证SQL查询程序, 并依据奥卡姆剃刀原则对已合成的SQL查询程序候选集打分排序. 使用Java语言实现了ISST方法, 并在包含28条样例的测试集上进行验证, ISST方法能正确合成其中的24条测试样例, 平均耗时2 s.
[Key word]
[Abstract]
SQL is a programming language that is widely used to operate relation databases. Many users (such as data analysts and junior programmers) will encounter various difficulties when writing SQL query programs due to the lack of programming experience and knowledge of SQL syntax. Currently, the research on the automatic synthesis of SQL query programs from the <input-output> (I/O) example tables has attracted more and more attention. The inductive SQL synthesis with positive and negative tuples (ISST) method proposed in this study can automatically synthesize SQL query programs that meet the users’ expectations by the I/O example tables edited by users and containing a small number of tuples. The ISST method contains five main stages: constructing the SQL query program sketches, expanding the worksheet data, dividing the sets of positive and negative examples, inductively synthesizing selection predicates, and sorting after verifying. The candidate set of SQL query programs is verified on the online database PostgreSQL, and the candidate set of synthesized SQL query programs is scored and sorted according to the principle of Occam’s razor. The ISST method is implemented using the Java language and then is evaluated on a test set containing 28 samples. The results reveal that the ISST method can correctly synthesize 24 of the samples, which takes an average of 2 seconds.
[中图分类号]
[基金项目]
上海市科委项目(19511132000)