Mutation-based Generation Algorithm of Negative Test Strings from Regular Expressions

doi:10.13328/j.cnki.jos.006925

2024, Vol. 35, No. 7, pp. 3355-3376


Authors:
ZHENG Li-Xiao, College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China
YU Li-Lin, College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China
CHEN Hai-Ming, State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China
CHEN Zu-Xi, College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China
LUO Xiang-Yu, College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China
WANG Xiao-Yong, CASCO Signal Ltd., Shanghai 200070, China

Author biographies:
ZHENG Li-Xiao (1983-), female, Ph.D., associate professor, CCF professional member. Research interests: software testing, formal languages and automata.
YU Li-Lin (1997-), male, Ph.D. candidate. Research interests: software engineering, data science, natural language processing.
CHEN Hai-Ming (1966-), male, Ph.D., research professor, doctoral supervisor, CCF senior member. Research interests: software design methods and formal specification, programming languages.
CHEN Zu-Xi (1981-), male, Ph.D., lecturer, CCF professional member. Research interests: formal methods.
LUO Xiang-Yu (1974-), male, Ph.D., professor, CCF professional member. Research interests: formal methods, model checking, temporal logic, multi-agent systems.
WANG Xiao-Yong (1976-), male, Ph.D., senior engineer. Research interests: information and control of transportation systems.

Corresponding author: LUO Xiang-Yu, E-mail: luoxy@hqu.edu.cn
CLC number: TP311
Funding: National Natural Science Foundation of China (61872339); Natural Science Foundation of Fujian Province (2021J01316, 2021J01320); Fundamental Research Funds for the Central Universities (ZQN-1010); Natural Science Foundation of Xiamen (3502Z20227191); Natural Science Foundation of Shanghai (22ZR1422200)

Abstract:

Regular expressions are widely used in many areas of computer science. However, because their syntax is complex and relies on a large number of metacharacters, developers easily make mistakes when defining and using them. Testing is a practical and effective way to ensure the semantic correctness of regular expressions. A common approach is to generate a set of strings from the expression under test and check whether they conform to the intended language. Most existing test data generation focuses only on positive strings. However, empirical studies show that most errors in actual development arise because the defined language is smaller than the intended one, and such errors can only be detected by negative strings. This study investigates mutation-based generation of negative test strings from regular expressions. It first obtains a set of mutants by injecting defects into the expression under test through mutation, and then selects negative strings from the complement of the language defined by the expression under test to reveal the errors simulated by the corresponding mutants. To simulate complex defect types and to avoid the problem that no negative string can be obtained due to the specialization of mutants, a second-order mutation mechanism is introduced. Meanwhile, optimization techniques such as redundant mutant elimination and mutation operator selection are used to reduce the mutants and thereby control the size of the generated test set. Experimental results show that, compared with existing tools, the proposed algorithm generates negative test sets of moderate size with strong error-revealing ability.

Key words: regular expression; regular language; string generation; mutation testing; mutant reduction
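
The core idea in the abstract (mutate the expression, then pick a string from the complement of its language that a mutant accepts) can be illustrated with a small Python sketch. This is not the authors' algorithm. The quantifier-swap operators in `QUANTIFIER_SWAPS`, the function names, and the bounded brute-force search over a tiny alphabet are all assumptions made for illustration; the second-order mutation and mutant reduction steps are not covered.

```python
import itertools
import re

# Hypothetical, simplified mutation operators: each rewrites one quantifier.
QUANTIFIER_SWAPS = {"+": ["*", "?"], "*": ["+"], "?": ["*"]}


def first_order_mutants(regex):
    """Return regexes that differ from `regex` in exactly one quantifier."""
    mutants = []
    for i, ch in enumerate(regex):
        # Naive escape check: skip characters preceded by a backslash.
        # Character classes such as [+] are not handled in this sketch.
        if i > 0 and regex[i - 1] == "\\":
            continue
        for replacement in QUANTIFIER_SWAPS.get(ch, []):
            candidate = regex[:i] + replacement + regex[i + 1:]
            try:
                re.compile(candidate)
            except re.error:
                continue  # discard syntactically invalid mutants
            mutants.append(candidate)
    return mutants


def negative_string(regex, mutant, alphabet, max_len):
    """Shortest string (up to max_len) accepted by `mutant` but rejected by
    `regex`, i.e. a string in the complement of L(regex) that exposes the
    mutant. Returns None if the bounded search finds no such string."""
    original = re.compile(regex)
    mutated = re.compile(mutant)
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            s = "".join(chars)
            if mutated.fullmatch(s) and not original.fullmatch(s):
                return s
    return None


def generate_negative_tests(regex, alphabet="ab", max_len=4):
    """Map each mutant to one negative string that distinguishes it."""
    tests = {}
    for mutant in first_order_mutants(regex):
        s = negative_string(regex, mutant, alphabet, max_len)
        if s is not None:
            tests[mutant] = s
    return tests


print(generate_negative_tests("a+b"))
# {'a*b': 'b', 'a?b': 'b'}
print(negative_string("ab*c", "ab+c", "abc", 4))
# None
```

In the first call both mutants are exposed by the same string `b`, so one test string suffices for both, which hints at why reducing redundant mutants keeps the test set small. In the second call the mutant `ab+c` accepts only strings that `ab*c` already accepts, so no negative string exists for it; this is presumably the kind of mutant specialization that the second-order mutation mechanism is meant to address.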

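Cite this article

ZHENG Li-Xiao, YU Li-Lin, CHEN Hai-Ming, CHEN Zu-Xi, LUO Xiang-Yu, WANG Xiao-Yong. Mutation-based generation algorithm of negative test strings from regular expressions. Journal of Software, 2024, 35(7): 3355-3376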


History

Received: 2022-05-24
Last revised: 2022-10-26
Published online: 2023-08-30
Publication date: 2024-07-06
