主动自动机学习中的等价查询算法优化
作者:
作者简介:

潘雁(1995-),男,博士生,主要研究领域为软件逆向,协议脆弱性分析;祝跃飞(1962-),男,博士,教授,博士生导师,主要研究领域为网络安全,密码学

通讯作者:

祝跃飞,E-mail:yfzhu17@sina.com

基金项目:

国家重点研发计划(2019QY1300)


Optimization of Equivalence Query Algorithm in Active Automata Learning
Author:
  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [47]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    模型学习是一种获取黑盒软件系统行为模型的有效方法,可分为主动学习和被动学习.主动学习是基于字母表构造测试用例,通过与黑盒系统主动交互,可在多项式时间内得到目标系统的最小完备自动机,其中等价查询仍是开发和应用主动自动机学习工具的障碍之一.通过探讨反例对于学习算法的影响,定义假设的比较规则,提出测试用例构造的两个原则,同时依据原则对Wp-method等价查询算法改进,产生更优的假设,有效降低查询的数量,并基于LearnLib开源工具,分别以3类自动机为实验对象验证原则和改进算法的有效性.

    Abstract:

    As an effective technique for black-box state machine models of software systems, model learning (a.k.a. automata learning) can be divided into active and passive learning. Based on given input and output alphabets, the minimum complete state machine of the target system can be obtained in polynomial time through active interaction with the black box system. And the algorithm of equivalence query is still a big obstacle to the development and application of active automata learning tools. This study discusses the influence of counterexamples on the learning algorithms with the discrimination tree, and defines the comparison rules of hypotheses, and proposes two principles of constructing test cases. According to the principle, the Wp-method equivalence query algorithm is improved to produce better hypotheses and effectively reduce the number of queries and symbols. Based on the LearnLib, three kinds of automata are used as experimental objects to verify the effectiveness of the principle and the improved algorithm.

    参考文献
    [1] Ali S, Sun HL, Zhao YW. Model learning: A survey of foundations, tools and applications. Frontiers of Computer Science, 2021, 15(5): 155210. [doi: 10.1007/s11704-019-9212-z]
    [2] Chalupar G, Peherstorfer S, Poll E, de Ruiter J, de Ruiter JEJ. Automated reverse engineering using LEGO. In: Proc. of the 8th USENIX Workship on Offensive Technologies. San Diego: USENIX, 2014. 1–10.
    [3] de Ruiter J, Poll E. Protocol state fuzzing of TLS implementations. In: Proc. of the 24th USENIX Conf. on Security Symp. Washington, DC: USENIX Association, 2015. 193–206.
    [4] Aslam K, Cleophas L, Schiffelers R, van den Brand M. Interface protocol inference to aid understanding legacy software components. Software and Systems Modeling, 2020, 19(6): 1519–1540. [doi: 10.1007/s10270-020-00809-2]
    [5] Gold EM. Complexity of automaton identification from given data. Information & Control, 1978, 37(3): 302–320. [doi: 10.1016/S0019-9958(78)90562-4]
    [6] Angluin D. Learning regular sets from queries and counterexamples. Information & Computation, 1987, 75(2): 87–106. [doi: 10.1016/0890-5401(87)90052-6]
    [7] Chow TS. Testing software design modeled by finite-state machines. IEEE Transactions on Software Engineering, 1978, SE-4(3): 178–187. [doi: 10.1109/TSE.1978.231496]
    [8] Fujiwara S, Bochmann GV, Khendek F, Amalou M, Ghedamsi A. Test selection based on finite state models. IEEE Transactions on Software Engineering, 1991, 17(6): 591–603. [doi: 10.1109/32.87284]
    [9] Daniel LA, Poll E, de Ruiter J. Inferring OpenVPN state machines using protocol state fuzzing. In: Proc. of the 2018 IEEE European Symp. on Security and Privacy Workshops (EuroS&PW). London: IEEE, 2018. 11–19.
    [10] Guo JX, Gu CX, Chen X, Wei FS. Model learning and model checking of IPSec implementations for internet of things. IEEE Access, 2019, 7: 171322–171332. [doi: 10.1109/ACCESS.2019.2956062]
    [11] Fiterău-Broştean P, Jonsson B, Merget R, de Ruiter J, Sagonas K, Somorovsky J. Analysis of DTLS implementations using protocol state fuzzing. In: Proc. of the 29th USENIX Security Symp. Boston: USENIX Association, 2020. 2523–2540.
    [12] Moon SJ, Helt J, Yuan YF, Bieri Y, Banerjee S, Sekar V, Wu WF, Yannakakis M, Zhang Y. Alembic: Automated model inference for stateful network functions. In: Proc. of the 16th USENIX Conf. on Networked Systems Design and Implementation. Boston: USENIX Association, 2019. 699–718.
    [13] Aarts FD. Tomte: Bridging the gap between active learning and real-world systems [Ph.D. Thesis]. Nijmegen: Radboud Universiteit, 2014.
    [14] Aarts F, Jonsson B, Uijen J, Vaandrager F. Generating models of infinite-state communication protocols using regular inference with abstraction. Formal Methods in System Design, 2015, 46(1): 1–41. [doi: 10.1007/s10703-014-0216-x]
    [15] Shahbaz M, Groz R. Inferring mealy machines. In: Cavalcanti A, Dams DR, eds. Proc. of the 2009 Int’l Symp. on Formal Methods. Eindhoven: Springer, 2009. 207–222.
    [16] Kearns MJ, Vazirani UV. An Introduction to Computational Learning Theory. Cambridge: MIT Press, 1994.
    [17] Sabnani KK, Dahbura AT. A protocol testing generation procedure. Computer Networks and ISDN Systems, 1988, 15(5): 285–297.
    [18] 崔玲, 张建标, 公备, 吴丽影. 基于集合覆盖的Wp方法测试集约简方法. 北京工业大学学报, 2016, 42(9): 1332–1337. [doi: 10.11936/bjutxb2015120046]
    Cui L, Zhang JB, Gong B, Wu LY. Approach for reduction test Suite of Wp method based on set covering problem. Journal of Beijing University of Technology, 2016, 42(9): 1332–1337 (in Chinese with English abstract). [doi: 10.11936/bjutxb2015120046]
    [19] Yang N, Aslam K, Schiffelers R, Lensink L, Hendriks D, Cleophas L, Serebrenik A. Improving model inference in industry by combining active and passive learning. In: Proc. of the 26th IEEE Int’l Conf. on Software Analysis, Evolution and Reengineering (SANER). Hangzhou: IEEE, 2019. 253–263.
    [20] Rivest RL, Schapire RE. Inference of finite automata using homing sequences. Information & Computation, 1993, 103(2): 299–347. [doi: 10.1006/inco.1993.1021]
    [21] Howar FM. Active learning of interface programs [Ph.D. Thesis]. Dortmund: TU Dortmund University, 2012.
    [22] Isberner M, Steffen B. An abstract framework for counterexample analysis in active automata learning. In: Proc. of the 12th Int’l Conf. on Grammatical Inference. Kyoto: PMLR, 2014. 79–93.
    [23] Isberner M, Howar F, Steffen B. The TTT algorithm: A redundancy-free approach to active automata learning. In: Bonakdarpour B, Smolka SA, eds. Runtime Verification. RV 2014. Lecture Notes in Computer Science, vol. 8734. Toronto: Springer, 2014. 307–322.
    [24] Isberner M. Foundations of active automata learning: An algorithmic perspective [Ph.D. Thesis]. Dortmund: TU Dortmund University, 2015.
    [25] Cassel S, Howar F, Jonsson B, Merten M, Steffen B. A succinct canonical register automaton model. Journal of Logical and Algebraic Methods in Programming, 2015, 84(1): 54–66. [doi: 10.1016/j.jlamp.2014.07.004]
    [26] Howar F, Steffen B, Jonsson B, Cassel S. Inferring canonical register automata. In: Kuncak V, Rybalchenko A, eds. Proc. of the Int’l Workshop on Verification, Model Checking, and Abstract Interpretation. Philadelphia: Springer, 2012. 251–266.
    [27] Argyros G, Stais I, Jana S, Keromytis AD, Kiayias A. SFADiff: Automated evasion attacks and fingerprinting using black-box differential automata learning. In: Proc. of the 2016 ACM SIGSAC Conf. on Computer and Communications Security. Vienna: ACM, 2016. 1690–1701.
    [28] Raffelt H, Steffen B. LearnLib: A library for automata learning and experimentation. In: Baresi L, Heckel R, eds. Proc. of the 2006 Int’l Conf. on Fundamental Approaches to Software Engineering. Vienna: Springer, 2006. 377–380.
    [29] Cassel S, Howar F, Jonsson B, Steffen B. Software Engineering and Formal Methods. Learning extended finite state machines. In: Giannakopoulou D, Salaün G, eds. Proc. of the Int’l Conf. on Software Engineering and Formal Methods. Grenoble: Springer, 2014. 250–264.
    [30] Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager F. Automata learning through counterexample guided abstraction refinement. In: Giannakopoulou D, Méry D, eds. Proc. of the Int’l Symp. on Formal Methods. Paris: Springer, 2012. 10–27.
    [31] Cho CY, Basić D, Shin ECR, Song D. Inference and analysis of formal models of botnet command and control protocols. In: Proc. of the 17th ACM Conf. on Computer and Communications Security. Chicago: Association for Computing Machinery, 2010. 426–439.
    [32] Rasool A, Alpár G, de Ruiter J. State machine inference of QUIC. arXiv:1903.04384, 2019.
    [33] Fiterău-Broştean P, Janssen R, Vaandrager F. Combining model learning and model checking to analyze TCP implementations. In: Chaudhuri S, Farzan A, eds. Proc. of the 2016 Int’l Conf. on Computer Aided Verification. Toronto: Springer, 2016. 454–471.
    [34] Verleg P. Inferring SSH state machines using protocol state fuzzing [MS. Thesis]. Nijmegen: Radboud University, 2016.
    [35] 申莹珠, 顾纯祥, 陈熹, 张协力, 卢政宇. 基于模型学习的OpenVPN系统脆弱性分析. 软件学报, 2019, 30(12): 3750–3764. http://www.jos.org.cn/1000-9825/5612.htm
    Shen YZ, Gu CX, Chen X, Zhang XL, Lu ZY. Vulnerability analysis of OpenVPN system based on model learning. Ruan Jian Xue Bao/Journal of Software, 2019, 30(12): 3750–3764 (in Chinese with English abstract) http://www.jos.org.cn/1000-9825/5612.htm.
    [36] 申莹珠. 基于模型学习的安全协议脆弱性分析关键技术研究 [硕士学位论文]. 郑州: 战略支援部队信息工程大学, 2018.
    Shen YZ. Research on key technology of security protocols vulnerability analysis based on model learning [MS. Thesis]. Zhengzhou: Information Engineering University, 2018 (in Chinese with English abstract).
    [37] 王辰, 吴礼发, 洪征, 郑成辉, 庄洪林. 一种基于域知识的协议状态机主动推断算法. 计算机科学, 2015, 42(12): 233–239.
    Wang C, Wu LF, Hong Z, Zheng CH, Zhuang HL. Domain-specific algorithm of protocol state machine active inference. Computer Science, 2015, 42(12): 233–239 (in Chinese with English abstract).
    [38] Fiterău-Broştean P, Lenaerts T, Poll E, de Ruiter J, Vaandrager F, Verleg P. Model learning and model checking of SSH implementations. In: Proc. of the 24th ACM SIGSOFT Int’l SPIN Symp. on Model Checking of Software. Santa Barbara: Association for Computing Machinery, 2017. 142–151.
    [39] Howar F, Jonsson B, Vaandrager F. Combining black-box and white-box techniques for learning register automata. In: Steffen B, Woeginger G, eds. Computing and Software Science. Lecture Notes in Computer Science, vol. 10000. Springer, 2019. 563–588.
    [40] Smetsers R, Moerman J, Janssen M, Verwer S. Complementing model learning with mutation-based fuzzing. arXiv:1611.02429, 2016.
    [41] Irfan MN, Oriat C, Groz R. Angluin style finite state machine inference with non-optimal counterexamples. In: Proc. of the 1st Int’l Workshop on Model Inference in Testing. Trento: Association for Computing Machinery, 2010. 11–19.
    [42] Türker UC, Hierons RM, Jourdan GV. Minimizing characterizing sets. Science of Computer Programming, 2021, 208: 102645. [doi: 10.1016/j.scico.2021.102645]
    [43] Aichernig BK, Tappler M. Efficient active automata learning via mutation testing. Journal of Automated Reasoning, 2019 , 63(4): 1103-1134. [doi: 10.1007/s10817-018-9486-0]
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

潘雁,祝跃飞.主动自动机学习中的等价查询算法优化.软件学报,2023,34(7):3241-3255

复制
分享
文章指标
  • 点击次数:670
  • 下载次数: 2437
  • HTML阅读次数: 1422
  • 引用次数: 0
历史
  • 收稿日期:2021-07-26
  • 最后修改日期:2021-09-27
  • 在线发布日期: 2022-09-23
  • 出版日期: 2023-07-06
文章二维码
您是第20057561位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号