1/(1-1/k)-Optimal Algorithm for Regular Expression Grouping
Author:
Affiliation:

  • Article
  • | |
  • Metrics
  • |
  • Reference [18]
  • |
  • Related [20]
  • | | |
  • Comments
    Abstract:

    Dividing regular expression sets into multiple groups is an important process to solve the problem of DFA state explosion. Previous grouping algorithms are heuristic or are done by brute-force, which have poor grouping results. This paper analyzes the reasons of states explosion and summarizes conflicting relationship among regular expressions of some types. When conflicts are non-negative and independent, the optimum k-grouping problem of regular expression sets can be reduced to the maximum k-cut problem, which is NP-hard. Based on the idea of local searching, a new grouping algorithm named GRELS is introduced to solve the problem efficiently, which is 1/(1-1/k) -approximation for maximum k-cut problem. Comparing with previous grouping algorithms, GRELS has the minimum number of states for the same number of groups, and requires the least time to update grouping results when pattern sets change.

    Reference
    [1] Thompson K. Programming techniques: Regular expression search algorithm. Communications of the ACM, 1968,11(6):419-422. [doi: 10.1145/363347.363387]
    [2] Myers G. A four Russians algorithm for regular expression pattern matching. Journal of the ACM, 1992,39(2):432-448. [doi: 10.1145/128749.128755]
    [3] Hopcroft JE, Motwani R, Ullman JD. Introduction to Automata Theory, Languages, and Computation. 2nd ed., Boston: Addison Wesley, 2000. 1-50.
    [4] Kumar S, Chandrasekaran B, Turner JS, Varghese G. Curing regular expressions matching from insomnia, amnesia, and acalculia. In: Proc. of the ANCS 2007. New York: ACM Press, 2007. 155-164. [doi: 10.1145/1323548.1323574]
    [5] Becchi M, Crowley P. A hybrid finite automaton for practical deep packet inspection. In: Proc. of the CoNEXT 2007. New York: ACM Press, 2007. [doi: 10.1145/1364654.1364656]
    [6] Smith R, Estan C, Jha S. XFA: Faster signature matching with extended automata. In: Proc. of the S&P 2008. New York: IEEE Inc., 2008. 187-201. [doi: 10.1109/SP.2008.14]
    [7] Yu F, Chen ZF, Diao YL, Lakshman TV, Katz RH. Fast and memory-efficient regular expression matching for deep packet inspection. In: Proc. of the ANCS 2006. New York: ACM Press, 2006. 93-102. [doi: 10.1145/1185347.1185360]
    [8] Xu Q, E YP, Ge JG, Qian HL. Efficient regular expression compression algorithm for deep packet inspection. Journal of Software, 2009,20(8):2214-2226 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/3311.htm [doi: 10.3724/SP.J.1001. 2009.03311]
    [9] Becchi M, Cadambi S. Memory-Efficient regular expression search using state merging. In: Proc. of the INFOCOM 2007. Piscataway: IEEE Inc., 2007. 1064-1072. [doi: 10.1109/INFCOM.2007.128]
    [10] Becchi M, Franklin M, Crowley P. A workload for evaluating deep packet inspection architectures. In: Proc. of the IISWC 2008. Piscataway: IEEE Computer Society, 2008. 79-89. [doi: 10.1109/IISWC.2008.4636093]
    [11] Majumder A, Rastogi R, Vanama S. Scalable regular expression matching on data streams. In: Proc. of the SIGMOD 2008. New York: ACM Press, 2008. 161-172. [doi: 10.1145/1376616.1376635]
    [12] Rohrer J, Atasu K, Lunteren JV, Hagleitner C. Memory-Efficient distribution of regular expressions for fast deep packet inspection. In: Proc. of the CODES+ISSS 2009. New York: ACM Press, 2009. 147-154. [doi: 10.1145/1629435.1629456]
    [13] Kumar S, Dharmapurikar S, Yu F, Crowley P, Turner J. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. ACM SIGCOMM Computer Communication Review, 2006,36(4):339-350. [doi: 10.1145/1159913.1159952]
    [14] Kumar S, Turner J, Williams J. Advanced algorithms for fast and scalable deep packet inspection. In: Proc. of the ANCS 2006. New York: ACM Press, 2006. 81-92. [doi: 10.1145/1185347.1185359]
    [15] Becchi M, Crowley P. An improved algorithm to accelerate regular expression evaluation. In: Proc. of the ANCS 2007. New York: ACM Press, 2007. 145-154. [doi: 10.1145/1323548.1323573]
    [16] Ficara D, Giordano S, Procissi G, Vitucci F, Antichi G, Pietro AD. An improved DFA for fast regular expression matching. ACM SIGCOMM Computer Communication Review, 2008,38(5):29-40. [doi: 10.1145/1452335.1452339]
    [17] Liu TW, Yang YF, Liu YB, Sun Y, Guo L. An efficient regular expressions compression algorithm from a new perspective. In: Proc. of the INFOCOM 2011. Piscataway: IEEE Inc., 2011. 2129-2137. [doi: 10.1109/INFCOM.2011.5935024]
    [18] Frieze A, Jerrum M. Improved approximation algorithms for MAX k-CUT and MAX BISECTION. Algorithmica, 1997,18(1): 67-81. [doi: 10.1007/BF02523688]
    Cited by
    Comments
    Comments
    分享到微博
    Submit
Get Citation

柳厅文,孙永,卜东波,郭莉,方滨兴.正则表达式分组的1/(1-1/k)-近似算法.软件学报,2012,23(9):2261-2272

Copy
Share
Article Metrics
  • Abstract:4630
  • PDF: 6777
  • HTML: 0
  • Cited by: 0
History
  • Received:September 17,2010
  • Revised:April 28,2011
  • Online: September 05,2012
You are the first2038288Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063