Data Stream Prediction Based on Episode Rule Matching
Author: Zhu HS, Wang W, Shi BL
Affiliation:

    Abstract:

    This paper proposes an algorithm called Predictor, which builds one automaton for each episode rule to be matched, where rules may take a general form. To find the latest minimal and non-overlapping occurrence of every antecedent, Predictor tracks the state transitions of all automata simultaneously in a single scan of the data stream. This maps the unbounded stream into a finite state space and avoids over-matching episode rules. In addition, the results of Predictor include the intervals in which future episodes are expected to occur and their occurrence probabilities. Theoretical analysis and experimental evaluation demonstrate that Predictor improves both prediction efficiency and prediction precision.
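
The matching idea in the abstract can be illustrated with a minimal sketch. It is not the paper's Predictor algorithm: it handles only serial antecedents (events in a fixed order) rather than rules of general form, and every name in it (`EpisodeRule`, `RuleAutomaton`, `predict`, the `span` and `lag` bounds) is a hypothetical choice made for illustration. It also assumes the reported probability is simply the rule's confidence. Each automaton keeps, for every antecedent prefix, the latest start time of an occurrence of that prefix, so a completed match is the latest minimal occurrence; resetting after a match keeps occurrences non-overlapping.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class EpisodeRule:
    """Hypothetical rule shape: serial antecedent => single consequent event."""
    antecedent: Tuple[str, ...]  # events that must occur in this order
    consequent: str              # event predicted to follow
    confidence: float            # assumed to serve as the occurrence probability
    span: int                    # max duration of one antecedent occurrence
    lag: int                     # consequent expected within this many time units


@dataclass(frozen=True)
class Prediction:
    event: str
    matched: Tuple[int, int]     # [start, end] of the antecedent occurrence
    interval: Tuple[int, int]    # time interval in which the event is expected
    probability: float


@dataclass
class RuleAutomaton:
    rule: EpisodeRule
    # starts[i] = latest start time of an occurrence of antecedent[0..i]
    starts: List[Optional[int]] = field(init=False)

    def __post_init__(self) -> None:
        self.starts = [None] * len(self.rule.antecedent)

    def step(self, event: str, t: int) -> Optional[Prediction]:
        ante = self.rule.antecedent
        # Discard partial matches that can no longer finish within the span.
        self.starts = [
            s if s is not None and t - s <= self.rule.span else None
            for s in self.starts
        ]
        # Walk states from last to first so one event advances each state once.
        for i in range(len(ante) - 1, -1, -1):
            if event != ante[i]:
                continue
            if i == 0:
                self.starts[0] = t  # a later start replaces an earlier one
            elif self.starts[i - 1] is not None:
                self.starts[i] = self.starts[i - 1]
        start = self.starts[-1]
        if start is None:
            return None
        # Antecedent matched: reset so the next occurrence cannot overlap it.
        self.starts = [None] * len(ante)
        return Prediction(
            event=self.rule.consequent,
            matched=(start, t),
            interval=(t + 1, t + self.rule.lag),
            probability=self.rule.confidence,
        )


def predict(rules: Iterable[EpisodeRule],
            stream: Iterable[Tuple[int, str]]) -> List[Prediction]:
    """Scan the stream once, feeding each (time, event) pair to every automaton."""
    automata = [RuleAutomaton(r) for r in rules]
    predictions: List[Prediction] = []
    for t, event in stream:
        for automaton in automata:
            p = automaton.step(event, t)
            if p is not None:
                predictions.append(p)
    return predictions


# Usage: rule "A then B within 3 time units => C within the next 2 time units".
demo_rule = EpisodeRule(("A", "B"), "C", confidence=0.8, span=3, lag=2)
demo_stream = [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "B")]
for p in predict([demo_rule], demo_stream):
    print(p)
```

In the usage stream, the second `A` overwrites the first as the candidate start, so the match reported at time 3 covers [2, 3] rather than [1, 3]. The `B` at time 5 triggers nothing because the automaton was reset after the match. Memory per rule is bounded by the antecedent length regardless of how long the stream runs, which is the sense in which an unbounded stream maps to a finite state space.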

Citation

Zhu HS, Wang W, Shi BL. Data stream prediction based on episode rule matching. Journal of Software, 2012,23(5):1183-1194

History
  • Received: March 28, 2011
  • Revised: July 21, 2011
  • Online: April 29, 2012