    Abstract:

    This paper investigates methods for online mining of frequent patterns in stream data, with the following contributions: (1) based on heuristic methodology and sampling theory, a step-by-step data stream mining method is used to estimate the potential pattern set; (2) the method finds patterns of any length, not only single-item patterns; (3) to find a more appropriate length for each segment that satisfies the accuracy requirement, Hoeffding bound theory is introduced and revised to make it more suitable for pattern mining; (4) a maintenance approach for the estimated frequent patterns is developed for online analysis. Based on this design, estimation and maintenance algorithms are proposed for efficient analysis of data streams. A performance study compares the proposed algorithms and identifies the most accuracy-, memory-, and time-efficient algorithms for stream data analysis.
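
    Contribution (3) uses a Hoeffding bound to choose the segment length. As a rough illustration only, the sketch below uses the standard one-sided Hoeffding bound, not the revised version the paper develops, which the abstract does not spell out. The function names and the parameters `epsilon`, `delta`, and `value_range` are assumptions introduced here: `epsilon` is the tolerated error in an estimated pattern frequency, `delta` is the allowed failure probability, and `value_range` is the range of the observed variable (1.0 for a 0/1 "pattern occurs in this transaction" indicator).

```python
import math


def hoeffding_epsilon(n, delta, value_range=1.0):
    """Standard one-sided Hoeffding error bound after n independent observations.

    With probability at least 1 - delta, the sample mean exceeds the true
    mean by no more than the returned epsilon.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def min_segment_length(epsilon, delta, value_range=1.0):
    """Smallest n for which hoeffding_epsilon(n, delta, value_range) <= epsilon.

    Illustrative assumption: each segment of the stream is treated as a
    sample of n transactions (not necessarily how the paper defines it).
    """
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


if __name__ == "__main__":
    # Hypothetical accuracy requirement: error at most 0.01 with 95% confidence.
    n = min_segment_length(epsilon=0.01, delta=0.05)
    print(n, hoeffding_epsilon(n, delta=0.05))
```

    Under this bound the required segment length grows with 1/epsilon^2 but only with log(1/delta), so tightening the error tolerance costs far more than raising the confidence level.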

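Citation

Song Guojie, Tang Shiwei, Yang Dongqing, Wang Tengjiao. Estimation and maintenance of frequent patterns in data streams. Journal of Software, 2004, 15(zk): 20-27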
