Survey of Online Learning Algorithms for Streaming Data Classification

Authors: Zhai Tingting, Gao Yang, Zhu Junwu

About the authors:

Zhai Tingting (1988-), female, a native of Jiyuan, Henan, Ph.D., lecturer; her main research interests include machine learning and pattern recognition. Zhu Junwu (1972-), male, Ph.D., professor, doctoral supervisor, and CCF senior member; his main research interests include knowledge engineering, ontology, mechanism design, and cloud computing. Gao Yang (1972-), male, Ph.D., professor, doctoral supervisor, and CCF senior member; his main research interests include big data analytics, machine learning, multi-agent systems, and video/image processing.

Corresponding author: zhtt.go@gmail.com

Fund projects:

National Key Research and Development Program of China (2017YFB0702600, 2017YFB0702601); National Natural Science Foundation of China (61906165, 61432008, 61872313); Natural Science Foundation of the Jiangsu Higher Education Institutions of China (19KJB520064)



    Abstract:

    The objective of streaming data classification is to incrementally learn a decision function that maps input variables to a class label variable from continuously arriving streaming data, so as to accurately classify test data that may arrive at any time. The online learning paradigm, as an incremental machine learning technique, is an effective tool for streaming data classification. This paper surveys, from the perspective of online learning, the state of the art in streaming data classification algorithms. Specifically, the basic framework and the performance evaluation methodology of online learning are first introduced. Then, the latest developments in online learning algorithms are reviewed: algorithms for general streaming data, algorithms for alleviating the "curse of dimensionality" problem in high-dimensional streaming data, and algorithms for handling the "concept drift" problem in evolving streaming data. Finally, open challenges and promising research directions for the classification of high-dimensional and evolving streaming data are discussed.
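
    To make the surveyed setting concrete, here is a minimal, illustrative Python sketch (not code from the paper) of the online learning protocol described above: at each round the learner first predicts on the incoming example, then receives the true label, suffers the 0-1 loss, and updates its model, here with the classical perceptron rule. The running mistake rate it prints is the test-then-train (prequential) error commonly used to evaluate streaming classifiers. The synthetic stream generator and all identifiers are assumptions made purely for this illustration.

        import numpy as np

        def online_perceptron(stream, n_features):
            """Test-then-train loop: predict on each example before learning from it."""
            w = np.zeros(n_features)                      # linear model, initialized to zero
            mistakes = 0
            for t, (x, y) in enumerate(stream, start=1):  # labels y are in {-1, +1}
                y_hat = 1 if w @ x >= 0 else -1           # step 1: predict on the new example
                mistakes += int(y_hat != y)               # step 2: suffer the 0-1 loss
                if y_hat != y:                            # step 3: update only on a mistake
                    w += y * x                            # classical perceptron update
                if t % 1000 == 0:
                    print(f"round {t}: cumulative mistake rate = {mistakes / t:.3f}")
            return w

        # A synthetic, linearly separable stream, purely for illustration.
        rng = np.random.default_rng(0)
        true_w = rng.normal(size=20)
        stream = ((x, 1 if true_w @ x >= 0 else -1) for x in rng.normal(size=(5000, 20)))
        online_perceptron(stream, n_features=20)

    Swapping the update rule (for example, for a passive-aggressive, confidence-weighted, or sparsity-inducing update) yields the more refined first- and second-order online algorithms that the survey reviews.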

Cite this article:

Zhai TT, Gao Y, Zhu JW. Survey of online learning algorithms for streaming data classification. Ruan Jian Xue Bao/Journal of Software, 2020,31(4):912-931 (in Chinese).
Article history:
  • Received: 2019-02-22
  • Revised: 2019-07-11
  • Published online: 2020-01-14
  • Publication date: 2020-04-06