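Survey on Semi-supervised Classification of Data Streams with Concept Drifts

About the authors:

Wen Yimin (文益民) (1969-), male, Ph.D., professor, doctoral supervisor, CCF distinguished member. Research interests: machine learning, data stream classification, media analysis and data mining.
Yi Xinhe (易新河) (1969-), female, assistant research fellow. Research interests: media data mining, educational data analysis.
Liu Shuai (刘帅) (1994-), male, master's degree. Research interests: machine learning, data mining, recommender systems.
Liu Changjie (刘长杰) (1994-), male, master's degree. Research interests: machine learning, semi-supervised learning, data mining.
Miao Yuqing (缪裕青) (1966-), female, Ph.D., professor, CCF professional member. Research interests: machine learning, data mining, sentiment analysis.

Corresponding author:

Wen Yimin, E-mail: ymwen@guet.edu.cn

Funding:

Guangxi Natural Science Foundation (2018GXNSFDA138006); National Natural Science Foundation of China (61866007); Humanities and Social Sciences Research Project of the Ministry of Education (17JDGC022); Guangxi Key Laboratory of Image and Graphic Intelligent Processing projects (GIIP2005, GIIP201505, GIIP201706)
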
Abstract:

In open environments, data streams are generated at high speed, are unbounded in volume, and are subject to concept drift. For data stream classification, producing large amounts of training data through manual annotation is expensive and impractical. A data stream that contains only a few labeled samples among many unlabeled ones, and that also exhibits concept drift, poses a great challenge to machine learning. However, existing research focuses mainly on supervised classification of data streams, and semi-supervised classification of data streams with concept drift has not yet received enough attention. Therefore, based on a comprehensive collection of work on semi-supervised classification of data streams, this survey categorizes the existing algorithms for semi-supervised classification of data streams with concept drift from several perspectives. Using the type of classifier adopted by each algorithm as the organizing thread, it describes and summarizes many existing algorithms, including the concept drift detection methods they employ. On several widely used real-world and synthetic datasets, some representative algorithms are compared and analyzed from multiple aspects. Finally, the survey raises several issues in semi-supervised classification of data streams with concept drift that deserve further investigation. The experimental results show that the classification accuracy of semi-supervised data stream classification algorithms depends on many factors, but most strongly on changes in the data distribution. This survey should help interested researchers get started quickly in the field of semi-supervised classification of data streams.
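
The problem setting described in the abstract can be illustrated with a small toy sketch. It is not any algorithm from the survey. It assumes a synthetic one-dimensional stream with two Gaussian classes whose means swap abruptly at a chosen time step, a label rate of 10%, and a nearest-centroid classifier that self-trains on unlabeled samples. All names (`make_stream`, `SelfTrainingCentroids`, `prequential_accuracy`) and parameter values are hypothetical choices made for illustration.

```python
import random


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the two class means swap at step drift_at (abrupt drift)."""
    rng = random.Random(seed)
    for t in range(n):
        y = rng.randint(0, 1)
        means = (-2.0, 2.0) if t < drift_at else (2.0, -2.0)
        yield rng.gauss(means[y], 1.0), y


class SelfTrainingCentroids:
    """Nearest-centroid classifier updated with true labels and pseudo-labels."""

    def __init__(self, alpha_labeled=0.3, alpha_unlabeled=0.02):
        self.alpha_labeled = alpha_labeled      # step size for labeled samples
        self.alpha_unlabeled = alpha_unlabeled  # smaller step for pseudo-labels
        self.centroids = {}                     # class label -> running centroid

    def predict(self, x):
        if not self.centroids:
            return 0  # arbitrary default before any labeled sample is seen
        return min(self.centroids, key=lambda c: abs(x - self.centroids[c]))

    def update(self, x, y=None):
        if y is None:
            if not self.centroids:
                return  # nothing to self-train on yet
            y, alpha = self.predict(x), self.alpha_unlabeled  # pseudo-label
        else:
            alpha = self.alpha_labeled
        if y not in self.centroids:
            self.centroids[y] = x
        else:
            # Exponential forgetting: move the centroid a fraction alpha toward x.
            self.centroids[y] += alpha * (x - self.centroids[y])


def prequential_accuracy(stream, label_rate=0.1, window=200, seed=1):
    """Test-then-train evaluation; returns the accuracy of each consecutive window."""
    rng = random.Random(seed)
    model = SelfTrainingCentroids()
    hits = []
    for x, y in stream:
        hits.append(int(model.predict(x) == y))  # test first
        labeled = rng.random() < label_rate      # only some samples carry labels
        model.update(x, y if labeled else None)  # then train
    chunks = [hits[i:i + window] for i in range(0, len(hits), window)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


if __name__ == "__main__":
    accuracies = prequential_accuracy(make_stream(2000, drift_at=1000))
    for i, acc in enumerate(accuracies):
        print(f"window {i}: accuracy {acc:.3f}")
```

In this sketch, pseudo-labels keep reinforcing the old concept after the drift, so the model recovers only as the scarce true labels pull the centroids toward the new distribution. This is one simple way to see why accuracy in this setting is sensitive to changes in the data distribution.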

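Cite this article

Wen YM, Liu S, Miao YQ, Yi XH, Liu CJ. Survey on semi-supervised classification of data streams with concept drifts. Journal of Software (软件学报), 2022, 33(4): 1287-1314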

History
  • Received: 2021-05-30
  • Last revised: 2021-07-16
  • Published online: 2021-10-26
  • Publication date: 2022-04-06