[关键词]
[摘要]
基于Shapelet的时间序列分类算法具有可解释性,且分类准确率高、分类速度快.在这些算法中,Shapelet学习算法不依赖于单一分类器,能够学习出不在原始时间序列中的Shapelet,可以取得较高的分类准确率,同时还可以保证Shapelet发现和分类器构建同时完成;但如果产生的Shapelet过多,会增加依赖参数,导致训练时间太长,分类速度低,动态更新困难,且相似重复的Shapelet会降低分类的可解释性.提出一种选择性提取方法,用于更精准地选择Shapelet候选集,并改变学习方法以加速Shapelet学习过程;方法中提出了两个优化策略,通过对原始训练集采用时间序列聚类,可以得到原始时间序列中没有的Shapelet,同时在选择性提取算法中加入投票机制,以解决产生Shapelet过多的问题.实验表明,该算法在保持较高准确率的同时,可以显著地提高训练速度.
[Key word]
[Abstract]
The time series classification algorithm based on Shapelet has the characteristics of interpretability, high classifica-tion accuracy and fast classification speed. Among these Shapelet-based algorithms, learning Shapelet algorithm does not rely on a single classifier, and Shapelet that is not in the original time series can be learned, which can achieve a high classification accuracy and ensure that Shapelet discovery and classifier construction are completed at the same time. However, if too many Shapelets are generated, it will increase the dependent parameters, resulting in too long training time, low classification speed, and difficult dynamic updates. And similar redundancy Shapelets will reduce the interpretability of the classification. This study proposes a new selective extraction algorithm to select Shapelet candidate set and change the learning method to accelerate the learning process of Shapelet and puts forward two optimization strategies. By using time series clustering for the original training set, Shapelets not in the original time series can be obtained. Meanwhile, a voting mechanism is added into the selective extraction algorithm to solve the problem of excessive Shapelet generation. Experiments show that the proposed algorithm can improve the training speed while maintaining high accuracy.
[中图分类号]
[基金项目]
国家自然科学基金(61872222);山东省重点研发计划(2018GGX101019);山东大学未来学者计划