Abstract: Time series shapelets are subsequences of time series that are maximally representative of a class. One of the most promising approaches to time series classification separates shapelet discovery from the classification algorithm by means of a shapelet transformation. The main advantages of this technique are that it optimizes shapelet selection and allows different classification strategies to be applied. The method also has important limitations. First, although the number of shapelets selected for the transformation directly affects the classification result, it is hard to decide how many shapelets yield the best data for classification. Second, previous algorithms often select shapelets that are similar to one another. This work addresses the latter problem by introducing an efficient and effective shapelet pruning technique that filters out similar shapelets and, at the same time, reduces the number of candidate shapelets. On this basis, a shapelet coverage method is proposed for selecting the number of shapelets for a given dataset. Experiments on the classic benchmark datasets for time series classification demonstrate that the proposed transformation can improve classification accuracy.
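As a point of orientation, the generic shapelet transformation referred to above can be sketched as follows: each time series is mapped to a feature vector whose k-th entry is the smallest distance between the k-th shapelet and any equal-length window of the series. This is a minimal illustration and not the method proposed in this work. The function names `subsequence_distance` and `shapelet_transform` are hypothetical, plain Euclidean distance is assumed, and the normalization, pruning, and coverage steps are omitted.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    window of the series that has the same length as the shapelet."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet over every possible start position.
    for start in range(len(series) - m + 1):
        squared = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, math.sqrt(squared))
    return best


def shapelet_transform(dataset, shapelets):
    """Map each series to a vector of distances, one entry per shapelet.
    (Assumption: unnormalized Euclidean distance, for illustration only.)"""
    return [[subsequence_distance(series, s) for s in shapelets] for series in dataset]


if __name__ == "__main__":
    toy_dataset = [[0, 1, 2, 3], [5, 5, 5, 5]]   # made-up toy series
    toy_shapelets = [[1, 2], [5, 5]]             # made-up toy shapelets
    for row in shapelet_transform(toy_dataset, toy_shapelets):
        print(row)
```

The transformed vectors can then be passed to any standard classifier, which is what makes the choice of classification strategy independent of shapelet discovery.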