Abstract:There is a huge wealth of sequence data available in real-world applications. The task of sequential pattern mining serves to mine important patterns from the sequence data. Given a sequence S, a certain threshold, and gap constraints, this paper aims to discover frequent patterns whose supports in S are no less than the given threshold value. There are flexible wildcards in pattern P, and the number of the wildcards between any two successive elements of P fulfills the user-specified gap constraints. The study designs an efficient mining algorithm: One-Off Mining, whose mining process satisfies the One-Off condition under which each character in the given sequence can be used at most once in all occurrences of a pattern. Experiments on DNA sequences show that this method performs better in time and completeness than the related sequential pattern mining algorithms.