Journal: Informatica
Volume 24, Issue 1 (2013), pp. 87–102
Abstract
Frequent sequence mining is one of the main challenges in data mining, especially in large databases that consist of millions of records. Frequent sequence mining is important in a number of applications: medicine, finance, internet behavioural data, marketing data, etc. Exact frequent sequence mining methods make multiple passes over the database, which is time-consuming and expensive when the database is large. Approximate methods for frequent sequence mining are faster than exact methods because, instead of making multiple passes over the original database, they analyze a much smaller sample of the original database that is formed in a specific way. This paper presents the Markov Property Based Method (MPBM), an approximate method for mining frequent sequences based on kth-order Markov models, which makes only a few passes over the original database. The method has been implemented and evaluated using a real-world foreign exchange database and compared to exact and approximate frequent sequence mining algorithms.