Probabilistic algorithm for mining frequent sequences
Julija Pragarauskaitė
Matematikos ir informatikos institutas
Gintautas Dzemyda
Matematikos ir informatikos institutas
Published 2010-12-21
https://doi.org/10.15388/LMR.2010.57

Keywords

frequent sequence mining
probabilistic algorithm
data mining

How to Cite

Pragarauskaitė J. and Dzemyda G. (2010) “Probabilistic algorithm for mining frequent sequences”, Lietuvos matematikos rinkinys, 51(proc. LMS), pp. 313–318. doi: 10.15388/LMR.2010.57.

Abstract

Frequent sequence mining in large-volume databases is important in many areas, e.g., for biological, climate, and financial databases. Exact frequent sequence mining algorithms usually read the whole database many times, so for a sufficiently large database the mining either takes a very long time or requires a supercomputer. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database, makes decisions about the initial database according to the results of that analysis, and runs much faster than the exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods.
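
The sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes that the database is a list of records, that a pattern occurs in a record when it appears there as a contiguous run, and that the sampling error is bounded with a normal-approximation confidence interval. The names `contains`, `estimate_support`, and `classify` are hypothetical.

```python
import math
import random


def contains(sequence, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `sequence`."""
    n, m = len(sequence), len(pattern)
    return any(sequence[i:i + m] == pattern for i in range(n - m + 1))


def estimate_support(database, pattern, sample_size, rng, z=1.96):
    """Estimate the fraction of records containing `pattern` from a random sample.

    Returns (p_hat, margin), where margin is the half-width of a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    Assumption: this interval stands in for the paper's error estimate.
    """
    sample = rng.sample(database, sample_size)  # sampling without replacement
    hits = sum(contains(record, pattern) for record in sample)
    p_hat = hits / sample_size
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / sample_size)
    return p_hat, margin


def classify(p_hat, margin, min_support):
    """Decide about the full database using only the sample estimate."""
    if p_hat - margin >= min_support:
        return "frequent"
    if p_hat + margin < min_support:
        return "infrequent"
    return "uncertain"  # the interval straddles the threshold


if __name__ == "__main__":
    # Toy database (illustrative data only): 80% of records contain "AB".
    database = ["ABAB"] * 8000 + ["CCCC"] * 2000
    rng = random.Random(0)
    p_hat, margin = estimate_support(database, "AB", 500, rng)
    print(classify(p_hat, margin, min_support=0.5))
```

Only `sample_size` records are scanned instead of the whole database, which is where the speed-up comes from. The price is the "uncertain" outcome and a nonzero chance of a wrong decision, which shrinks as the sample grows.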

This work is licensed under a Creative Commons Attribution 4.0 International License.
