The Research On Key Problems Of Sequential Patterns Mining

Posted on: 2009-09-19, Degree: Master, Type: Thesis
Country: China, Candidate: Y Ji
GTID: 2178360245471701, Subject: Computer application technology
Abstract/Summary:
Knowledge discovery in databases (KDD) is a rapidly emerging research field that draws on artificial intelligence and database systems, and the discovery of sequential patterns is an important topic within it. Sequential pattern mining algorithms need to scan the database multiple times to discover all frequent sequences, so their time performance is poor. In practical use, setting min_support is also a sensitive task. This dissertation addresses these problems. Its main contributions are as follows:
(1) A sequential pattern mining algorithm, NewGSP, based on a dynamic min_sup is proposed. Traditional sequential pattern mining requires the user to set min_sup manually, which imposes some limitations. In this dissertation, the dynamic min_sup is set by a statistical method. The algorithm has better time and space performance than traditional sequential pattern mining algorithms using packed candidate sequences.
(2) A new graph-based closed sequential pattern mining algorithm, G-CloSpan, is proposed. Mining closed sequences provides the same information as mining all frequent sequences, while the result is more compact and the mining more effective. F2SG is introduced to represent the sequential information relevant to the mining task, and it is built by scanning the transaction database only once. On this basis, the graph representation of the database can fully exploit the ordering of items during mining, thus improving the efficiency of generating frequent closed sequences. Theoretical analysis and experiments show that the algorithm has better time and space performance than traditional closed sequential pattern mining algorithms.
(3) Based on the research above, an experimental system for mining sequential patterns is implemented, and the algorithms are validated both experimentally and theoretically.
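The notions the abstract relies on (support of a sequence, a data-derived min_sup, and closed patterns) can be illustrated with a minimal Python sketch. It is not NewGSP or G-CloSpan: the abstract does not say which statistic sets the dynamic min_sup, so the mean support of single items is used here purely as an assumed stand-in, and sequences are simplified to tuples of single items rather than itemsets. All function names are hypothetical, and no graph structure such as F2SG is built.

```python
from statistics import mean


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def dynamic_min_sup(database):
    """ASSUMPTION: use the mean support of single items as the threshold.
    The thesis only says a 'statistics method' is used; this is a placeholder."""
    items = {item for seq in database for item in seq}
    return mean(support((i,), database) for i in items)


def frequent_patterns(database, min_sup):
    """Level-wise (GSP-style) growth: extend frequent patterns by one item.
    `min_sup` must be positive so that the loop terminates."""
    items = sorted({item for seq in database for item in seq})
    level = [(i,) for i in items if support((i,), database) >= min_sup]
    frequent_items = [p[0] for p in level]
    result = {}
    while level:
        for p in level:
            result[p] = support(p, database)
        candidates = [p + (i,) for p in level for i in frequent_items]
        level = [c for c in candidates if support(c, database) >= min_sup]
    return result


def closed_patterns(patterns):
    """Keep patterns that have no longer super-sequence with the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    }


# Toy sequence database (illustrative data, not from the thesis).
db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c"), ("d",)]
threshold = dynamic_min_sup(db)
frequent = frequent_patterns(db, threshold)
closed = closed_patterns(frequent)
```

On this toy database the closed set is smaller than the full frequent set, yet every frequent pattern and its support can be recovered from it, which is the compactness argument made in contribution (2). Note that each call to `support` rescans the whole database, which is the repeated-scan cost the abstract identifies as the main performance problem.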
Keywords/Search Tags: Data Mining, sequential pattern, statistics method, frequent sequence graph, closed sequential patterns