Knowledge Discovery Using Sequential Pattern Mining

Posted on: 2018-05-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: EL MEHDI SADEG ALI HAJ MELAD
GTID: 1368330551961139
Subject: Control Science and Engineering
Abstract/Summary:
Sequential pattern mining is among the most important tasks in data mining and is central to many knowledge discovery applications. However, running such applications requires substantial memory and time, particularly when dealing with very large databases. Choosing an unsuitable support threshold is the main factor that consumes additional memory and time. It may also yield a huge number of frequent patterns, which makes it hard to identify the useful ones and to compare results. The problem becomes more complicated when the sequences are long, as with stream sequences. To solve this problem, we redefine the problem of mining sequential patterns as the problem of mining the Top-K sequential patterns, where K is the number of sequential patterns to return and is set by the user. The current best algorithms for this problem are TSP and TKS. This study develops an effective sequential pattern model that overcomes the aforementioned problem by organizing the discovered patterns. The study has three aims: 1) To reduce memory consumption: a dynamic technique is proposed in which the minimum support is set dynamically rather than statically, and the algorithm combines pseudo-projection with BI-Directional Extension. 2) To reduce time consumption: the algorithm is supported by three search-space pruning functions that update the minimum support based on the patterns discovered so far. As a result, an algorithm more efficient than the standard algorithms is proposed. 3) To improve the accuracy of a multilinear regression energy system model by cleaning its training sets with the proposed algorithm. Extensive experiments on various real datasets of different sizes demonstrate that the proposed algorithm is more efficient than the other related algorithms.
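The core idea of the abstract, Top-K mining with a minimum support that rises as patterns are found, can be illustrated with a minimal Python sketch. This is not the dissertation's algorithm, nor TSP or TKS: it assumes sequences of single comparable items (no itemsets), omits BI-Directional Extension, and uses one simple pruning rule instead of the three pruning functions, whose details the abstract does not give. The function name `top_k_sequential_patterns` and its helpers are hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_sequential_patterns(sequences, k):
    """Return the k most frequent sequential patterns as (support, pattern) pairs.

    Illustrative sketch only: single-item sequences, PrefixSpan-style
    pseudo-projection, and a minimum support raised dynamically.
    """
    top = []      # min-heap of (support, pattern); top[0] is the weakest kept pattern
    min_sup = 1   # starts permissive, rises once k patterns are known

    def record(pattern, support):
        nonlocal min_sup
        heapq.heappush(top, (support, pattern))
        if len(top) > k:
            heapq.heappop(top)  # drop the weakest pattern
        if len(top) == k:
            # Anything below the k-th best support can no longer qualify.
            min_sup = max(min_sup, top[0][0])

    def grow(prefix, projected):
        # Pseudo-projection: each entry is (sequence index, start offset),
        # so no suffix of any sequence is ever copied.
        next_offsets = defaultdict(dict)  # item -> {sequence index: offset after first match}
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                next_offsets[seq[pos]].setdefault(sid, pos + 1)
        # Try the most frequent extensions first so min_sup rises quickly.
        for item, occ in sorted(next_offsets.items(), key=lambda kv: -len(kv[1])):
            support = len(occ)
            if support < min_sup:
                break  # remaining extensions are no more frequent; prune them all
            pattern = prefix + (item,)
            record(pattern, support)
            grow(pattern, list(occ.items()))

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return sorted(top, key=lambda entry: (-entry[0], entry[1]))
```

For example, calling `top_k_sequential_patterns([list("abcd"), list("acd"), list("abd"), list("bd")], 5)` explores the search space with `min_sup` climbing from 1 to 3, so extensions such as `('c',)` at the root are pruned without ever being projected. The user supplies only K, never a support threshold, which is the reformulation the abstract argues for.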
Keywords/Search Tags:Pattern recognition, Sequence mining, Top-K algorithm, Knowledge discovery, Data mining