
The FP-Growth Algorithm With Positive And Negative Items And Its Application In Log Analysis

Posted on: 2020-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Pan
Full Text: PDF
GTID: 2438330575959329
Subject: Computer application technology
Abstract/Summary:
Association rule mining is one of the important research methods in data mining; its purpose is to find useful information in large databases. The useful information that data mining extracts from data has driven the development of science and technology. However, the sheer volume of data makes mining difficult: without an efficient mining algorithm, mining takes a great deal of time. In addition, most current work mines only the positive relationships between items in transactions and often ignores the negative correlations between them. In many real-world areas, mining positive association rules alone is not enough, and negative correlations in the data must also be taken into account in order to improve the descriptive power of association rules. In view of these problems, this thesis studies the following three points:

(1) The FP-Growth algorithm on a transaction database containing positive and negative items. Introducing negative items multiplies the amount of original data, so the number of items becomes too large, the branches of the constructed FP-tree become too long, the FP-tree occupies too much space, and mining efficiency drops. To solve this problem, this thesis improves the construction of the FP-tree: the tree is built by dynamic node insertion and all pointers are reversed, which yields a new type of FP-tree and reduces the cost of generating it. A maximal frequent pattern mining algorithm, MAX-IFPA, is then proposed, which mines all maximal frequent itemsets using the new FP-tree. Comparison experiments show that the proposed algorithm is more efficient than the other algorithms tested when mining frequent itemsets.

(2) An improvement of the FP-Growth algorithm based on multiple minimum supports. Setting a single support threshold too high may discard useful information because it occurs infrequently, which is contrary to the original intention of introducing negative items; setting a single support threshold too low may produce a large number of useless rules. To solve this problem, this thesis introduces the concept of minimum item support on top of the new FP-tree and proposes MS_IFPA, a maximal frequent pattern mining algorithm based on multiple minimum supports. By assigning different minimum support values to different data items, it finds the useful rules while effectively avoiding the generation of a large number of useless ones.

(3) Application of the improved algorithm to log analysis in a forensics system. Log file data is first collected by the client of the system and submitted to the server. On the server side, the system first uses the algorithm of this thesis to preprocess the submitted data, then performs evidence analysis and fusion on the preprocessed data, and finally generates a forensics report for the user to view.
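
As an illustration of points (1) and (2), the following minimal Python sketch shows how adding negative items enlarges every transaction and how per-item minimum supports change which itemsets count as frequent. It is not the thesis's MAX-IFPA or MS_IFPA and builds no FP-tree: it enumerates itemsets by brute force. The function names, the `not_` prefix for negative items, and the rule that an itemset's threshold is the lowest minimum support among its items are all assumptions made for this example.

```python
from itertools import combinations


def add_negative_items(transaction, universe):
    """Extend a transaction with a negative item 'not_x' for every absent item x.

    The 'not_' prefix is a naming convention assumed for this sketch.
    """
    present = set(transaction)
    return present | {f"not_{x}" for x in universe if x not in present}


def frequent_itemsets(transactions, item_min_support, default_min_support, max_len=3):
    """Brute-force frequent itemsets under multiple minimum supports.

    Assumption: an itemset's threshold is the smallest minimum support
    among its items; items without an entry use default_min_support.
    Only itemsets of up to max_len items are enumerated.
    """
    n = len(transactions)
    counts = {}
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1

    frequent = {}
    for itemset, count in counts.items():
        threshold = min(item_min_support.get(i, default_min_support) for i in itemset)
        if count / n >= threshold:
            frequent[itemset] = count / n
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


if __name__ == "__main__":
    universe = {"a", "b", "c"}
    raw = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
    augmented = [add_negative_items(t, universe) for t in raw]

    # Single threshold versus a lower threshold for the rare item "c".
    single = maximal_itemsets(frequent_itemsets(augmented, {}, 0.5))
    multi = maximal_itemsets(frequent_itemsets(augmented, {"c": 0.25}, 0.5))
    print(sorted(sorted(s) for s in single))
    print(sorted(sorted(s) for s in multi))
```

After augmentation every transaction contains as many items as the item universe, which is the data growth that point (1) describes. Lowering the minimum support for a rare item lets patterns that contain it survive without lowering the threshold for every other item, which is the motivation given in point (2).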
Keywords/Search Tags:Association rules, Positive and negative items, FP-Growth algorithm, Data mining