Research On Knowledge Discovery Of Association Rules By Integrating Domain Knowledge

Posted on: 2020-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Zhou
Full Text: PDF
GTID: 2428330575971569
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of information technology, knowledge plays an increasingly important role in social and economic activities. However, people also face the dilemma of "data explosion and poor knowledge". As an important association rule mining method, the Apriori algorithm is widely used in knowledge discovery to uncover the implicit rule knowledge in data, and it has achieved good results. In terms of performance, however, the Apriori algorithm scans the database many times and generates too many candidate itemsets, which makes it inefficient to execute. In application, the algorithm can derive a very large number of rules, which greatly reduces the usability of the rule knowledge. With the explosive growth of the data to be processed, the low efficiency of the algorithm and the problem of "knowledge overload" are becoming more and more prominent.

Domain knowledge is well known to have an important guiding and constraining effect on the discovery and refinement of rule knowledge. At the same time, high-quality association rule knowledge discovery should include not only the mining of initial rules but also their refinement through secondary mining. In view of this, this thesis studies the knowledge discovery of association rules integrated with domain knowledge and proposes a new improved Apriori algorithm. To address the rule overload and the overly technology-oriented nature of conventional knowledge discovery methods, the knowledge discovery of association rules is divided into two stages, and a rule reprocessing method that integrates domain knowledge is designed and proposed.

Firstly, based on an analysis of the Apriori algorithm and the research on its improvement, this thesis proposes an improved matrix-based Apriori algorithm that combines optimization strategies such as matrix partitioning, matrix transformation, itemset compression and support counting. The algorithm can effectively reduce the matrix size, shrink the search space and improve operating efficiency. Then, to address the problems of the traditional Apriori algorithm's rule results, such as rule overload, low value and disorganization, this thesis proposes a rule refinement method based on clustering that integrates domain knowledge. It analyzes the deficiencies of the DBSCAN algorithm when clustering rules with domain knowledge and proposes an improved DBSCAN algorithm based on the k-neighborhood. This algorithm organizes and classifies the rule results so that users can apply the rules more easily. For the outliers that remain after clustering, the LOF algorithm is used to calculate the degree to which each point is an outlier. The detailed features of local data are examined in order to identify outliers more accurately, and the rule knowledge represented by the outliers is taken as an output of the rule mining results. Finally, the validity of the proposed algorithms is verified by experimental analysis.
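The abstract does not give the details of the improved algorithm, so the following is only a minimal sketch of the general idea behind matrix-based Apriori: transactions are encoded once as a Boolean matrix, and support is then counted from that matrix instead of by rescanning the database. The function names (`build_boolean_matrix`, `support_count`, `frequent_itemsets`) and the sample transactions are hypothetical, and the matrix partitioning and transformation steps mentioned in the abstract are not reproduced here.

```python
from itertools import combinations


def build_boolean_matrix(transactions):
    """Encode transactions as one 0/1 column per item (a single database pass)."""
    items = sorted({item for t in transactions for item in t})
    columns = {
        item: [1 if item in t else 0 for t in transactions] for item in items
    }
    return items, columns


def support_count(itemset, columns):
    """Count the rows in which every item of the itemset is present."""
    rows = zip(*(columns[item] for item in itemset))
    return sum(1 for row in rows if all(row))


def frequent_itemsets(transactions, min_count):
    """Level-wise Apriori search that counts support on the Boolean matrix."""
    items, columns = build_boolean_matrix(transactions)
    # Level 1: keep only the items that meet the minimum support count.
    current = [(i,) for i in items if support_count((i,), columns) >= min_count]
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support_count(itemset, columns)
        # Only items that still occur in a frequent k-itemset can be extended.
        survivors = sorted({i for itemset in current for i in itemset})
        previous = set(current)
        k += 1
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        candidates = [
            c for c in combinations(survivors, k)
            if all(sub in previous for sub in combinations(c, k - 1))
        ]
        current = [c for c in candidates if support_count(c, columns) >= min_count]
    return result


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]
frequent = frequent_itemsets(transactions, min_count=3)
```

In this sketch the database is read only when the matrix is built. Each later level works on the stored columns, and items that no longer appear in any frequent itemset are dropped from candidate generation, which is one simple way to shrink the search space.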
Keywords/Search Tags: Domain Knowledge, Knowledge Discovery, Association Rules, Apriori Algorithm, Rule Clustering