
Research And Application Of Anomaly Detection And Correction For Knowledge Modeling

Posted on: 2019-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2428330599463852
Subject: Computer Science and Technology
Abstract/Summary:
Data quality is increasingly important in data mining because it affects the efficiency and practical effectiveness of intelligent models. To improve the recognition accuracy and generalization ability of such models, abnormal objects in the dataset used to build them must be detected and corrected. Starting from a formal description of the dataset and the decision tree, the thesis uses Gini-index gain as the bisection criterion for continuous condition attributes and builds a binary decision tree recursively until all objects in each leaf node share the same label. After pruning, information entropy is used to evaluate the label distribution of the objects in each leaf node, and abnormal labels are revised accordingly. In essence, constructing and pruning the decision tree divides and merges the continuous space of the condition attributes, guided by the Gini index and information entropy, in order to revise the objects' labels. The experiments and applications show that constructing and pruning a decision tree is an effective method for optimizing object labels.

Avoiding over-fitting and improving generalization ability is an important research topic for decision trees. Tree construction selects branch nodes according to the classification ability of the condition attributes, which is expressed by a measure function such as Tsallis entropy. Here Tsallis entropy is used to construct the complete decision tree. The optimal tree is obtained by varying the Tsallis entropy parameter and selecting the tree with the best accuracy; it is then pruned with a balance criterion for further optimization. Experiments show that the optimal Tsallis entropy expresses the classification ability of condition attributes better than other measures when constructing a decision tree. Combining the Tsallis-based decision tree algorithm with balance pruning gives the decision tree stronger generalization ability, and the method has been applied successfully to lithology recognition from seismic data and to data optimization.
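The two split measures named in the abstract can be illustrated with a short Python sketch. It is not the thesis's implementation: the function names `tsallis_entropy` and `best_bisection`, the midpoint choice of threshold, and the default parameter q = 2 are assumptions made for illustration only. It uses the standard Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1), which reduces to the Gini index at q = 2 and to Shannon entropy (in nats) as q approaches 1, so a single function covers both criteria.

```python
import math
from collections import Counter


def tsallis_entropy(labels, q):
    """Tsallis entropy of a list of class labels.

    q = 2 gives the Gini index; q -> 1 gives Shannon entropy in nats.
    """
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    if abs(q - 1.0) < 1e-12:
        # Shannon limit of the Tsallis formula
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def best_bisection(values, labels, q=2.0):
    """Find the binary split of one continuous attribute with the largest impurity gain.

    Returns (threshold, gain); threshold is None if no split improves on the parent.
    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption of this sketch).
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    n = len(pairs)
    parent = tsallis_entropy(labels, q)
    best_threshold, best_gain = None, 0.0
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold separates equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = (parent
                - len(left) / n * tsallis_entropy(left, q)
                - len(right) / n * tsallis_entropy(right, q))
        if gain > best_gain:
            best_threshold, best_gain = (low + high) / 2.0, gain
    return best_threshold, best_gain
```

Calling `best_bisection(values, labels, q=2.0)` scores splits by Gini-index gain, while other values of `q` score the same candidate splits with a different Tsallis parameter, which is the knob the abstract describes tuning by accuracy.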
Keywords/Search Tags: Anomaly detection, Decision tree, Tsallis entropy, Pruning algorithm, Data optimization