Discretization Uncertainty Evaluation And Its Applied Research In Association Rules Mining

Posted on: 2019-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: G S Feng
Full Text: PDF
GTID: 2428330545485858
Subject: Cartography and Geographic Information System
Abstract/Summary:
Mapping attribute values into discrete symbols is a key step in discovering association rules from continuous attributes. Choosing the optimal discretization of continuous attributes is itself an uncertain problem. Discretization uncertainty can propagate and accumulate during association rule mining, which directly affects the usability and applicability of the mined rules. This research proposes a novel index for evaluating discretization uncertainty and develops a quantitative method for analyzing how that uncertainty propagates through association rule mining.

First, to address the limited descriptive accuracy and computational efficiency of existing discretization evaluation indices, this work proposes a discretization uncertainty index based on individuals. The method uses the standard score as a general similarity measure within and between intervals, and evaluates discretization reliability according to the relative position of individuals in each interval. Experiments show that the new index is consistent with commonly used metrics. While preserving the validity of the discretization evaluation, the proposed method offers higher descriptive accuracy and computational efficiency than existing approaches, and is better suited to processing massive data sets and detecting special distributions.

Second, based on the proposed discretization uncertainty index, the maximum class variance/minimum class variance is introduced to divide each discretization further into uncertain and reliable intervals. The propagation of discretization uncertainty in association rule mining is then evaluated. The formulas of the rule evaluation indices (namely Support, Confidence and Lift) are redefined to account for the reliable offset that discretization uncertainty introduces into their calculation. This approach yields an accurate and complete evaluation of the association rules.

Finally, discretization uncertainty evaluation and its propagation in rule mining are applied to selecting the combination of discretization methods used in mining, so that the effect of discretization uncertainty on the resulting association rules is minimized. To verify its effectiveness, the method was applied to a study of traffic effects on grain production in the Jianghan Plain, Hubei Province. The results show that the evaluation method effectively detects the uncertainty distribution of a real continuous data set under different discretization methods, and that the discretization uncertainty in association rule mining is quantifiable. Consequently, the combination of discretization methods with the least influence on the reliability of the association rules was identified.
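For readers unfamiliar with the three rule evaluation indices named in the abstract, the following is a minimal sketch of their standard textbook definitions in Python. It does not reproduce the uncertainty-aware formulas redefined in the thesis, which the abstract does not specify. The function names, the item labels (`road=high`, `yield=high`) and the four sample records are hypothetical and serve only as illustration.

```python
from typing import Iterable, List, Set


def support(records: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for record in records if items <= record)
    return hits / len(records)


def confidence(records: List[Set[str]], antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Support(A and C) / Support(A); 0.0 if the antecedent never occurs."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(records, a)
    if s_a == 0:
        return 0.0
    return support(records, a | c) / s_a


def lift(records: List[Set[str]], antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence(A -> C) / Support(C); 0.0 if the consequent never occurs."""
    s_c = support(records, consequent)
    if s_c == 0:
        return 0.0
    return confidence(records, antecedent, consequent) / s_c


# Hypothetical records: each continuous attribute has already been
# discretized into a labeled interval (assumed labels, not thesis data).
records = [
    {"road=high", "yield=high"},
    {"road=high", "yield=high"},
    {"road=high", "yield=low"},
    {"road=low", "yield=low"},
]

rule_support = support(records, {"road=high", "yield=high"})
rule_confidence = confidence(records, {"road=high"}, {"yield=high"})
rule_lift = lift(records, {"road=high"}, {"yield=high"})
```

Because every index is a ratio of record counts per interval, a record that falls near an interval boundary and would change label under a different discretization shifts all three values, which is the propagation effect the abstract describes.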
Keywords/Search Tags:Continuous attributes, Discretization, Uncertainty, Association rules, Reliable offset