
Improvement And Research Of Attribute Reduction Algorithm Based On Information Entropy

Posted on: 2015-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: N Jiang
Full Text: PDF
GTID: 2428330488999788
Subject: Software engineering
Abstract/Summary:
Attribute reduction based on information entropy combines rough set theory with information theory. It has three representative algorithms. The Mutual Information-Based Algorithm for Reduction of Knowledge (MIBARK) uses the change in mutual information caused by adding an attribute to the decision table as heuristic information. The Conditional Entropy-Based Algorithm for Reduction of Knowledge with Computing Core (CEBARKCC) starts from the core attributes and repeatedly adds the non-core condition attribute with the smallest conditional entropy to the core attribute set. The Conditional Entropy-Based Algorithm for Reduction of Knowledge Not Computing Core (CEBARKNC) starts from the full set of condition attributes and repeatedly deletes the attribute with the greatest conditional entropy with respect to the decision attribute.

Because CEBARKCC is built on the core, it first fixes the attributes that must be preserved, then adds attributes in order of smallest conditional entropy, and decides whether to keep or reject an attribute by checking whether the conditional entropies are equal. Its calculation steps are simpler than those of MIBARK and it is more rigorous than CEBARKNC, so CEBARKCC has greater value for applied research. However, CEBARKCC uses conditional entropy alone as the metric for adding or removing attributes, so its execution efficiency is low and its reduction results are unsatisfactory. Improving CEBARKCC is therefore necessary. The main work of this thesis is as follows:

1. To address the shortcomings of CEBARKCC, this thesis introduces the conditional entropy difference: the size of the change in conditional entropy caused by adding an attribute is used as the measure of attribute importance. On this basis it proposes the improved algorithm CEBARKCCI. The improved algorithm gives a more efficient reduction process, and comparative analysis of experimental data shows that it outperforms CEBARKCC in time efficiency.

2. To address the problem that CEBARKCCI increases both the uncertainty of attribute reduction and the number of reductions, this thesis proposes a new measure of attribute importance and constructs a new attribute reduction algorithm based on the conditional entropy gain ratio. Building on the original attribute importance, it takes the conditional entropy between the candidate attribute and the decision attribute as the numerator, giving attribute importance a fractional form. This improves and standardizes the ranking of attribute importance and eliminates the value confusion problem of the original algorithm, making the method generally applicable. Attribute reduction with the new algorithm yields fewer attributes than traditional methods. Experiments on UCI datasets and CTR data tables verify that the algorithm produces smaller reduction results.
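
The conditional-entropy heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it only shows a forward selection in the spirit of CEBARKCC, which starts from the core and adds the attribute giving the smallest conditional entropy. The function names, the one-dict-per-row table layout, the `EPS` tolerance, and the toy table are assumptions made for illustration. The CEBARKCCI difference measure and the gain-ratio measure are not reproduced, because the abstract does not state their formulas.

```python
from collections import Counter, defaultdict
from math import log2

EPS = 1e-12  # assumed tolerance for comparing floating-point entropies


def conditional_entropy(rows, cond_attrs, decision):
    """H(D | B) in bits, where B = cond_attrs partitions the rows into blocks."""
    n = len(rows)
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        blocks[key][row[decision]] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            # p(block, d) * log2 p(d | block)
            h -= (c / n) * log2(c / size)
    return h


def core_attributes(rows, attrs, decision):
    """Attributes whose removal raises H(D | C) above its full-set value."""
    target = conditional_entropy(rows, attrs, decision)
    core = []
    for a in attrs:
        rest = [x for x in attrs if x != a]
        if conditional_entropy(rows, rest, decision) > target + EPS:
            core.append(a)
    return core


def greedy_reduct(rows, attrs, decision):
    """Start from the core; repeatedly add the attribute with the smallest H(D | B + [a])."""
    target = conditional_entropy(rows, attrs, decision)
    reduct = core_attributes(rows, attrs, decision)
    while conditional_entropy(rows, reduct, decision) > target + EPS:
        candidates = [a for a in attrs if a not in reduct]
        best = min(candidates,
                   key=lambda a: conditional_entropy(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct


# Hypothetical toy decision table: d copies a, b duplicates a, c is noise.
toy_table = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
print(greedy_reduct(toy_table, ["a", "b", "c"], "d"))  # ['a']
```

In the toy table no single attribute is indispensable, so the core is empty and the greedy step does all the work. Ties between `a` and `b` are broken by list order, which is one simple case where a finer importance measure, such as those the thesis proposes, could change the outcome.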
Keywords/Search Tags: Rough set, Information entropy, Conditional entropy, Attribute reduction, Attribute importance