The Optimization Research Of ID3 And Application In Component Library

Posted on: 2012-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: D Li
Full Text: PDF
GTID: 2218330368481868
Subject: Computer application technology
Abstract/Summary:
Every sector accumulates massive amounts of data as information technology develops rapidly and the ways of obtaining data diversify. Faced with this expanding sea of data, how to exploit it and uncover the information and knowledge behind it has become a widely discussed problem in business. Driven by this practical demand, data mining emerged and has developed rapidly in many fields. Classification is an important data mining task, and decision trees are a standard method for it. ID3 (Iterative Dichotomiser 3) is one of the most frequently used decision tree algorithms and is widely applied in machine learning because of its many advantages. In practice, however, ID3 shows a number of defects. This thesis studies those defects and existing improved algorithms in depth, and proposes an optimization scheme that simplifies the heuristic function of ID3 and overcomes its bias toward attributes with many values (the "variety bias" problem).
Firstly, the thesis derives an approximation of the information gain formula that removes the logarithm operation. The resulting simplified heuristic function applies equally to problems with any number of classes and is therefore general. The simplified ID3 selects as the test attribute the one that minimizes this simplified function, so no logarithm has to be evaluated when attributes are compared. This reduces the amount of computation and improves the execution efficiency of the algorithm.
Secondly, the thesis introduces an equilibrium function to overcome the variety bias. The equilibrium function balances the number of values an attribute takes against its information gain, which yields a new criterion for choosing attributes. Instance analysis and comparison of algorithms show that the modified ID3 selects more reasonable test attributes and that the rules extracted from the resulting decision tree better meet users' needs.
Lastly, the thesis applies the optimized ID3 algorithm to a component library through a worked instance. The component history table and the user feedback table are integrated into a new data set, which serves as input to the optimized ID3 algorithm. A decision tree is then built and rules are extracted from it. With these rules, component reusers can understand and select components more easily and spend less time deciding.
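The abstract does not give the formulas themselves, so the following Python sketch only illustrates the ideas it names and is not the thesis's derivation. It computes the standard ID3 information gain and, as an assumed stand-in for the log-free heuristic, uses the first-order approximation ln p ≈ -(1 - p), which turns the entropy term -p ln p into p(1 - p) and the entropy into 1 - Σ p² (the Gini impurity). The toy table, its attribute names (language, documented, component_id) and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def log_free_impurity(labels):
    """Assumed stand-in: 1 - sum(p^2), obtained from ln(p) ~ -(1 - p)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def partition(rows, labels, attr):
    """Group the class labels by the value each row takes on attr."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    return groups

def weighted_impurity(rows, labels, attr, impurity):
    """Impurity after splitting on attr, weighted by subset size."""
    n = len(labels)
    groups = partition(rows, labels, attr)
    return sum(len(g) / n * impurity(g) for g in groups.values())

def information_gain(rows, labels, attr):
    """Classic ID3 criterion: entropy before the split minus entropy after."""
    return entropy(labels) - weighted_impurity(rows, labels, attr, entropy)

def best_by_gain(rows, labels, attrs):
    """Standard ID3 choice: the attribute with the largest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

def best_log_free(rows, labels, attrs):
    """Log-free choice: the attribute with the smallest approximate impurity."""
    return min(attrs, key=lambda a: weighted_impurity(rows, labels, a, log_free_impurity))

# Hypothetical component-reuse records (not data from the thesis).
rows = [
    {"language": "java", "documented": "yes", "component_id": 1},
    {"language": "java", "documented": "no", "component_id": 2},
    {"language": "java", "documented": "no", "component_id": 3},
    {"language": "cpp", "documented": "yes", "component_id": 4},
    {"language": "cpp", "documented": "no", "component_id": 5},
    {"language": "cpp", "documented": "no", "component_id": 6},
]
labels = ["reuse", "reuse", "skip", "reuse", "skip", "skip"]

meaningful = ["language", "documented"]
print(best_by_gain(rows, labels, meaningful))
print(best_log_free(rows, labels, meaningful))
# A unique identifier splits every record into its own subset, so its
# information gain is maximal even though it cannot generalize.
print(best_by_gain(rows, labels, meaningful + ["component_id"]))
```

On this table both criteria agree on the meaningful attributes, while adding the identifier column shows the bias toward many-valued attributes that the equilibrium function is meant to counter. The abstract does not state the form of that function; a well-known remedy of the same kind is the split-information normalization used in C4.5's gain ratio.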
Keywords/Search Tags: ID3 algorithm, Equilibrium function, ID3 optimization algorithm, Component library