The Research Of Rough Set Attribute Reduction Algorithm In Numeral Character Recognition

Posted on: 2006-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: M Wu
Full Text: PDF
GTID: 2178360182956529
Subject: Control theory and control engineering
Abstract/Summary:
Attribute selection, which affects both the complexity and the performance of induction algorithms, is a central problem in machine learning. Rough set models are good at selecting the attributes relevant to a target concept and eliminating surplus ones that would degrade the predictive ability of induction methods. The crisp rough set was originally designed for objects with nominal attributes and has many limitations in handling objects with cardinal (continuous-valued) attributes. The focus of this thesis is reduction algorithms oriented to cardinal attributes and their application to numeral character recognition.

First, several basic algorithms of rough set methods are introduced together with an analysis of their time complexity, including efficient and incremental computation of the indiscernibility relation and an improved algorithm for the positive region. The fitness function of the genetic algorithm (GA) for the shortest reduct is analyzed. By defining a new fitness function and introducing a mutation restricted to specific valid bits, the efficiency of the GA is increased.

Then, for the reduction of data affected by interference or noise, the thesis proposes an attribute selection principle based on the attribute consistency of objects of the same class. Two attribute assessments that incorporate consistency are defined. One is the within-class consistency criterion Jc, which takes the form of an entropy and measures the randomness of the class-conditional probability of an attribute. The other is the synthetic discrimination ability of an attribute, W(·), based on the probabilities that attribute values are similar within the same class and dissimilar across different classes. Both make use of statistics and partly eliminate the sensitivity of rough sets to noise. Several search methods are designed, including a global optimization method for Jc, a filter method, and a heuristic method, SDAR.

Next, the limits of applying rough sets to cardinal variables are studied, and a cardinal attribute reduction algorithm, SDAR-SIMR, based on the similarity rough set model is designed. SDAR-SIMR is a parameterized reduction method for continuous attributes, in which the interval between two objects of different classes is controlled by a similarity threshold.

Finally, the above reduction methods are applied to numeral character recognition to select useful attributes. The experimental results show that the reduction algorithms incorporating consistency criteria produce fewer rules than the existing algorithms, and that both the rule match rate and the recognition rate are much higher.
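The indiscernibility relation and positive region mentioned in the abstract can be illustrated with the textbook crisp rough set definitions. The following is a minimal sketch only, not the improved or incremental algorithms of the thesis; the function names, the toy decision table, and the labels are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attrs):
    """Group object indices whose values agree on every attribute in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, labels, attrs):
    """Objects whose indiscernibility class contains a single decision label."""
    pos = set()
    for block in indiscernibility_classes(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            pos.update(block)
    return pos


def dependency(rows, labels, attrs):
    """Fraction of objects in the positive region (degree of dependency)."""
    return len(positive_region(rows, labels, attrs)) / len(rows)


# Hypothetical toy decision table: condition attributes a, b, c.
rows = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 0},
]
labels = ["x", "y", "x", "y", "x"]

full = dependency(rows, labels, ["a", "b", "c"])
only_b = dependency(rows, labels, ["b"])
only_a = dependency(rows, labels, ["a"])
```

In this toy table the single attribute `b` preserves the dependency degree of the full attribute set, so `{b}` is a reduct, while `a` alone leaves the positive region empty. The exact-equality test on attribute values is what makes the crisp model awkward for continuous attributes, which is the limitation a similarity-based model with a threshold is meant to address.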
Keywords/Search Tags: rough set, attribute reduction, feature selection, consistency, branch and bound, filter, genetic algorithm, similarity-based rough set model, character recognition