Text Classification Model Based On Fuzzy-Rough Sets Theory

Posted on: 2006-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: X F Fu
Full Text: PDF
GTID: 2168360152482877
Subject: Computer software and theory
Abstract/Summary:
One goal of information processing is to extract the most valuable information from very large volumes of text data. In automated text classification, polysemy and synonymy prevent many texts from being classified with certainty, so the boundaries of the classes are rough. In addition, the boundaries of many classes are vague because classes overlap, so some text samples do not belong entirely to any single class. Both cases lead to classification errors.

Based on this analysis, this paper applies fuzzy-rough sets theory to text classification. Fuzzy-rough sets theory combines fuzzy sets theory and rough sets theory, drawing on both to handle uncertain information. Rough sets theory captures the indiscernibility caused by insufficient features, that is, the rough uncertainty that arises from the granularity of knowledge. Fuzzy sets theory captures the fuzzy uncertainty caused by overlapping classes. Because each theory handles a different kind of uncertainty, it is advisable to combine them when dealing with incomplete knowledge.

The main contributions of this paper are as follows:
1. A new model for automated text classification, the fuzzy-rough sets model, is proposed.
2. A neighbor-space is used to obtain a suitable value of k for k-NN.

The model gives a more suitable semantic interpretation of text classification. Compared with existing models, it improves the precision and recall of text classification without increasing computational complexity. However, some problems remain unresolved. One is how to set the adjustment parameter; another is how to obtain a more effective distance for the neighbor-space. These problems are left for future research.
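The abstract gives no formulas, so the following is only a minimal sketch of how the two ideas could fit together, not the model from the thesis. It assumes cosine similarity over non-negative term vectors as the fuzzy relation, a hypothetical similarity threshold `radius` standing in for the neighbor-space (so k is simply the number of training texts inside it), the commonly used min/max forms of the fuzzy-rough lower and upper approximations, and an averaged score. All function and parameter names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length term vectors, clamped to [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return max(0.0, min(1.0, dot / (norm_a * norm_b)))


def classify(query, train, radius=0.5):
    """Classify `query` against `train`, a list of (vector, label) pairs.

    `radius` is an assumed similarity threshold: every training text at
    least this similar to the query counts as a neighbour, so k is not
    fixed in advance. Returns (best_label, scores, k).
    """
    sims = [(cosine_similarity(query, vec), label) for vec, label in train]
    neighbours = [(s, label) for s, label in sims if s >= radius]
    if not neighbours:
        # Fall back to the single most similar text so k is never zero.
        neighbours = [max(sims, key=lambda pair: pair[0])]

    labels = sorted({label for _, label in train})
    scores = {}
    for c in labels:
        # Lower approximation: degree to which all neighbours support c.
        lower = min(max(1.0 - s, 1.0 if label == c else 0.0)
                    for s, label in neighbours)
        # Upper approximation: degree to which some neighbour supports c.
        upper = max(min(s, 1.0 if label == c else 0.0)
                    for s, label in neighbours)
        scores[c] = (lower + upper) / 2.0

    best = max(labels, key=lambda c: scores[c])
    return best, scores, len(neighbours)


if __name__ == "__main__":
    # Toy term-weight vectors; the data is made up for illustration.
    train = [
        ([1.0, 0.0, 0.0], "sports"),
        ([0.9, 0.1, 0.0], "sports"),
        ([0.0, 1.0, 0.0], "finance"),
        ([0.0, 0.9, 0.2], "finance"),
    ]
    label, scores, k = classify([1.0, 0.2, 0.0], train)
    print(label, k, scores)
```

In this sketch a larger `radius` shrinks the neighbourhood and a smaller one enlarges it, which is one simple way a neighbourhood can determine k instead of a fixed constant.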
Keywords/Search Tags: automated text classification, Fuzzy-rough Sets, Fuzzy-rough, membership function, neighbor-space, k-NN