The Study Of Applying Fuzzy Knowledge Process Theory To Chinese Text Categorization

Posted on: 2005-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: L Tan
GTID: 2168360122488696
Subject: Computer software and theory
Abstract/Summary:
Text categorization (TC) is an application technology that assigns one or more appropriate classes to a text, according to some strategy, based on the text's content. Depending on whether a fixed class system exists, it can be divided into supervised automatic classification and unsupervised automatic clustering. With the rapid growth of online text on the Internet, TC has become increasingly important in information processing. Automatic text categorization is widely applied across text processing and information retrieval, has become a key technique for processing and organizing large-scale text information, and drives information processing toward automation. The problem investigated in this thesis is automatic text classification with a known class system.

Because natural language is complex, and describing and understanding it involve great uncertainty and fuzziness, recognizing a text's classes is itself fuzzy. It is therefore reasonable to express these patterns using fuzzy theory. Experience shows that many practical classification tasks cannot give a precise result stating that an object belongs to one class; they can only give the probability that an object belongs to a class. We can therefore apply fuzzy knowledge processing technology to text categorization, and obtain more precise classification results by using appropriate fuzzy methods.

Set against the background of scientific literature information processing, this thesis investigates text categorization techniques based on fuzzy knowledge processing in depth, both in theory and in application. The main work is as follows:

1. We apply the fuzzy set theory of fuzzy mathematics to text classification, and give a systematic theoretical and practical study of fuzzy text classification.
2. Taking into account the features of scientific literature, we apply the degree of approaching and the fuzzy semantic relationship between fuzzy sets to Chinese text classification, and test and compare the two algorithms. The degree of approaching depends only on the membership degrees of the same element in different fuzzy sets, whereas the fuzzy semantic relationship depends on both the membership degrees of the elements of the two fuzzy sets and the semantic relationships between those elements. The fuzzy semantic relationship method therefore obtains better results, and handles well the problem of one text belonging to more than one class.
3. We analyze the results of fuzzy text classification and find that misclassifications fall into two types, and we propose a membership degree update algorithm aimed at these two cases. Combined with the fuzzy semantic relationship classification algorithm, we propose a gradual classifier construction algorithm that repeatedly checks and corrects wrong results using the update formula. The experimental results show that this algorithm makes effective use of the training texts, obtains the best representation of the training texts, and improves the precision of text classification.
4. We improve the proposed training algorithm through iteration, effectively controlling the number and speed of iterations while guaranteeing classification precision.
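
The contrast between the two similarity measures can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything here is an assumption: `closeness` uses the textbook min-max closeness between fuzzy sets, `semantic_closeness` is a hypothetical variant that also credits semantically related terms through an illustrative similarity table `sim`, and `classify` with its `threshold` parameter is just one possible way to let a text receive several classes.

```python
from typing import Callable, Dict, FrozenSet, List

FuzzySet = Dict[str, float]              # term -> membership degree in [0, 1]
SimTable = Dict[FrozenSet[str], float]   # unordered term pair -> similarity (assumed)


def closeness(a: FuzzySet, b: FuzzySet) -> float:
    """Min-max closeness: compares memberships of the SAME term only."""
    terms = set(a) | set(b)
    num = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    den = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return num / den if den > 0 else 0.0


def related(t: str, u: str, sim: SimTable) -> float:
    """Similarity of two terms: 1 for identical terms, else a table lookup."""
    return 1.0 if t == u else sim.get(frozenset((t, u)), 0.0)


def semantic_closeness(a: FuzzySet, b: FuzzySet, sim: SimTable) -> float:
    """Hypothetical semantic variant: each term of `a` is matched with its
    best semantically related term of `b`, weighted by both memberships."""
    total = sum(a.values())
    if total == 0 or not b:
        return 0.0
    matched = sum(
        max(related(t, u, sim) * min(mt, mu) for u, mu in b.items())
        for t, mt in a.items()
    )
    return matched / total


def classify(text: FuzzySet, classes: Dict[str, FuzzySet],
             score: Callable[[FuzzySet, FuzzySet], float],
             threshold: float = 0.5) -> List[str]:
    """Return every class whose score reaches the threshold (multi-label)."""
    return [name for name, c in classes.items() if score(text, c) >= threshold]
```

In this sketch, a text containing only "car" scores zero against a class described by "automobile" under `closeness`, but scores above zero under `semantic_closeness` once the table relates the two terms. That is the kind of difference the abstract attributes to the fuzzy semantic relationship method.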
Keywords/Search Tags: Text Categorization, Fuzzy Sets, Fuzzy Classifying, Degree of Approaching, Membership Degree, Fuzzy Semantic Relationship