Research Of Text Mining Based On Rough Set Theory

Posted on:2004-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:D Li

Full Text:PDF

GTID:2168360095953807

Subject:Computer application technology

Abstract/Summary:

Rough Set theory, proposed by Z. Pawlak in the 1980s, is a soft computing tool for handling imprecise and uncertain information. With the rapid development of the Internet and electronic books, text mining has become an important research field. This thesis studies text mining based on Rough Set theory in depth.

In text classification, we propose an automatic text classification method based on clustering and Rough Set theory. Clustering is well suited to grouping existing documents, while Rough Set theory, by reducing the data, yields a small number of useful rules that improve the efficiency of classifying new documents. The two are combined to classify documents by unsupervised learning, and we discuss how rules that apply to new, unclassified documents can be formed once the training documents have been classified.

In text retrieval, we introduce an optimized retrieval method based on Rough Set theory and Fuzzy Set theory. Specifically, the two theories are combined to optimize users' queries for synonyms and near-synonyms, and the query results are returned in descending order of similarity between documents and queries. Users get the most relevant results first, provided they define their queries according to their interests and specify in detail an interest weight for every keyword in the query. If they have more time, they can go on to the less relevant documents.

Experiments are carried out to verify the validity of both methods for text classification and text retrieval.
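
As an illustration of the Rough Set machinery the abstract relies on, the following minimal Python sketch computes the lower and upper approximations of a set of documents. It is not the thesis's implementation: the function name `approximations`, the binary keyword attributes, and the toy document table are all assumptions made here for illustration. Documents with identical values on the chosen keywords are treated as indiscernible; the lower approximation holds documents that certainly belong to the target class, and the upper approximation holds those that possibly belong.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target` under `attributes`.

    table      -- dict mapping a document id to a dict of attribute values
    attributes -- attribute names used to decide indiscernibility
    target     -- set of document ids forming the concept to approximate
    """
    # Documents with identical values on the chosen attributes are
    # indiscernible and fall into the same equivalence class.
    classes = defaultdict(set)
    for doc_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(doc_id)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy data: 1 means the keyword occurs in the document.
docs = {
    "d1": {"rough": 1, "fuzzy": 0},
    "d2": {"rough": 1, "fuzzy": 0},
    "d3": {"rough": 0, "fuzzy": 1},
    "d4": {"rough": 0, "fuzzy": 1},
    "d5": {"rough": 1, "fuzzy": 1},
}
classification_docs = {"d1", "d2", "d3"}  # assumed class labels

lower, upper = approximations(docs, ["rough", "fuzzy"], classification_docs)
print(sorted(lower))  # ['d1', 'd2']
print(sorted(upper))  # ['d1', 'd2', 'd3', 'd4']
```

In this toy table, d3 and d4 cannot be told apart by the two keywords, so d3 appears only in the upper approximation. The gap between the two approximations is the uncertain region that rule extraction has to account for.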

Keywords/Search Tags:

text mining, text classification, text retrieval, Rough Set Theory, Fuzzy Set Theory, clustering, textual feature extraction, user's interest, query optimization

Related items

1	Based On Rough Set Text Automatic Classification Study
2	Research On Key Problems In Text Mining
3	Automatic Text Categorization Based On Rough Set Theory
4	Application Of Rough Set Theory In Chinese Text Categorization
5	The Application Of Rough-Set-Model Based Text Clustering Algorithm In The Text Filtering
6	Research On Several Problems In Text Retrieval
7	Mining Users' Interests Based On Search Logs
8	Text Classification Model Based On Fuzzy-Rough Sets Theory
9	The Research Of Text-Classification Based On Rough Set Theory
10	Study Of Web Text Mining Based On Rough Set Theory