
Retrieval-based Chinese Text Mining Technology Study And Design

Posted on: 2005-12-19 | Degree: Master | Type: Thesis
Country: China | Candidate: X H Wang
GTID: 2168360152465515 | Subject: Software engineering
Abstract/Summary:
With the popularization of the Internet, users can easily and quickly access all kinds of resources. On the other hand, it is difficult for them to extract genuinely useful information from the sea of information: more information is needed, yet people have become less patient. The smooth flow of information is the soul of the Internet and the key to activating it and boosting its development. Text mining technology not only plays the most important role in realizing that flow, but is also a basic and central technology and application of the Internet. The multi-dimensional and diverse development of Internet business includes the layered application of text mining technology within Internet companies. It is safe to say that once mature text mining technology is deployed on the Internet, it will become very popular.

After surveying the literature, the author concludes that current mining technologies offer two approaches to semi-structured and unstructured text data. The first approach applies existing mining tools to text resources that have been preprocessed. The second builds new, retrieval-based mining tools that extract useful information or knowledge directly from text resources. Comparing the two, the first is clearly easier to carry out. The second, however, aims to build mining tools designed specifically for semi-structured and unstructured text data, so it is better suited to discovering knowledge in such data and is more valuable in the long run.

Accordingly, after studying text mining and information retrieval technologies, the thesis presents a general retrieval-based text mining process model and selects advanced, compatible mining algorithms for each step. For the key points and difficulties of the process, the author proposes automatic algorithms to resolve them. A Chinese text mining system is then designed. The system provides interfaces through which users can add new models and algorithms, which makes products built on the system more competitive in the market. The thesis also introduces the concepts of the text cube and of mining knowledge from knowledge. Finally, tests of the Chinese resume mining system gave satisfactory results.
Keywords/Search Tags: Text, Mining, General processing flow, System