
Study On The System Of Chinese Automatic Word Segmentation Based On Text Information Of BBS

Posted on: 2007-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: S F He
Full Text: PDF
GTID: 2178360185990462
Subject: Computer application technology
Abstract/Summary:
With the development of Internet technology, network application services of all kinds have multiplied. Bulletin Board Systems (BBS) give network users a space for free communication, but unhealthy and reactionary words posted there have a harmful effect on the country and on society. Network administrators are increasingly concerned with how to remove such words accurately from the users' view. As the volume of BBS information grows sharply, traditional BBS management methods are becoming both outdated and inefficient, and they struggle to keep pace with the times. Data mining techniques are used precisely to make up for the shortcomings of traditional analytical methods and to handle massive data. How to use data mining techniques to manage BBS efficiently and quickly has therefore drawn growing attention from network administrators.

At present, techniques for identifying and filtering BBS documents are not yet mature. Because of the particular characteristics of BBS, the identification techniques used for ordinary web documents and e-mail are not effective for BBS documents. Using data mining techniques to delete unhealthy and reactionary words from BBS text documents is therefore very important for network management. When dealing with a large number of documents, we need to analyze them and extract useful information, compare one document with another using the relevant tools, rank documents by importance and relevance, and find patterns or trends in them. Text information mining is thus becoming an increasingly popular and important research problem in data mining.

Text information mining, an offshoot of data mining, is the process of extracting interesting patterns from very large text collections for the purpose of discovering knowledge. It is a new technology that applies data mining methods to retrieve information from text. It is a new issue that draws great...
Keywords/Search Tags:BBS, Text information mining, word segmentation dictionary, automatic word segmentation