
Chinese Word Segmentation Using Rule And Statistic

Posted on: 2008-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Chen
Full Text: PDF
GTID: 2178360242458956
Subject: Computer application technology
Abstract/Summary:
We live in a new Information Age, whose most notable feature is that computers play an increasingly important role in everyday life. Natural language is the most important tool of human communication, and it is closely bound up with how language is managed and processed. Computers, with a history of only about 40 years, now confront Chinese writing, with a history of some 6,000 years.

Written Chinese runs continuously from character to character and from word to word, with no explicit marks separating the words in a sentence. The first task in understanding Chinese is therefore to divide a continuous string of characters into a sequence of words, that is, automatic word segmentation. Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a defined criterion. As the foundation of Chinese natural language understanding, automatic word segmentation is applied mainly in information retrieval, Chinese character processing, speech processing, content recognition and analysis, natural language understanding, and related areas. At present, the research community relies mainly on automatic, computer-based methods to solve the Chinese word segmentation problem.

When we receive a document, we first skim it and then read it selectively. Many researchers make automatic word segmentation the first step in imitating this human behavior. But we should stop and consider the purpose of reading: we read with some intention, whether to learn or to be entertained. How can a computer be made to think like the human brain?

The central idea of this thesis is that the real goal is for the computer to understand Chinese, not merely to segment Chinese words. Given that goal, we can select a professional (domain-specific) dictionary to narrow the scope of the text being processed.

I propose a simple preparatory step for Chinese word segmentation: the use of professional dictionaries. Specifically, the appropriate professional dictionary is chosen first, and Chinese word segmentation is then carried out with a statistical method. With this approach, the algorithm is expected to improve in both accuracy and efficiency.
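The two-step idea above (pick a professional dictionary, then segment statistically) can be illustrated with a minimal sketch. It is not the thesis's actual algorithm, which the abstract does not specify. The sketch assumes that each dictionary maps words to frequency counts, that the dictionary is chosen by how many characters of the text its entries cover, and that the statistical step is a unigram maximum-probability segmentation. The function names `choose_dictionary` and `segment`, the smoothing constant, and the sample dictionaries are all hypothetical.

```python
import math


def choose_dictionary(text, dictionaries):
    """Return the name of the dictionary whose entries cover the most
    characters of `text` (an assumed selection criterion)."""
    def coverage(words):
        return sum(text.count(w) * len(w) for w in words)
    return max(dictionaries, key=lambda name: coverage(dictionaries[name]))


def segment(text, word_freq, max_len=4):
    """Unigram maximum-probability segmentation by dynamic programming.

    `word_freq` maps words to counts. Single characters missing from the
    dictionary get a small smoothed probability (an assumption), so every
    input can still be segmented.
    """
    total = sum(word_freq.values())
    unknown = math.log(0.5 / total)
    n = len(text)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in word_freq:
                score = math.log(word_freq[piece] / total)
            elif len(piece) == 1:
                score = unknown
            else:
                continue             # unknown multi-character strings are skipped
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    words = []
    i = n
    while i > 0:                     # follow back-pointers from the end
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


# Hypothetical toy dictionaries with made-up counts.
dictionaries = {
    "medicine": {"糖尿病": 5, "患者": 8, "胰岛素": 4},
    "computing": {"数据库": 6, "算法": 9, "分词": 7},
}
text = "糖尿病患者需要胰岛素"
domain = choose_dictionary(text, dictionaries)
print(domain, segment(text, dictionaries[domain]))
```

On this toy input the medical dictionary is selected, and characters it does not cover ("需", "要") fall back to single-character tokens. A practical system would combine the professional dictionary with a general one.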
Keywords/Search Tags: professional dictionary, algorithm for Chinese word segmentation, rule-based word segmentation, statistical word segmentation