
Application And Research Of Improved Chinese Word Segmentation Algorithm In Automatic Answering System

Posted on: 2009-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: X S Meng
GTID: 2178360278472107
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and network technology, computers have come into wide use, and online teaching platforms are a typical network application. As an indispensable subsystem of such a platform, an automatic answering system can resolve students' difficult questions and remove learning obstacles in a timely manner. Building an automatic answering system depends on the combined application of several technologies. Among them, Chinese word segmentation is a key link, because it directly affects how intelligent the system can be.

This thesis focuses on Chinese word segmentation because it is the foundation of an automatic answering system. It first reviews the background and current state of automatic answering systems, then gives a brief overview of Chinese word segmentation technology, and finally analyzes the characteristics of student questions and studies a segmentation algorithm adapted to those characteristics. The properties of the questions are studied on the basis of the segmentation results. When a student asks a question, the system automatically computes its similarity to the questions in the knowledge base and returns the questions with higher similarity, together with their answers, which is how the answering system achieves its intelligence.

By analyzing and comparing the classic algorithms, this thesis presents an improved word segmentation algorithm. Its basic idea is as follows. First, the sentence is cut into clauses according to a punctuation table, and each clause is segmented with the forward maximum matching method (FMM). The string matching information is saved during the matching process. Next, overlapped ambiguity fields are identified from the saved string matching information using an improved word-by-word scanning method. Finally, the ambiguities are resolved. The improved algorithm combines the longest-word-first principle with the improved word-by-word scanning method and uses a dynamic TRIE dictionary mechanism; a statistical method is used to eliminate ambiguity. It thus inherits the speed and efficiency of FMM while using statistics to eliminate ambiguity.

This thesis also examines the application of the word segmentation algorithm in an automatic answering system and presents the general design and module design of such a system, which can serve as a reference for its detailed design. Finally, the thesis analyzes and summarizes the system and puts forward suggestions for further improvement.
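
The first stages of the pipeline described in the abstract (clause splitting, FMM over a trie dictionary, and detection of overlapped ambiguity fields) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the toy dictionary, the punctuation table, and the names Trie, split_clauses, fmm_spans and overlap_fields are all assumptions made here, the trie is a plain nested dict rather than the dynamic TRIE mechanism the abstract mentions, and the statistical disambiguation step is not shown.

```python
import re

END = ""  # end-of-word marker; cannot collide with a single-character key

class Trie:
    """Toy dictionary trie built from nested dicts (hypothetical stand-in)."""
    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True

    def match_lengths(self, text, start):
        """Lengths of all dictionary words that begin at text[start]."""
        node, lengths = self.root, []
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if END in node:
                lengths.append(i - start + 1)
        return lengths

# Assumed punctuation table; the thesis's actual table is not given.
PUNCTUATION = "，。！？；：、,.!?;:"

def split_clauses(sentence):
    """Cut a sentence into clauses at any character in the punctuation table."""
    parts = re.split("[" + re.escape(PUNCTUATION) + "]", sentence)
    return [p for p in parts if p]

def fmm_spans(clause, trie):
    """Forward maximum matching: take the longest dictionary word at each
    position, falling back to a single character when nothing matches."""
    spans, i = [], 0
    while i < len(clause):
        n = max(trie.match_lengths(clause, i), default=1)
        spans.append((i, i + n))
        i += n
    return spans

def overlap_fields(clause, spans, trie):
    """Report spans where a dictionary word starts inside an FMM token
    and runs past its end (an overlapped ambiguity field)."""
    fields = []
    for start, end in spans:
        for p in range(start + 1, end):
            longest = max(trie.match_lengths(clause, p), default=0)
            if p + longest > end:
                fields.append(clause[start:p + longest])
                break
    return fields

trie = Trie(["研究", "研究生", "生命", "起源"])
clause = split_clauses("研究生命起源，很有趣。")[0]
spans = fmm_spans(clause, trie)
print([clause[s:e] for s, e in spans])      # ['研究生', '命', '起源']
print(overlap_fields(clause, spans, trie))  # ['研究生命']
```

In this example FMM greedily picks 研究生, while 生命 begins inside that token and extends beyond it, so 研究生命 is flagged as an overlapped ambiguity field. Choosing between 研究生/命 and 研究/生命 is the step the abstract assigns to the statistical method.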
Keywords/Search Tags: Chinese Word Segmentation, Automatic Answering System, Forward Maximum Matching Method, Overlapped Ambiguity