Research Of Intelligent Answering System Based On Chinese Word Segmentation

Posted on: 2013-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: W T Kong
GTID: 2248330371490229
Subject: Computer application technology
Abstract/Summary:
With the wide application of Internet technology, people's demand for information keeps growing. Students' demand for information in teaching activities is growing as well, and traditional teaching cannot meet it because its resource library is limited; online teaching has emerged in response. Online teaching uses Internet technology to create an autonomous learning environment for students, and it overcomes the limits of the traditional teaching resource library because it can make full use of Internet resources.
Question answering (Q&A) is one of the most direct ways for students to acquire knowledge and an important part of the teaching process. In online teaching, the performance of online question answering directly affects teaching quality. Existing question answering systems still have many deficiencies, however, and in particular their level of intelligence is relatively poor, so a highly intelligent question answering system has become an urgent need of online teaching. Chinese word segmentation is the core technology of Chinese information processing and a key factor in how intelligent an answering system can be.
Given the key role of Chinese word segmentation in an intelligent question answering system, this thesis does the following work. First, it analyzes the strengths and weaknesses of existing answering systems and gives a brief overview of Chinese word segmentation. Second, it designs a professional dictionary structure for the intelligent question answering system: the two-word dictionary hash storage structure.
It also designs an improved maximum matching algorithm for Chinese word segmentation. The improved algorithm dynamically obtains the maximum word length from the first-word and second-word index and uses that length as the basis for segmentation. This removes the inflexibility of the traditional maximum matching algorithm, which segments with a fixed word length, and improves segmentation efficiency. The thesis further uses statistical algorithms to handle ambiguity in the segmentation results, forming an overall algorithm structure based on both dictionary and statistics. The resulting segmentation algorithm is then applied to Lucene, a widely used full-text search toolkit, to compensate for the weakness of Lucene's own word segmentation. Last, the thesis implements a web-based intelligent question answering system using three popular open-source Java frameworks, SSH (Struts2 + Spring + Hibernate), together with Lucene. The system mainly consists of three functional modules: an intelligent question answering module, a BBS Q&A module, and an email Q&A module. Question management and user management modules are also implemented to keep the system functionally complete.
The thesis tests the system's retrieval on a local question library containing a large number of course resources. The test results show that the system performs well and is able to meet users' needs.
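The improved maximum matching idea in the abstract can be illustrated with a small sketch. It is written in Python for brevity and is not the thesis's implementation (the thesis system is built in Java). The class `TwoCharDictionary`, the function `segment`, and the sample word list are hypothetical, and the sketch assumes that the "first and second word index" means an index keyed on the first two characters of each dictionary entry. The statistical handling of ambiguity is left out because the abstract does not describe it.

```python
from collections import defaultdict


class TwoCharDictionary:
    """Hypothetical two-level hash index: first char -> second char -> entry.

    Each entry keeps the words sharing those two leading characters and the
    length of the longest such word (assumed reading of the abstract).
    """

    def __init__(self, words):
        self.index = defaultdict(dict)
        for word in words:
            if len(word) < 2:
                continue  # single characters are the fallback and are not stored
            first, second = word[0], word[1]
            entry = self.index[first].setdefault(
                second, {"words": set(), "max_len": 0}
            )
            entry["words"].add(word)
            entry["max_len"] = max(entry["max_len"], len(word))

    def lookup(self, first, second):
        """Return the entry for a two-character prefix, or None if absent."""
        return self.index.get(first, {}).get(second)


def segment(text, dictionary):
    """Forward maximum matching with a per-prefix (dynamic) window length."""
    tokens = []
    i = 0
    while i < len(text):
        entry = None
        if i + 1 < len(text):
            entry = dictionary.lookup(text[i], text[i + 1])
        matched = None
        if entry is not None:
            # The window is bounded by the longest word that starts with
            # these two characters, not by a fixed global maximum.
            longest = min(entry["max_len"], len(text) - i)
            for length in range(longest, 1, -1):
                candidate = text[i:i + length]
                if candidate in entry["words"]:
                    matched = candidate
                    break
        if matched is None:
            matched = text[i]  # no dictionary word here: emit one character
        tokens.append(matched)
        i += len(matched)
    return tokens


if __name__ == "__main__":
    sample = TwoCharDictionary(
        ["中文", "中文分词", "分词", "最大", "最大匹配", "匹配", "算法"]
    )
    print(segment("中文分词最大匹配算法", sample))
    # ['中文分词', '最大匹配', '算法']
```

Because the window length comes from the prefix entry, a position whose first two characters start only short words is never tested against long candidates. A fixed-length maximum matching algorithm would test them regardless.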
Keywords/Search Tags: intelligent question answering, maximum matching, Lucene, Chinese word segmentation