
Research And Implementation On Intelligent Question Answering System

Posted on: 2017-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Ye
GTID: 2308330485475156
Subject: Computer technology
Abstract/Summary:
An intelligent question answering system turns the massive information on the Internet into knowledge, understands users' questions asked in natural language, and provides answers quickly and accurately. It replaces the list of hyperlinks returned by a traditional search engine and sharply reduces the time users spend seeking information, which is the development trend of next-generation intelligent search services.

In this thesis, an Intelligent Question Answering System (IQA) is designed and implemented. The system understands Chinese natural language questions and rapidly searches for information in a local RDF knowledge base and in massive Web data. It consists of two main modules: question understanding and answer retrieval. The question understanding module consists of lexical analysis, subject recognition, predicate recognition, predicate disambiguation, and question element conversion. The answer retrieval module consists of two sub-modules: answer retrieval based on the RDF knowledge base and answer retrieval based on the Web.

Chinese word segmentation and POS tagging are the foundation of semantic analysis for Chinese text and a key step in the intelligent question answering system. Because existing Chinese word segmentation systems ignore changes in word meaning across contexts, this thesis proposes a POS tagging method that combines the maximum entropy (Maxent) algorithm with a segmentation dictionary, which improves POS tagging performance for multi-category words. Natural language questions are divided into six categories: character, movie, music, book, game, and application. The category and subject of a question are obtained with a CRF (Conditional Random Field) model, using user-defined feature templates and tagging sets. The predicates of questions are then extracted through syntactic analysis and a predicate dictionary.
The predicate is disambiguated through word similarity computation, which makes it consistent with the attribute names of the RDF knowledge base. A question element is defined as the structured representation of a natural language question, in the form "[subject, predicate]". The accuracy of question element conversion determines whether a question is understood and analyzed correctly.

Answers are retrieved mainly in two ways: from the knowledge base and from the Web. The knowledge triples stored in the RDF knowledge base are classified into six categories: people, movie, music, book, game, and application. In the question understanding stage, the question is classified, its subject and predicate are recognized, and it is converted into a question element. The question element is then transformed into a SPARQL query, and the answer is extracted from the knowledge base. If the answer cannot be found in the RDF knowledge base, it is extracted through a Web search engine: the question is submitted as the query, and the answer is extracted from the returned results.
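
The flow from a "[subject, predicate]" question element to a SPARQL query, with the Web as a fallback, can be illustrated with a minimal sketch. It is not the thesis implementation. The namespace, the attribute names, the function names, and the English toy data are all hypothetical. Plain string similarity from Python's difflib stands in for the word similarity computation, whose exact method the abstract does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical namespace and attribute names; the abstract does not give the RDF schema.
KB_PREFIX = "http://example.org/kb/"
KB_ATTRIBUTES = ["director", "release_date", "author", "birthplace"]


def disambiguate(predicate, attributes=KB_ATTRIBUTES, threshold=0.6):
    """Map an extracted predicate to the most similar attribute name, or None.

    Character-level similarity is an assumed stand-in for word similarity.
    """
    def score(attribute):
        return SequenceMatcher(None, predicate, attribute).ratio()

    best = max(attributes, key=score)
    return best if score(best) >= threshold else None


def to_sparql(subject, predicate):
    """Render a [subject, predicate] question element as a SPARQL SELECT query.

    Assumes subject and predicate are already valid local names (no spaces).
    """
    return (
        f"PREFIX kb: <{KB_PREFIX}>\n"
        f"SELECT ?answer WHERE {{ kb:{subject} kb:{predicate} ?answer . }}"
    )


def answer(question, subject, predicate, kb_lookup, web_lookup):
    """Try the knowledge base first, then fall back to Web retrieval.

    kb_lookup takes a SPARQL string and web_lookup takes the raw question;
    both are hypothetical callables supplied by the caller.
    """
    attribute = disambiguate(predicate)
    if attribute is not None:
        result = kb_lookup(to_sparql(subject, attribute))
        if result:
            return result
    return web_lookup(question)


# Toy usage: a dict keyed by query text plays the role of the RDF store.
toy_kb = {to_sparql("Hero", "director"): "Zhang Yimou"}
print(answer("Who directed Hero?", "Hero", "directed",
             toy_kb.get, lambda q: "web result for: " + q))
```

Passing the two lookups in as callables keeps the sketch independent of any particular RDF store or search engine; a real system would replace them with a SPARQL endpoint client and a search-result answer extractor.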
Keywords/Search Tags: Big data, Next generation search engine, Natural language understanding, Intelligent question answering, Knowledge acquisition