
Research on Answer Extraction Technologies in Question Answering Systems for Unstructured Texts

Posted on: 2020-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: J M Ma
Full Text: PDF
GTID: 2428330590973270
Subject: Software engineering
Abstract/Summary:
Search engines are an important channel through which users obtain knowledge and answers. Given a query entered in the search box, a search engine returns a ranked list of web pages for the user to browse. The user must then skim several pages and pick out the target answer by their own judgment, a process that is time-consuming and laborious. Question answering systems based on natural language processing are an important direction for improving traditional search engines, because they can help users obtain short, accurate answers. Unstructured texts such as Wikipedia and Baidu Encyclopedia are important data sources for building question answering systems. Compared with the structured knowledge (such as knowledge graphs) used by knowledge-based question answering systems, unstructured text data is large in scale and easy to obtain. To improve the accuracy of the returned answers, the set of candidate answer sentences is first narrowed according to the intent of the question, and the answer is then located precisely. This thesis studies both steps: candidate answer sentence selection and precise answer location.

For candidate answer sentence selection, this thesis studies two approaches: 1) Traditional machine learning methods are used to model the relationship between question sentences and answer sentences. Three kinds of features are extracted, including similarity based on sentence vector representations and basic features based on word co-occurrence and sentence length. SVM and XGBoost classifiers then score each pair of question and candidate answer sentence, and the scores are used to rank the answers. 2) Deep learning models such as CNN and LSTM are used to build semantic representations of sentences, trained with a pairwise method. In the experiments, these models outperform the traditional machine learning methods.

For precise answer location, the task is treated as a machine reading comprehension task. A baseline model for reading comprehension is proposed, and its input features and model structure are then improved: the LSTM is replaced by a bidirectional LSTM, an attention mechanism is introduced to increase the semantic interaction between question and answer sentences, and pre-trained ELMo word vectors and multi-model ensembling are added. These changes improve the EM and F1 scores. Experiments show that these methods clearly improve on the baseline model and are suitable for the machine reading comprehension task.
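The EM and F1 indicators mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the thesis computes them, so the code below assumes the common definitions used for extractive reading comprehension: EM is 1 when the predicted answer string equals the reference, and F1 is the harmonic mean of token-level precision and recall. The function names, the lowercasing step, and the whitespace tokenization are assumptions made for illustration; Chinese text would typically need character-level or word-segmented tokens instead.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the answers are identical after trimming and lowercasing."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer.

    Assumption: tokens are obtained by whitespace splitting.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, F1 is 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that contains the reference answer plus extra words scores 0 on EM but still earns partial credit on F1, which is why the two indicators are usually reported together.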
Keywords/Search Tags: question answering system, answer extraction, deep learning, answer selection, machine reading comprehension