Machine reading comprehension has long been an important research direction in natural language processing. Its purpose is to enable machines to analyze and understand text as humans do, so that people can obtain information more quickly. However, many existing machine reading comprehension models fail to answer questions correctly when dealing with long, complex text, and so no longer meet current needs. They have two main problems: (1) The recurrent neural networks commonly used by these models suffer from vanishing and exploding gradients when processing long text, so the weights of some words become too high or too low, and the model either settles on an answer too early or ignores key information that appeared early in the text. (2) Most models select answers by computing the correlation between the question and the words in the article in a single pass, and cannot first exclude non-key information based on the article's internal semantic relationships, which makes answer selection inefficient. To improve the efficiency of machine reading comprehension models, this paper proposes improvements that address these two problems. In addition, this paper combines the machine reading comprehension model with a retrieval algorithm to implement an automatic question answering system that can answer a variety of freely phrased open-domain questions.

The main work of this paper is as follows:

(1) An improved recurrent neural network machine reading comprehension model based on the Siren function is studied. The model replaces the traditional word segmentation method with a more efficient word segmentation tool, which reduces the amount of text sequence information to some extent and improves the model's efficiency when processing long sequences. The Siren activation function is introduced
into the model; it drives the output of the neural network into the [-1,1] interval more quickly, which balances the weights of the words, and it alleviates vanishing and exploding gradients with a simpler structure. The improved model gains 6.58% on the EM metric and 5.97% on the F1 metric. The method also enhances the information extraction ability of the recurrent neural network variants, namely the long short-term memory network and the gated recurrent unit network, and shortens training time by 17%.

(2) A double-scored machine reading comprehension model based on multiple attention is studied. The model adopts a double-scored answer selection method, which uses the semantic relationships within the article to assist answer selection. First, a self-attention mechanism gives the words and sentences in the article an initial score, so that non-key information is excluded or down-weighted. Then, the article's self-attention weights are combined with the question representation matrix, and a bidirectional attention mechanism scores the words in the article a second time. Finally, the results are passed to a fully connected layer and a classifier to obtain the predicted answer. This method reduces the model's focus on non-key information and improves its handling of useful information, and the model answers questions more accurately than its best-performing counterparts.

(3) An automatic question answering system based on the machine reading comprehension model is implemented. The retrieval algorithm and the machine reading comprehension model serve as the document retriever and the document reader of the system, respectively. Experimental results show that the question answering system answers freely phrased open-domain questions well, and has a better
answering performance than the traditional question answering system.
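
The double-scored answer selection described in item (2) can be illustrated with a small sketch. It is only an assumption-laden toy, not the model studied in this paper: tokens are plain lists of floats, the first pass is unparameterized dot-product self-attention, and the second pass uses a maximum dot product against the question tokens as a simple stand-in for the bidirectional attention mechanism. The function names (`self_attention_scores`, `double_score`) are hypothetical, and the fully connected layer and classifier are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def self_attention_scores(passage):
    """First score: average attention each passage token receives
    from all passage tokens (the returned scores sum to 1)."""
    n = len(passage)
    received = [0.0] * n
    for i in range(n):
        weights = softmax([dot(passage[i], passage[j]) for j in range(n)])
        for j in range(n):
            received[j] += weights[j] / n
    return received


def double_score(passage, question):
    """Second score: reweight the first-pass scores by each token's
    relevance to the question, then renormalize.

    Assumption: relevance is the max dot product with any question
    token, a simplified stand-in for bidirectional attention.
    """
    first = self_attention_scores(passage)
    relevance = [max(dot(p, q) for q in question) for p in passage]
    combined = [f * math.exp(r) for f, r in zip(first, relevance)]
    total = sum(combined)
    return [c / total for c in combined]


# Toy example: tokens 0 and 2 are similar to the question, token 1 is not.
passage = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
question = [[1.0, 0.0]]
scores = double_score(passage, question)
```

In this toy input the second passage token is unrelated to both the rest of the passage and the question, so it ends up with the lowest combined score, which mirrors the idea of down-weighting non-key information before the final answer is chosen.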