In recent years, more and more scholars have combined artificial intelligence (AI) with medicine. Traditional Chinese medicine (TCM) is a medical tradition that has been handed down for thousands of years. This thesis adds machine reading comprehension to an existing machine-learning question answering system and builds a database of TCM questions and answers. Because written Chinese is delimited only by punctuation, with no spaces between words, the text must first be segmented into words. The open-source pkuseg tool is used for Chinese word segmentation; deep contextualized word representations and a bidirectional gated recurrent unit (BiGRU) are then added, and a self-attention mechanism produces separate feature vectors for the article and the question. Finally, a similarity calculation yields the predicted start and end positions of the answer in the article. The dataset contains 22,456 question-answer pairs, each consisting of an ID, a question, an answer, and the associated content, and is split 3:1 into training and test sets. During data preprocessing, the articles from the TCM literature are shortened and the answer range is narrowed to paragraphs. Each segmented word is represented by a GloVe word embedding, a CharCNN character-level embedding, and an ELMo contextualized embedding. These three representations strengthen contextual relevance and reduce the impact of the segmentation errors that are inevitable given the complexity of the text. During training, the concept of fragments from deep learning is introduced to alleviate vanishing or exploding gradients. The machine reading comprehension model has five layers: an input layer, an embedding encoding layer, an article-question attention layer, a model encoding layer, and an output layer. All answers predicted by the model are stored in a text file, which makes them easy to view and call. The QANet model, the R-Net model, and the proposed model are evaluated with ROUGE-L and BLEU-4, and the results show that the improvement achieved by the QANet model is higher than that of the model with ELMo. The implementation is written in Python 3.6.8 in Visual Studio Code, builds the model with the MXNet framework, and calls the GluonNLP deep learning toolkit. The goal is for the machine to learn the TCM literature and eventually act as an intelligent TCM doctor that correctly understands the TCM questions users raise and gives accurate answers. To this end, an intelligent TCM automatic question answering system is implemented in C# with Visual Studio 2010 on top of the trained machine reading comprehension model.
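The ROUGE-L score used in the evaluation above can be illustrated with a minimal sketch. It computes the longest common subsequence (LCS) between a predicted answer and a reference answer, both given as lists of already-segmented words, and combines LCS-based precision and recall into an F-measure. The function names, the sample tokens, and the default weight `beta=1.2` are assumptions made for illustration; the thesis does not state which ROUGE-L variant or weight it uses.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: Sequence[str], reference: Sequence[str],
            beta: float = 1.2) -> float:
    """ROUGE-L F-measure; `beta` weights recall (assumed value, not from the thesis)."""
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


if __name__ == "__main__":
    # Hypothetical, already-segmented predicted and reference answers.
    predicted = ["人参", "可以", "补", "气"]
    reference = ["人参", "补", "气"]
    print(f"ROUGE-L: {rouge_l(predicted, reference):.4f}")
```

Because the score is computed over tokens, the same segmentation tool should be applied to both the predicted and the reference answers before scoring, otherwise segmentation differences would lower the LCS.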