
Research on a Generative Auxiliary Diagnosis Dialogue System for Gastroenterology

Posted on: 2020-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Cheng
Full Text: PDF
GTID: 2404330572488162
Subject: Computer application technology
Abstract/Summary:
With the rapid development of society, the pressure on people is also growing. Irregular diet has become a common problem for modern people, and food safety problems persist, which has led to an increase in the number of people suffering from digestive diseases. Many diseases of the digestive system have a long onset period. Their early impact is usually very subtle, amounting only to slight discomfort, which is not enough to prompt a hospital visit. Therefore, when abnormalities in the digestive system first appear, most people first seek help online. When a traditional search engine handles a query about a disease, it usually relies on keyword matching, which has many limitations, such as losing key information about the disease. The whole process is also time-consuming and may not return a useful result. In this context, a dialogue system can be regarded as an advanced information retrieval system that returns relevant, useful information according to the user's input. This thesis explores a generative dialogue system suitable for digestive medicine. The work consists of three parts: text segmentation, text classification, and the dialogue model.

1. We study common word segmentation methods and analyze their advantages and disadvantages, examine the Jieba word segmentation tool for Chinese, and study the problems Jieba has when processing a digestive medicine corpus. Based on Jieba, we construct a professional dictionary for the field of digestive medicine, segment text with the bidirectional maximum matching method, and add an ambiguity elimination strategy. The experimental results show that this segmentation strategy effectively corrects the mis-segmentation of disease names, symptom names, and drug names, and significantly reduces segmentation ambiguity when segmenting a digestive medicine corpus.

2. To address the lack of question-and-answer data in the field of gastroenterology, the Beautiful Soup crawler is used to obtain the initial corpus. After data cleaning and word segmentation, commonly used keyword extraction algorithms are studied. On this basis, we propose a word vector construction method that associates keywords with categories and use it to build a sentence vector for each question. The sentence vector serves as the input to a support vector machine, and an active learning strategy is used to obtain the classification model that classifies the text. The experimental results show that the sentence vector used in this thesis classifies better than the word2vec (Word to Vector) vector, and that the classification model can produce five kinds of balanced data in digestive medicine.

3. We study the traditional sequence-to-sequence model (Seq2seq), analyze its flaws in generating answers to digestive medicine questions, and study Google's parse tree generation model. Based on these two models, we propose a dialogue model structure that combines multi-layer encoding, attention-based decoding, the Gated Recurrent Unit (GRU), and Beam Search. We also propose a new training method that combines key-value pair vectors with word2vec vectors. The experimental results show that the proposed model structure addresses three problems of traditional generative models when responding to questions about digestive diseases: generated answers that are unrelated to the question, generated sentences with incomplete structure, and inconsistent responses to a given input sentence.
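The bidirectional maximum matching step named in part 1 can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the toy vocabulary, the function names, and the tie-breaking rule (fewer tokens first, then fewer single-character tokens, otherwise the backward result) are assumptions based on the commonly described heuristic, and the thesis's own domain dictionary and ambiguity elimination strategy are not reproduced here.

```python
# Illustrative sketch only: the vocabulary and tie-breaking rule are assumptions,
# not the dictionary or disambiguation strategy used in the thesis.

def forward_max_match(text, vocab, max_len):
    """Scan left to right, always taking the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len):
    """Scan right to left, always taking the longest dictionary word."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def count_single_chars(tokens):
    return sum(1 for t in tokens if len(t) == 1)


def bidirectional_max_match(text, vocab):
    """Run both scans and keep the result the heuristic prefers."""
    max_len = max((len(w) for w in vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    if len(fwd) != len(bwd):
        # Prefer the segmentation with fewer tokens.
        return fwd if len(fwd) < len(bwd) else bwd
    # Same token count: prefer fewer single-character tokens, else backward.
    return fwd if count_single_chars(fwd) < count_single_chars(bwd) else bwd


# Hypothetical toy vocabulary, used only to show the disambiguation.
TOY_VOCAB = {"研究", "研究生", "生命", "起源", "慢性胃炎", "胃炎", "患者"}
```

With this toy vocabulary, the forward scan splits "研究生命的起源" into "研究生 / 命 / 的 / 起源", while the backward scan yields "研究 / 生命 / 的 / 起源". Both have four tokens, so the rule picks the backward result because it contains fewer single-character tokens. A domain dictionary matters for the same reason: a term such as "慢性胃炎" is kept whole only if it appears in the vocabulary.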
Keywords/Search Tags:Gastroenterology, Word segmentation, Word vector, Text classification, Dialogue model