
Question Analysis Of Chinese Automatic Question Answering In Health Field

Posted on: 2020-07-08  Degree: Master  Type: Thesis
Country: China  Candidate: D Zhao
GTID: 2404330590982601  Subject: Health information management
Abstract/Summary:
[Purpose] This study addresses question analysis for Chinese consumer health question answering. Taking lung cancer as an example domain, we construct a high-quality question analysis model that automatically analyzes lung cancer consumer health questions (question intention recognition, key semantic component recognition, and inter-entity relationship extraction) and thereby supports the development of question answering systems. The objectives are (i) to build a key information labeling system for lung cancer consumer health questions, (ii) to produce a labeled corpus through manual annotation, and (iii) to implement automatic question analysis with deep learning methods.

[Methods] Based on 10,000 real-world lung cancer consumer health questions crawled from an online medical question answering platform, we construct a question analysis framework based on the BiLSTM model. We use comparative and statistical analysis to construct a key information labeling system for Chinese lung cancer consumer health questions and manually label the corpus. We then use the BiLSTM-CRF model to recognize the question intent and key semantic components. After named entity recognition, we use the Attention-Based BiLSTM model to extract relationships between entities. Finally, the analysis results are stored in JSON, a general-purpose data exchange format.

[Results] Using the key information labeling system constructed in this study (20 question entities and 22 question types), and through three rounds of labeling and expert proofreading, we manually labeled 10,000 real-world lung cancer consumer health questions, which contain 38,505 question entities and 10,361 question types. We then performed question understanding on the labeled questions. The average F1 values for question entity identification and question type identification reached 81.47 and 80.31, respectively, and the average F1 value for relationship extraction between entities reached 85.27. The analysis results are visualized and stored in the JSON data exchange format.

[Conclusions] Through labeling system construction, manual dataset annotation, and the design and implementation of a question understanding model, this study extracts semantic information such as question entities, question types, and relationships between entities, and achieves efficient analysis of Chinese lung cancer consumer health questions. It lays the foundation for an automatic question answering system for lung cancer consumer health questions. The implementation process also generalizes to some extent, providing a reference for researchers adapting it to other health domains.
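
The final step described above, storing the question type, recognized entities, and entity relationships as JSON, can be illustrated with a minimal Python sketch. It is not the thesis implementation. The character-level BIO tagging scheme, the labels `Disease`, `Stage`, `Prognosis`, and `has_stage`, the field names, and the helper functions `bio_to_entities` and `build_record` are all hypothetical and chosen only for illustration; the tag sequence is written by hand to stand in for the output of a sequence labeler such as BiLSTM-CRF.

```python
import json


def bio_to_entities(chars, tags):
    """Collect entity spans from character-level BIO tags (assumed scheme)."""
    entities, start, label = [], None, None
    # A trailing "O" sentinel closes any span that runs to the end of the question.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            entities.append({
                "text": "".join(chars[start:i]),
                "label": label,
                "start": start,  # inclusive character offset
                "end": i,        # exclusive character offset
            })
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return entities


def build_record(question, tags, question_type, relations):
    """Assemble one analysis result; the field names are illustrative only."""
    return {
        "question": question,
        "question_type": question_type,
        "entities": bio_to_entities(list(question), tags),
        "relations": relations,
    }


# Hand-written stand-in for model output on "肺癌晚期能活多久"
# (roughly: "How long can one live with late-stage lung cancer?").
question = "肺癌晚期能活多久"
tags = ["B-Disease", "I-Disease", "B-Stage", "I-Stage", "O", "O", "O", "O"]
relations = [{"head": 0, "tail": 1, "label": "has_stage"}]  # indices into "entities"

record = build_record(question, tags, "Prognosis", relations)

# ensure_ascii=False keeps the Chinese characters readable in the stored JSON.
serialized = json.dumps(record, ensure_ascii=False, indent=2)
print(serialized)
```

Keeping character offsets alongside the entity text lets a downstream answering component map each entity back to its position in the original question, and referring to entities by index in `relations` avoids ambiguity when the same string occurs twice.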
Keywords/Search Tags:Lung cancer consumer health question, Automatic question answering system, Question analysis, Deep learning