Research On Query Reformulation For Medical Data Search

Posted on: 2019-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
Full Text: PDF
GTID: 2428330566460653
Subject: Computer Science and Technology
Abstract/Summary:
The explosive growth of data has driven the rapid development of information science and technology, and healthcare is one field where information technology has made great progress. One healthcare application is the Clinical Decision Support System, which includes using patient profiles as queries to search for relevant medical support. By these means, a Clinical Decision Support System can effectively mine deep data in medical treatment, improve the efficiency of medical services, and reduce the accident rate.

Query reformulation has become a hot topic in medical data search. Previous work on medical search has focused mainly on query expansion, which enriches queries by adding more useful terms. However, query expansion performs well only when queries are concise. Patient profiles stored in electronic health records and used as queries are written as free text and include patient characteristics, diagnoses, medical tests, treatments, and medications. Such rich and complex profiles bring in redundant information, a problem that query expansion cannot solve effectively. In this thesis, we tackle verbose queries and propose a new query reformulation method that covers not only query expansion but also query reduction. The main contributions of this thesis are:

1. Designing four query types for medical texts and establishing the corresponding semantic mapping tools. To take negated words and medical terms into account when reformulating queries, we design four query types for medical texts: Positive, Negative, Stop, and Normal. We also build semantic mapping tools to support the subsequent query reformulation algorithms. These tools rely on the characteristics of medical text and automatically map words to their query types.

2. Presenting a method for automatic query classification and designing a query reformulation algorithm based on threshold partitioning. We propose a threshold-partitioned query reformulation model that combines query expansion with query reduction for verbose queries. The algorithm treats each sentence of the query as a candidate for processing. First, we use the semantic annotation tools to annotate the sentences of the query. Then, the sentences are divided into two classes, query expansion and query reduction, depending on the occurrence of medical and negative terms. Finally, we modify the BM25 retrieval score according to these two classes. In evaluation on the existing TREC CDS datasets, the method proved effective compared with both the prevailing query expansion method and the original-query baseline.

3. Presenting a query reformulation model based on unsupervised learning. To better understand the potential impact of the queries, we present a new automatic query classification and reformulation model based on unsupervised learning. The method classifies each sentence into the expansion or reduction category using a weighted score model. Experiments show that the query reformulation algorithm based on unsupervised learning better captures the intent of the query words and achieves the best results. On the 2016 TREC CDS dataset, the final model improves NDCG by 22.88% over the original-query baseline.

Finally, we design and implement an online query reformulation prototype system. The system compares the two algorithms and visualizes the experimental results of this thesis.
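The sentence-level classification described in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration, not the thesis's actual method: the toy lexicons, the reading of Positive as an un-negated medical term and Negative as a negated one, the simple "negation cue affects the rest of the sentence" rule, the scoring formula, and the threshold value of 0.2.

```python
import re

# Hypothetical toy lexicons; the thesis's semantic mapping tools are not specified here.
NEGATION_CUES = {"no", "not", "denies", "without"}
MEDICAL_TERMS = {"fever", "cough", "pneumonia", "dyspnea", "hypertension"}
STOP_WORDS = {"the", "a", "an", "of", "and", "or", "with", "is", "was", "has", "he", "she"}


def annotate(sentence):
    """Map each word to one of the four query types (assumed semantics)."""
    labels = []
    negated = False  # assumption: a negation cue negates the rest of the sentence
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word in NEGATION_CUES:
            negated = True
            labels.append((word, "Stop"))  # assumption: cues carry no retrieval content
        elif word in MEDICAL_TERMS:
            labels.append((word, "Negative" if negated else "Positive"))
        elif word in STOP_WORDS:
            labels.append((word, "Stop"))
        else:
            labels.append((word, "Normal"))
    return labels


def classify(sentence, threshold=0.2):
    """Assign a sentence to 'expansion' or 'reduction' by thresholding a score.

    Assumed score: (Positive count - Negative count) / number of non-Stop words.
    """
    content = [label for _, label in annotate(sentence) if label != "Stop"]
    if not content:
        return "reduction"
    score = (content.count("Positive") - content.count("Negative")) / len(content)
    return "expansion" if score >= threshold else "reduction"


def split_sentences(query):
    """Split a free-text patient profile into sentences on end punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", query) if s.strip()]


if __name__ == "__main__":
    profile = "The patient has fever and cough. She denies dyspnea or hypertension."
    for sentence in split_sentences(profile):
        print(classify(sentence), "<-", sentence)
```

In this sketch, a sentence rich in un-negated medical terms becomes a candidate for expansion, while a sentence dominated by negated or non-medical words becomes a candidate for reduction. How the two classes then change the BM25 score is not detailed in the abstract, so it is left out.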
Keywords/Search Tags: Query Reformulation, Medical Data Search, Clinical Decision Support, Query Expansion, Query Reduction