Clinical Medicine Literature Retrieval Research Based On Electronic Health Record

Posted on: 2019-12-29  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang
GTID: 2428330548467495  Subject: Computer application technology
Abstract/Summary:
With the spread of medical information systems, large numbers of electronic health records (EHRs) have accumulated, and these records faithfully describe patients' clinical manifestations. Unlike most earlier work, which used synthetic data as queries, this thesis uses EHRs as queries and studies information retrieval models suited to helping doctors find medical literature faster and more accurately. The task has drawn considerable attention from the information retrieval and biomedical informatics communities, and was set up as a task in the TREC competition in 2016 and 2017. Medical retrieval has long been an active topic in information retrieval.

The real clinical data used here have several characteristics. The EHR queries come in three types (summary, description and note) and vary in length. Both the medical data sets and the query topics contain many proper nouns and abbreviations. Because doctors write clinical notes under time pressure, the notes are often irregular and incomplete in format and content. Traditional document length normalization, however, relies on a fixed parameter: a very small value works better for short queries, while a very large value favors long queries. In addition, existing pseudo-relevance feedback models do not simultaneously take into account the importance of a candidate term in the feedback documents and the co-occurrence relationship between the candidate term and the query terms.

In view of this, the main contributions of this thesis are as follows. First, based on the probabilistic model, we replace the conventional fixed-value parameter tuning with a dynamic function. The function must satisfy three conditions: (1) when the query contains only one word, documents containing that query term can still be retrieved; (2) the function value decreases as the query length increases; (3) the function is bounded. We also adopt a new concept, the average specific group frequency, which lets the new model differentiate between words.

Second, we build on the observation that terms with a higher degree of co-occurrence with a query term are more likely to be related to the query topic. We use the Hyperspace Analogue to Language (HAL) model with a fixed-size window to calculate the weight of each candidate expansion term in the window and the weights of the words adjacent to the terms of the initial user query. We study how to incorporate proximity information into Rocchio's model and propose a HAL-based Rocchio model, called HRoc. Finally, we use a normalization method to score the candidate query terms and select the top-scoring candidates as expansion terms to better capture the user's query intent.

The improved models were evaluated in extensive experiments on the TREC Clinical Decision Support track data set. The results show that the proposed methods are feasible and effective on most evaluation criteria.
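The abstract does not give the HRoc formulas, so the following is only a minimal sketch of the general idea of combining HAL-style proximity weights with a Rocchio-style query update. The function names (`hal_weights`, `normalize`, `expand_query`), the linear window weighting `window - d + 1`, the max-normalization, the product used to combine importance and proximity, and the default values of `alpha`, `beta`, `window` and `k` are all assumptions made for illustration, not the thesis's actual model.

```python
from collections import Counter, defaultdict


def hal_weights(tokens, query_terms, window=5):
    """Accumulate HAL-style proximity weights for non-query terms.

    Assumption: a candidate at distance d (1 <= d <= window) from a query
    term contributes window - d + 1, so closer words weigh more.
    """
    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok not in query_terms:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i or tokens[j] in query_terms:
                continue
            scores[tokens[j]] += window - abs(i - j) + 1
    return scores


def normalize(scores):
    """Scale scores into [0, 1] by dividing by the maximum (assumed scheme)."""
    top = max(scores.values(), default=0.0)
    return {t: (s / top if top else 0.0) for t, s in scores.items()}


def expand_query(query, feedback_docs, window=5, alpha=1.0, beta=0.75, k=3):
    """Rocchio-style expansion where candidate weights are scaled by proximity.

    query: list of query tokens.
    feedback_docs: list of token lists (the pseudo-relevant documents).
    Returns a dict mapping term -> weight for the expanded query.
    """
    query_terms = set(query)
    tf = Counter()
    prox = defaultdict(float)
    for doc in feedback_docs:
        tf.update(t for t in doc if t not in query_terms)
        for term, score in hal_weights(doc, query_terms, window).items():
            prox[term] += score

    n_docs = len(feedback_docs) or 1
    importance = normalize({t: c / n_docs for t, c in tf.items()})
    proximity = normalize(prox)

    # Hypothetical combination: importance in feedback docs times proximity.
    candidates = {
        t: beta * importance[t] * proximity.get(t, 0.0) for t in importance
    }
    top = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

    expanded = {t: alpha for t in query_terms}
    expanded.update({t: w for t, w in top if w > 0})
    return expanded


if __name__ == "__main__":
    docs = [
        "patient reports chest pain radiating to left arm".split(),
        "acute chest pain with dyspnea and nausea".split(),
    ]
    print(expand_query(["chest", "pain"], docs, window=3, k=3))
```

In this sketch a candidate term needs both a high frequency in the feedback documents and a position close to a query term to be selected, which mirrors the two factors the abstract says existing pseudo-relevance feedback models fail to consider together.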
Keywords/Search Tags: EHR, Document Ranking, Query Expansion, Word Proximity, Clinical Medicine Retrieval