
Design And Implementation Of Data Verification Method For Cohort Study

Posted on: 2019-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Liu
Full Text: PDF
GTID: 2348330545986353
Subject: Biomedical engineering
Abstract/Summary:
Cohort studies are an internationally adopted method for exploring the etiology of common major diseases. Because of human factors and problems in cohort management information systems, cohort data may contain errors or omissions at different stages of data collection. Traditionally, manual verification is carried out at a single stage, which cannot cover all of these errors and omissions and also costs considerable time and labor. To address this challenge, this thesis uses form recognition and unstructured information extraction from electronic medical records (EMR) to design and implement an automated data verification methodology for cohort studies. The specific contents are as follows.

1. Based on an investigation and analysis of form recognition technologies, an automatic data verification method based on the case report form (CRF) is designed. The CRF structure is recognized by constructing a description-language-based model, and machine learning techniques are used to recognize the tick marks and handwritten digits in the CRF. The two steps are applied in sequence to automatically verify registry data from a specific cohort study. The experimental results show a precision, recall and F1 score of 79.06%, 89.04% and 83.75%, respectively.

2. An EMR-based automatic data verification method is designed, building on a summary and analysis of state-of-the-art information extraction approaches for EMR. Specifically, a rule-based method is developed to extract the relevant information from a large volume of EMR, and the extracted results are then used to verify cohort data automatically. The experimental results show a precision, recall and F1 score of 89.06%, 92.43% and 90.71%, respectively.

3. Because of limitations of the two methods above, this thesis proposes a collaborative verification method based on multi-source data, for which a collaborative verification model is designed. This model defines rules from the perspectives of data existence, consistency and credibility, and uses them to merge the results obtained from the CRF and EMR data sources into final results. The proposed model is applied to a specific cohort dataset, and the experimental results show that verification performance is significantly improved (precision, recall and F1 score of 93.29%, 96.14% and 94.69%, respectively).

4. Based on the proposed collaborative verification method, a data verification function is developed and implemented in the cohort management information system to support the cohort data verification workflow.
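
As an illustration of item 3, merging CRF and EMR results by existence, consistency and credibility rules might be sketched in Python as below. This is only a minimal sketch under stated assumptions: the abstract does not give the actual rules, so the names `Extracted` and `verify_field`, the per-source confidence scores, and the `min_confidence` threshold are all hypothetical. The helper `f1_score` is the standard harmonic mean of precision and recall, and can be used to check that the reported figures are mutually consistent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extracted:
    """A value recovered from one source, with a hypothetical confidence in [0, 1]."""
    value: str
    confidence: float


def verify_field(recorded: str,
                 crf: Optional[Extracted],
                 emr: Optional[Extracted],
                 min_confidence: float = 0.8) -> str:
    """Return 'ok', 'error' or 'unverified' for one recorded cohort field.

    All rules here are illustrative assumptions, not the thesis's rules.
    """
    # Existence rule: only sources that actually produced a value take part.
    sources = [s for s in (crf, emr) if s is not None]
    if not sources:
        return "unverified"

    if len(sources) == 2 and crf.value == emr.value:
        # Consistency rule: two agreeing sources are trusted as the reference.
        reference = crf
    else:
        # Credibility rule: otherwise prefer the more confident source,
        # and give up if even that source is not confident enough.
        reference = max(sources, key=lambda s: s.confidence)
        if reference.confidence < min_confidence:
            return "unverified"

    return "ok" if reference.value == recorded else "error"


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Both sources agree with each other and with the recorded value.
    print(verify_field("42", Extracted("42", 0.6), Extracted("42", 0.7)))
    # Sources disagree; the more confident one contradicts the record.
    print(verify_field("42", Extracted("47", 0.9), Extracted("42", 0.5)))
    # Consistency check on the reported collaborative-verification scores.
    print(round(f1_score(93.29, 96.14), 2))
```

In this sketch a field is flagged as an error only when a trusted reference value contradicts the recorded one. Fields with no usable source stay "unverified" rather than being counted as correct.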
Keywords/Search Tags: Cohort Study, CRF, EMR, Collaborative Verification