
Research On The Structuration And Standardization Of Laboratory Test Results For Medical Big Data

Posted on: 2021-05-05 | Degree: Master | Type: Thesis
Country: China | Candidate: T Yang
GTID: 2404330605974937 | Subject: General surgery
Abstract/Summary:
Objective: In recent years, artificial intelligence has made unprecedented progress in medicine and has important application value for the diagnosis and treatment of disease. Its development depends on the support of big data. Electronic medical records, an important source of medical big data, contain a large amount of diagnosis and treatment knowledge and patient health data. Among these data, laboratory tests are an important part of clinical information and play a key role in physicians' diagnosis and treatment decisions. However, this information is embedded in unstructured medical text, which makes it very difficult for computers to interpret. The wide variety of ways in which laboratory test results are stated also makes structuring and standardization challenging. In addition, medical ontologies mainly consist of single terms, whereas a laboratory test result is generally composed of three parts, "specimen, analyte, abnormality", so there is a structural and semantic gap between laboratory test results and medical ontologies. To address these problems, this study explored building a UMLS-encoded laboratory test knowledge base and developing corresponding algorithms that transform the unstructured laboratory test results in electronic medical records into structured, standardized terms, which lays the foundation for follow-up research.

Methods: (1) Build the knowledge base. A knowledge base of laboratory test results based on UMLS terms was constructed from relevant English-language laboratory test name resources. (2) First, laboratory test results in free text are transformed into a structured "specimen-analyte-abnormality" triplet. Then this logical expression of the laboratory test result is transformed into a UMLS term expression through the knowledge base. Finally, electronic medical records acquired from the Internet were used as the corpus to evaluate the effectiveness of the knowledge base and the corresponding algorithms.

Results: We mapped 453 laboratory tests to 2,242 UMLS terms, of which 72.6% were quantitative tests and 27.4% were qualitative tests. In addition, we collected 966 electronic medical records covering 26 different departments. With expert annotation as the gold standard, 12,949 laboratory test results were annotated, including 10,585 quantitative tests and 2,364 qualitative tests. (1) When the expert-annotated gold-standard results were standardized, the precision, recall and F1 score were 1.000, 0.731 and 0.845, respectively. (2) The cases were first pre-processed and output by the algorithm as 11,219 laboratory results in structured triplet format, and these structured data were then standardized into UMLS encoding. There were 7,262 true positives, and the precision, recall and F1 score were 0.647, 0.767 and 0.701, respectively. (3) We used 210 cases to evaluate 21 different departments; the algorithm scored highest for general surgery, with F1 scores of 0.933 and 0.833, respectively.

Conclusions: In this research, we built a knowledge base that maps logical-expression phenotypes to term-expression phenotypes. Based on this knowledge base, we developed an algorithmic tool that automatically structures and standardizes the laboratory test results in cases. The knowledge base and algorithm can successfully transform unstructured laboratory test results into structured and standardized terminology. This plays an important role in computer understanding of laboratory test results and in the secondary use of electronic medical records, such as clustering patients with the same characteristics, machine learning and medical artificial intelligence. In addition, the knowledge base lays a foundation for building an ontology of laboratory test results in the future.
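The two-step transformation described in the Methods (free text to a "specimen-analyte-abnormality" triplet, then triplet to a UMLS term) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class `LabTriplet`, the helper functions, the reference range, and the knowledge-base entries are all hypothetical, and the identifiers are placeholders rather than real UMLS codes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabTriplet:
    """Structured form of one laboratory test result (hypothetical type)."""
    specimen: str
    analyte: str
    abnormality: str


# Hypothetical knowledge base: the values are placeholders, NOT real UMLS CUIs.
KNOWLEDGE_BASE = {
    LabTriplet("blood", "glucose", "high"): "CUI_PLACEHOLDER_HIGH_BLOOD_GLUCOSE",
    LabTriplet("urine", "protein", "positive"): "CUI_PLACEHOLDER_URINE_PROTEIN_POSITIVE",
}


def classify_quantitative(value: float, low: float, high: float) -> str:
    """Turn a numeric result into an abnormality label using a reference range."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def standardize(triplet: LabTriplet) -> Optional[str]:
    """Look up the term for a triplet; None means the knowledge base has no entry."""
    return KNOWLEDGE_BASE.get(triplet)


# Quantitative example: the reference range 3.9-6.1 is an illustrative assumption.
label = classify_quantitative(11.2, low=3.9, high=6.1)
quantitative = LabTriplet("blood", "glucose", label)

# Qualitative example: the abnormality is stated directly in the text.
qualitative = LabTriplet("urine", "protein", "positive")

print(standardize(quantitative))
print(standardize(qualitative))
```

In this sketch a triplet with no knowledge-base entry returns `None`. Results that cannot be mapped in this way would count against recall rather than precision.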
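The reported metrics can be cross-checked with the standard definitions of precision and F1. The short Python sketch below uses only the counts and scores stated in the Results. The function names are illustrative, and the gold-standard denominator behind the recall values is not given in the abstract, so recall is taken as reported rather than recomputed.

```python
def precision(true_positives: int, predicted: int) -> float:
    """Fraction of the algorithm's outputs that are correct."""
    return true_positives / predicted


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# Algorithm evaluation: 7,262 true positives among 11,219 structured outputs.
algo_precision = precision(7262, 11219)

# F1 recomputed from the reported (rounded) precision and recall values.
gold_f1 = f1_score(1.000, 0.731)
algo_f1 = f1_score(0.647, 0.767)

print(f"algorithm precision: {algo_precision:.3f}")
print(f"gold-standard F1:    {gold_f1:.3f}")
print(f"algorithm F1:        {algo_f1:.3f}")
```

The recomputed precision and gold-standard F1 agree with the reported 0.647 and 0.845. The algorithm F1 recomputed from the rounded inputs is within 0.001 of the reported 0.701, a difference consistent with rounding.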
Keywords/Search Tags: medical big data, electronic medical record, information extraction, data standardization, natural language processing