
Visualization Analysis Of Data Quality And Common Adverse Birth Outcomes In Electronic Medical Records

Posted on: 2024-07-16  Degree: Master  Type: Thesis
Country: China  Candidate: H R Li
GTID: 2544307166462534  Subject: Electronic information
Abstract/Summary:
Preterm birth, macrosomia, and low birth weight are common adverse birth outcomes that not only affect perinatal morbidity and mortality but also have long-term effects on newborn health. However, because the pathogenesis of these outcomes is unclear, it is difficult for obstetricians to identify potential high-risk groups early in clinical diagnosis. Exploring the high-risk factors for preterm birth, macrosomia, and low birth weight can therefore help obstetricians locate groups at risk of adverse pregnancy outcomes and establish effective early warning mechanisms. In addition, because the number of prenatal examinations differs among pregnant women, and because of data entry errors and non-standard data representation, the completeness, consistency, and accuracy of prenatal examination data are difficult to guarantee, which seriously hinders effective use of the data. It is therefore necessary to evaluate data quality so that users can understand the problems in the data and take appropriate measures to improve it. The main work of this thesis is divided into the following two parts:

(1) Data Quality Analysis. To evaluate data quality, we constructed a data quality evaluation model, OR_DQEM, for obstetric electronic medical records (EMR). The model comprises five collections and uses completeness, consistency, and accuracy as its evaluation dimensions. We then assigned a weight to each selected dimension using an improved entropy method and calculated the final evaluation result from the dimension weights. For the data quality visualization, we constructed views for the three dimensions. First, a missing-data chart displays the overall missingness of the dataset. Next, a word cloud displays inconsistencies in the textual data. Finally, box plots, statistical methods, and model-based methods are used to handle outliers. After the data quality analysis, we manually corrected the data based on the evaluation results and the visual views to provide a sound data foundation for subsequent research.

(2) Visualization Analysis of Common Adverse Birth Outcomes. Taking preterm birth, macrosomia, and low birth weight as the research objects, this thesis selects suitable visual views and data analysis techniques to build an EMR-based visual analysis system. The design considers two kinds of data: initial data and pregnancy data. For the initial data, we used a two-stage feature selection algorithm based on Relief-RFE to select the features that affect preterm birth, macrosomia, and low birth weight, and built a feature distribution sunburst chart that shows the highly influential features according to the selection results. We then used parallel coordinates to display the distribution intervals of the feature values. To distinguish between groups of pregnant women, we applied t-SNE and K-means++ to reduce the dimensionality of the selected dataset and cluster it, grouped the data by the clustering results, and refined the feature value distribution chart according to the resulting groups. For the pregnancy data, we used a hybrid feature selection algorithm based on RF-XGBoost to calculate the weight of each feature at different gestational weeks, and then selected features by their total weight across gestational weeks to construct a distribution chart of gestational-period feature weights. Based on the characteristics of the selected features, we divided them into rising and fluctuating groups and built a chart of the number of pregnant women with normal and abnormal feature values, which shows how these counts change for each prenatal examination indicator during pregnancy.

Finally, the effectiveness of the visualization design was verified through case studies of preterm birth, macrosomia, and low birth weight.
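
The dimension-weighting step in part (1) can be illustrated with the standard entropy weight method. The abstract does not say how the thesis improves on it, so the sketch below shows only the textbook version and is not OR_DQEM itself. The function names, the layout of rows as collections and columns as quality dimensions, and the example scores are all hypothetical.

```python
import math


def entropy_weights(scores):
    """Standard entropy weight method (not the thesis's improved variant).

    scores: list of rows, one row per collection; each row holds one
    non-negative score per quality dimension.
    Returns one weight per dimension; the weights sum to 1.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two rows to compute entropy")
    m = len(scores[0])
    k = 1.0 / math.log(n)  # scales each column's entropy into [0, 1]

    divergences = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        entropy = 0.0
        if total > 0:
            for x in column:
                p = x / total  # share of this row within the column
                if p > 0:
                    entropy -= k * p * math.log(p)
        else:
            entropy = 1.0  # an all-zero column carries no information
        # A column whose values barely differ has entropy near 1,
        # so its divergence (and hence its weight) is near 0.
        divergences.append(max(0.0, 1.0 - entropy))

    d_sum = sum(divergences)
    if d_sum < 1e-12:
        return [1.0 / m] * m  # no column discriminates: equal weights
    return [d / d_sum for d in divergences]


def weighted_score(row, weights):
    """Combine one collection's dimension scores into a single value."""
    return sum(x * w for x, w in zip(row, weights))


# Hypothetical scores: five collections x (completeness, consistency, accuracy).
example_scores = [
    [0.95, 0.80, 0.90],
    [0.90, 0.60, 0.88],
    [0.97, 0.40, 0.91],
    [0.92, 0.90, 0.89],
    [0.94, 0.70, 0.90],
]
example_weights = entropy_weights(example_scores)
example_totals = [weighted_score(r, example_weights) for r in example_scores]
```

In this made-up example the consistency column varies most across collections, so it receives the largest weight. This is the general behaviour of entropy weighting: dimensions that discriminate more between collections count for more in the final score.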
Keywords/Search Tags: Adverse birth outcomes, Data quality assessment, Feature selection, Visual analysis