Research On Risk Assessment Model Of Cardiovascular Disease Based On EMR Data Analysis

Posted on:2020-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:W W Yang

Full Text:PDF

GTID:2404330590453162

Subject:Software engineering

Abstract/Summary:

Cardiovascular disease is a major public health problem. It is characterized by insidious onset, a long latency period, and difficulty of cure once it has developed. Relevant studies have shown that the occurrence of cardiovascular disease is closely related to its risk factors. By detecting and preventing these risk factors early and by establishing appropriate disease risk assessment models, cardiovascular disease can be effectively prevented and controlled. Numerous studies on cardiovascular disease risk assessment exist at home and abroad, but most of them rely on questionnaire data, literature data and established risk factors, and therefore cannot fully capture the risk factors of cardiovascular disease. Most are also built on medical statistical methods, which have some limitations.

In recent years, with the continuous advancement of medical informatization, medical information systems such as electronic medical records (EMR) have developed rapidly. EMR data contains not only the patient’s test indicators but also a large amount of hidden and valuable information, providing a new data source for cardiovascular disease risk assessment research. In addition, the continuous improvement of data mining algorithms has gradually increased their application in disease risk assessment, making up for the shortcomings of model construction with traditional statistical methods. The rational use of data mining algorithms to explore the latent rules and patterns in EMR data is therefore of great value for the early prevention and treatment of cardiovascular disease. This paper studies a cardiovascular disease risk assessment model using EMR data mining, taking hypertensive patients as the research object. The main contents and achievements of this paper are as follows:

(1) A series of preprocessing operations was carried out to deal with the inconsistency and incompleteness of the hypertension EMR data sets, in order to provide clean and effective data for the data analysis algorithms. This preprocessing process has some reference value for preprocessing EMR data of other diseases.

(2) To address attribute redundancy and multicollinearity in the hypertension EMR data set, risk factor screening was performed with principal component analysis from statistics. Twenty main risk factors for hypertension were screened from more than 50 test items, which effectively reduced the complexity of model construction.

(3) To address the problems of statistical methods in model construction, a cardiovascular disease risk assessment model was built with the C5.0 decision tree algorithm. During model building, boosting was adopted to improve the robustness of the model, and ten-fold cross-validation was used to improve its reliability. The decision tree was also pruned to avoid over-fitting. Compared with the model before pruning, prediction accuracy increased from 65.18% to 73.21%.
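The ten-fold cross-validation step mentioned in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: C5.0 with boosting and pruning is replaced here by a one-level decision tree (a "stump") so that the example needs only the Python standard library, the data is synthetic, and all function names (`k_fold_indices`, `train_stump`, `predict`, `cross_validate`) are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def train_stump(rows, labels):
    """Fit a one-level decision tree (stand-in for C5.0, an assumption).

    Tries every (feature, threshold) pair and keeps the one with the fewest
    training errors. The stump predicts 1 when row[feature] > threshold.
    """
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            errors = sum((row[feature] > threshold) != bool(label)
                         for row, label in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    _, feature, threshold = best
    return feature, threshold

def predict(stump, row):
    """Return the predicted class (0 or 1) for a single row."""
    feature, threshold = stump
    return int(row[feature] > threshold)

def cross_validate(rows, labels, k=10, seed=0):
    """Return the mean held-out accuracy over k folds."""
    folds = k_fold_indices(len(rows), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        stump = train_stump([rows[i] for i in train_idx],
                            [labels[i] for i in train_idx])
        correct = sum(predict(stump, rows[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

if __name__ == "__main__":
    # Synthetic data (an assumption, not EMR data): two numeric "test items",
    # with the label determined by the first one exceeding 130.
    rng = random.Random(42)
    rows = [[rng.gauss(120, 15), rng.gauss(5, 1)] for _ in range(200)]
    labels = [int(row[0] > 130) for row in rows]
    print("mean 10-fold accuracy:", cross_validate(rows, labels, k=10))
```

Each sample is held out exactly once, so the averaged accuracy reflects performance on data the model did not see during fitting. This is the sense in which cross-validation makes an accuracy figure more trustworthy than a single train/test split.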

Keywords/Search Tags:

EMR data, cardiovascular diseases, screening of risk factors, data analysis, C5.0 algorithm

Related items

1	Research On Intelligent Data Analysis Model And Algorithm For Cardiovascular Diseases
2	Analysis And Evaluation Of The Risk Factors And Novel Screening Models For Osteoporosis Based On Big Health Data
3	Clinical Evaluation Research On The Treatment Of Cardiovascular Diseases By The Method Of Replenishing Qi And Activating Blood Based On Meta-analysis And Data Mining
4	Research On Risk Assessment Model Of Cardiovascular Diseases Based On Data Mining
5	Analysis And Research Of Tumor Mode Based On Medical Big Data
6	Epidemiological Characteristics And Risk Factors Of Infectious Diseases In Shandong Province Based On Medical Big Data
7	Design And Implementation Of Data Processing And Analysis Of Rehabilitation Equipment Based On Big Data
8	Changes In Cardiovascular Risk Factors And Prediction Of Cardiovascular Diseases Risk In Chinese Population In 2021-2030
9	Research On Disease Gene Mining Algorithm Based On Data Fusion
10	Analysis Of Influencing Factors And Risk Assessment Of Gastrointestinal Diseases Among Middle-aged And Elderly Chinese Based On Data Warehouse