
Application Research On Electrocardiogram Diagnosis Based On SMOTE+ENN And Random Forest

Posted on: 2020-08-10  Degree: Master  Type: Thesis
Country: China  Candidate: Z W Shang  Full Text: PDF
GTID: 2404330596498359  Subject: Software engineering
Abstract/Summary:
The integration of artificial intelligence with the medical industry continues to deepen. Against this background of AI research, development and industrialization, this paper applies artificial intelligence to clinical decision support for diagnosis. In the electrocardiogram (ECG) field studied here, many machine learning methods already exist for detecting heart disease and arrhythmia, including Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Decision Trees (DT) and other algorithms. Many of these methods classify ECG data from public data sets and achieve good results, but they lack corresponding medical interpretability. Moreover, real-world data raise further problems: uneven data distribution, inconsistent annotation formats, missing labels, and low accuracy of traditional classifiers.

To address these problems, this paper uses the combined SMOTE+ENN algorithm to correct the imbalanced distribution of real-world data. To address the inconsistent labels, a data label library is built for a well-known hospital in Shanghai, using the expert annotations in the MIT-BIH Arrhythmia Database (MITDB) and background and expert knowledge from ECG and related medical fields as reference. To address the low accuracy of traditional machine learning algorithms on real-world data, the paper proposes an algorithm based on random forest (RF), in which the RF is adjusted and optimized into an advanced random forest (ARF). Because the prediction rate reported by the random forest is not accurate, the paper uses accuracy on out-of-bag (OOB) data as the evaluation metric instead.

Finally, the ARF model is applied to the hospital's ECG data and achieves good classification performance. The experimental results show an OOB accuracy of 96.45% on the hospital's binary classification data set and 96.62% on its multi-class data set, both above 96%. This verifies the usability of the ARF model on real-world data.

In addition, this paper pays special attention to combining computer science with medical background knowledge, which runs through the entire study. It is reflected in three ways: 1. ECG domain knowledge guides feature extraction, so that the key features for disease diagnosis are extracted; 2. Medical knowledge guides the construction of the data label library, which contains 17 label types for classification; 3. Medical knowledge is combined with the computer-assisted diagnosis classification model to enhance medical interpretability. The study therefore focuses not only on data preprocessing and on improving classification accuracy and performance, but also on the corresponding medical interpretability, with the aim of integrating more closely with medical practice and applying the result to AI-based clinical auxiliary diagnosis.
Keywords/Search Tags:Electrocardiogram(ECG), SMOTE+ENN, Data imbalance, Label generation, Random Forest
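
The SMOTE+ENN resampling step named in the abstract can be illustrated with a minimal standard-library Python sketch. This is not the thesis's implementation: the function names (`smote`, `enn`, `smote_enn`), the neighbour count `k=3`, the Euclidean distance, and the toy data are all assumptions made for illustration. SMOTE is sketched here as interpolation between a minority sample and one of its same-class nearest neighbours, and ENN as Wilson's rule of dropping any sample whose label disagrees with the majority of its k nearest neighbours.

```python
import random
from collections import Counter


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: dist2(points[idx], points[j]))
    return others[:k]


def smote(X, y, k=3, rng=None):
    """Oversample each smaller class up to the size of the largest class."""
    rng = rng or random.Random(0)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # already the largest class, or nothing to interpolate with
        kk = min(k, len(members) - 1)
        for _ in range(target - n):
            i = rng.randrange(len(members))
            j = rng.choice(nearest(i, members, kk))
            gap = rng.random()  # position along the segment between the two samples
            synthetic = [a + gap * (b - a) for a, b in zip(members[i], members[j])]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


def enn(X, y, k=3):
    """Keep only samples whose label matches the majority of their k neighbours.

    With more than two classes a vote can tie; ties resolve to the label
    seen first among the neighbours.
    """
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_s, y_s = smote(X, y, k, random.Random(seed))
    return enn(X_s, y_s, k)


# Toy data (hypothetical): 20 majority points on a grid, 5 distant minority points.
X_demo = [[i, j] for i in range(5) for j in range(4)]
X_demo += [[10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]]
y_demo = [0] * 20 + [1] * 5
X_bal, y_bal = smote_enn(X_demo, y_demo)
print("before:", Counter(y_demo), "after:", Counter(y_bal))
```

The brute-force neighbour search keeps the sketch short but scales quadratically with the number of samples, so a real ECG data set would call for an indexed nearest-neighbour search or an established resampling library.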