Cardiovascular Disease Prediction Model Based On Dual-channel Hybrid Neural Network

Posted on: 2023-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: P Chen
GTID: 2544307145968129
Subject: Software engineering
Abstract/Summary:
The number of patients with chronic diseases in China continues to grow, and prevention and control efforts face great challenges. The incidence and mortality of chronic diseases are much higher than those of cancers, tumors, and other diseases, so accurate prediction of chronic diseases is of great significance for their prevention, control, and treatment. In the field of chronic disease prediction, this paper proposes two models that address different problems. The traditional logistic regression model is prone to underfitting and depends heavily on the training data, so this paper first proposes an improved logistic regression model (ILRM). However, ILRM, like decision tree prediction models, depends on feature engineering, and the quality of the feature engineering directly affects the model's results. Neural network models such as deep neural networks (DNN), in turn, blindly stack network layers in disease prediction, which results in too many parameters and a lack of inductive bias. To solve these problems, a dual-channel tabular hybrid neural network model (DTHNNM) is built on a convolutional neural network and the attentive interpretable tabular learning network (TabNet) to model and predict chronic diseases, with experiments carried out on cardiovascular disease as an example. The main work and innovations of this paper are as follows.

(1) Improved logistic regression model (ILRM): To solve the problem that logistic regression models are prone to underfitting when trained on large datasets, an improved logistic regression model is proposed. First, in the feature engineering phase, the dataset is normalized to improve the model's feature extraction capability, and the correlation coefficients between the attributes and the target value are calculated. Second, to train the optimal model quickly and reduce computational cost, a logistic regression model is used to find the optimal threshold. Finally, to solve the underfitting problem, K-nearest neighbor (KNN) replaces the classifier of the logistic regression model; the optimal threshold is passed into the KNN, and the model is then trained iteratively. Experiments show that ILRM performs well in terms of accuracy, recall, and F1 score.

(2) Dual-channel tabular hybrid neural network model (DTHNNM): To reduce the feature engineering workload and to avoid the drawbacks inherent in stacking multiple modules in neural network models (an excessive number of parameters, a lack of appropriate inductive bias, and weak interpretability), this paper proposes a dual-channel tabular hybrid neural network model. First, the embedding layer of the model treats numerical and categorical data separately and converts the categorical variables into a more meaningful vector-space representation by changing their dimensionality. Second, to reduce the number of parameters, the model uses a shallow convolutional neural network (CNN) for local feature extraction and, in the TabNet channel, sparse feature selection with sequential attention to provide a better inductive bias, while stepwise accumulation is used to enhance the interpretability of the model. Finally, the model fuses the feature vectors extracted by the CNN channel and the TabNet channel, passes the fused vector into a fully connected layer, and then uses the Softmax function for disease prediction. Experiments show that, compared with traditional classification models and neural network models, DTHNNM enhances feature extraction and improves prediction metrics such as accuracy, recall, F1 score, and AUC.
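The ILRM steps described above (normalization, attribute-target correlation, and a KNN classifier that receives a decision threshold) can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. The abstract does not say how the threshold enters the KNN, so the sketch assumes it is a cutoff on the fraction of positive neighbors; the function names, the choice of min-max scaling and Pearson correlation, and the toy columns (age, systolic blood pressure) are likewise assumptions made for illustration.

```python
import math


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has span 0; use 1.0 instead to avoid division by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)]
            for row in rows]


def pearson(xs, ys):
    """Pearson correlation between one attribute column and the target."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def knn_positive_fraction(train_x, train_y, x, k):
    """Fraction of the k nearest training points whose label is 1."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], x))
    return sum(train_y[i] for i in order[:k]) / k


def predict(train_x, train_y, x, k=3, threshold=0.5):
    """Assumed rule: predict 1 when the positive fraction reaches threshold."""
    return int(knn_positive_fraction(train_x, train_y, x, k) >= threshold)


# Hypothetical toy data: [age, systolic blood pressure], label 1 = disease.
raw = [[40, 110], [45, 120], [60, 160], [65, 170]]
labels = [0, 0, 1, 1]

features = min_max_normalize(raw)
age_correlation = pearson([row[0] for row in raw], labels)
prediction = predict(features, labels, [0.9, 0.9], k=3, threshold=0.5)
```

In this reading, the threshold would come from a separately fitted logistic regression model, as the abstract describes; here it is simply passed in as a parameter, and raising it makes the classifier more conservative about predicting the positive class.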
Keywords/Search Tags: Cardiovascular Disease, Logistic Regression, Convolutional Neural Networks, TabNet, Predictive Models