
The Application Research of Support Vector Machine in Diabetes Data

Posted on: 2018-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: L Hu
Full Text: PDF
GTID: 2334330518996118
Subject: Electronics and Communications Engineering
Abstract/Summary:
As medical informatization deepens, more intelligent methods of collecting medical data continue to emerge. With data volumes growing rapidly, more effective ways of processing and analyzing the data are needed to extract its real value. Data mining is an effective and widely used approach, and the support vector machine (SVM) is an important method within it for extracting value from medical data. As its application scenario, this thesis selects diabetes, a common disease in modern society, as the object of algorithm study. As a chronic disease, diabetes places a heavy economic and psychological burden on individuals and families through both health stress and financial cost. Early detection and prevention of diabetes can therefore not only improve the quality of life of diabetic patients but also reduce the health risks and costs that result from treating late-stage nephropathy.

Based on the above, this thesis studies diabetes and diabetic nephropathy and constructs a diabetic nephropathy screening model. Data were collected from 2612 healthy people and 994 diabetic patients in 10 cities; the diabetic patients comprise 109 with diabetic nephropathy and 885 without. These data form the original dataset used to train the screening model. The main contents are as follows:

(1) Existing medical research methods, diagnostic indicators, and high-risk factors are reviewed. The raw data undergo cleaning, missing-value processing, normalization, and other preprocessing. Samples are labeled as positive or negative by text matching. A comparison of the data distributions of diabetic patients and healthy people shows that the indicator values of the groups overlap to some degree, which makes it difficult to judge the possibility of diabetic nephropathy from the value of a single indicator. ROC curves are drawn from the original data, and 12 input indicators are selected to construct the input vector of the screening model. Applying the SVM classification algorithm to the prediction of diabetic nephropathy is a relatively new research field.

(2) A classification model for diabetic nephropathy is constructed with the SVM classification algorithm. The separating hyperplane is trained on the training data, and the classification accuracy obtained with normalized data is compared with that obtained with non-normalized data. The experimental results show that normalization improves classification accuracy. Grid search is then used to further improve the accuracy of the screening model; the experimental results show that the resulting classification accuracy is better than that of random parameter selection. However, the search range is large and the step size must be chosen according to the specific data, so a more intelligent parameter optimization method is needed.

(3) The particle swarm optimization (PSO) algorithm is used to optimize the penalty parameter C and the kernel function parameter. The sensitivity of different kernel functions to PSO is compared. The cross-validation accuracy of PSO-RBF-SVM is 94.89%, which indicates that the model has practical value and application prospects.
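
The parameter search described in (3) can be illustrated with a minimal global-best PSO sketch in Python. This is not the thesis's implementation: the function names `pso_maximize` and `toy_cv_accuracy`, the swarm settings (inertia and acceleration coefficients), and the log2-scaled search ranges for C and the RBF kernel parameter are all assumptions made for illustration. The toy objective merely stands in for the cross-validated accuracy of an RBF-kernel SVM, which in practice would be computed by training and validating the classifier at each candidate position.

```python
import random


def pso_maximize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize objective(position) inside a box with basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and score so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_accuracy(position):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Peaks at log2(C) = 3 and log2(kernel parameter) = -5; these values are
    arbitrary and chosen only so the sketch has a known optimum.
    """
    log2_c, log2_kernel = position
    return 1.0 - 0.01 * ((log2_c - 3.0) ** 2 + (log2_kernel + 5.0) ** 2)


# Assumed search ranges on a log2 scale: C in [2^-5, 2^15], kernel parameter in [2^-15, 2^3].
search_bounds = [(-5.0, 15.0), (-15.0, 3.0)]
best_pos, best_val = pso_maximize(toy_cv_accuracy, search_bounds)
best_c, best_kernel_param = 2.0 ** best_pos[0], 2.0 ** best_pos[1]
```

Searching on a log2 scale is a common choice for SVM parameters because useful values span several orders of magnitude. Unlike grid search, the swarm does not need a fixed step size, which is the limitation noted in (2).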
Keywords/Search Tags: support vector machines, diabetic nephropathy, ROC curve, diabetic nephropathy screening model, grid search, particle swarm