
Research On Data Mining And Visualization Of Maternal And Child Health Care Disease

Posted on: 2017-08-27  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang  Full Text: PDF
GTID: 2348330488487605  Subject: Software engineering
Abstract/Summary:
Medical science is directly related to people's lives and health. In the "Internet+" era, exploring the future of medical science through medical data, and in particular obtaining information quickly from huge data resources to support medical workers, is a practical problem that urgently needs study.

From the data processing point of view, medical data is heterogeneous, complex and privacy-sensitive. It contains a large amount of incomplete and inconsistent "dirty data", which strongly affects the accuracy of data mining results. We therefore propose a noise-elimination method based on nearest-neighbor global similarity, which can effectively improve the accuracy of data mining.

From the data mining point of view, hidden, credible and useful information must be extracted from massive medical databases to give medical workers a scientific basis for their decisions. The support vector machine (SVM) is a relatively new data mining technique that solves machine learning problems by means of optimization methods, and it shows many advantages in small-sample, high-dimensional and nonlinear pattern recognition. However, when the sample dimension is high and dirty data interferes, training of the traditional SVM slows down and its classification performance drops. To address this problem, an improved algorithm is proposed: an incremental support vector machine weighted by sample correlation.

The thesis first introduces the research background, its significance, and the current state of research in China and abroad, and outlines the basic medical data mining process and its main technologies. The design of data visualization is described and the basic visualization charts are illustrated. It then introduces the concepts and methods of data preprocessing. Building on the shared nearest neighbor similarity algorithm, a noise-elimination algorithm based on global similarity is proposed, which decides whether a sample is an isolated point according to the similarity formed by the neighbors it shares with other samples. In experiments comparing the two algorithms, the proposed algorithm achieves a higher recall rate and eliminates noise data more effectively.

Next, support vector machine technology is introduced, the linearly separable and the nonlinear binary classification problems are described, and on this basis the incremental-learning SVM algorithm based on the KKT conditions is described. Finally, the thesis focuses on the incremental sample correlation weighted support vector machine (SCW-ISVM): its foundations, its sample weighting method, and the steps of the algorithm. On a breast cancer data set from a maternal and child health hospital and on data sets from a standard library, SCW-ISVM is compared in detail with two other support vector machine algorithms in terms of classification accuracy and training time. The comparison shows that the improved SCW-ISVM achieves better classification performance and more stable overall performance than the other two algorithms.
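The idea of flagging isolated points through shared nearest neighbors can be sketched as follows. This is only an illustrative sketch, not the thesis's algorithm: the abstract does not give the exact global-similarity rule, so the mutual-neighbor definition of shared nearest neighbor similarity, the averaging over all other samples, the threshold, and all function names (`knn_lists`, `snn_similarity`, `global_snn_scores`, `flag_noise`) are assumptions made for the example.

```python
from math import dist


def knn_lists(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(nn, i, j):
    """Shared nearest neighbour similarity (assumed mutual-neighbour form):
    0 unless i and j are in each other's k-NN lists, otherwise the number
    of neighbours they have in common."""
    if i not in nn[j] or j not in nn[i]:
        return 0
    return len(nn[i] & nn[j])


def global_snn_scores(points, k):
    """Assumed 'global' score: mean SNN similarity of a point to all others."""
    nn = knn_lists(points, k)
    n = len(points)
    return [
        sum(snn_similarity(nn, i, j) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def flag_noise(points, k, threshold):
    """Return indices whose global score falls below the (assumed) threshold."""
    scores = global_snn_scores(points, k)
    return [i for i, s in enumerate(scores) if s < threshold]


# Toy data: five points in a tight cluster plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.4, 0.4), (10, 10)]
print(flag_noise(points, k=3, threshold=0.2))
```

In this toy data the distant point appears in no other point's neighbor list, so its score is zero and it is the only index flagged. A real data set would need a tuned `k` and threshold, and a faster neighbor search than this quadratic loop.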
Keywords/Search Tags: Medical data mining, Noise data, Support vector machine, Sample weighting, Data visualization