
The Application Of Data Mining Technology In The Regional Death Cause Registering Data

Posted on: 2019-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: Q Lin
Full Text: PDF
GTID: 2404330566499459
Subject: Applied statistics
Abstract/Summary:
The death cause registering data collected by the health and family planning system are of great value, and how to mine that value effectively is worth studying. This thesis studies how to apply data mining technology to the death cause registering data, covering exploratory research, improvement of a traditional method, and confirmatory research. The main work is as follows:

1. Following the general process of data mining, this thesis explores the death cause registering data. Analysis of the distribution of causes of death shows that cancer, cerebrovascular disease, and cardiovascular disease account for relatively high proportions of deaths. Time series models are then built for deaths from these major diseases, and the use of the models for prediction is discussed.

2. The death cause registering data contain both ordered (ordinal) and unordered (nominal) categorical variables, so they are mixed categorical data. To address the shortcomings of the traditional K-modes clustering algorithm, an improved K-modes clustering algorithm suited to mixed categorical data is proposed. The algorithm uses different distance measures for ordinal and nominal categorical attributes, and assigns attribute weights by average entropy. Experimental results show that the improved algorithm performs better on clustering accuracy and on the clustering distance index.

3. A confirmatory mining study is carried out on the death cause registering data to examine the relationship between date of birth and date of death. The study shows that the "birthday is the date of death" phenomenon exists widely across many of the sub-populations examined. A random forest algorithm is used to find the features most important to this phenomenon. The results indicate that the highest-level diagnosing hospital, the cause of death, and education are the three most influential features. The results are also verified by association rule mining.
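
The abstract does not give the formulas behind the improved K-modes algorithm, so the following is only a minimal sketch of the general idea in the second contribution: one dissimilarity for unordered attributes, another for ordered attributes, combined with entropy-based attribute weights. Every name here (`nominal_distance`, `ordinal_distance`, `entropy_weights`, `mixed_distance`), the toy records, and the specific weighting scheme (normalized Shannon entropy, rescaled to sum to 1) are assumptions for illustration and should not be read as the thesis's actual definitions.

```python
import math
from collections import Counter


def nominal_distance(a, b):
    """Simple matching for unordered categories: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def ordinal_distance(a, b, levels):
    """Rank gap scaled to [0, 1]; `levels` lists the categories in order."""
    if len(levels) < 2:
        return 0.0
    return abs(levels.index(a) - levels.index(b)) / (len(levels) - 1)


def entropy_weights(records, attributes):
    """Assumed scheme: weight each attribute by its normalized Shannon entropy."""
    n = len(records)
    raw = {}
    for attr in attributes:
        counts = Counter(r[attr] for r in records)
        h = -sum((c / n) * math.log(c / n) for c in counts.values())
        k = len(counts)
        raw[attr] = h / math.log(k) if k > 1 else 0.0
    total = sum(raw.values())
    if total == 0:
        # All attributes are constant: fall back to equal weights.
        return {a: 1.0 / len(attributes) for a in attributes}
    return {a: v / total for a, v in raw.items()}


def mixed_distance(x, y, ordinal_levels, weights):
    """Weighted sum of per-attribute distances for mixed categorical records."""
    d = 0.0
    for attr, w in weights.items():
        if attr in ordinal_levels:
            d += w * ordinal_distance(x[attr], y[attr], ordinal_levels[attr])
        else:
            d += w * nominal_distance(x[attr], y[attr])
    return d


# Hypothetical toy records; the real registry fields are not listed in the abstract.
records = [
    {"education": "primary", "cause": "cancer"},
    {"education": "secondary", "cause": "cancer"},
    {"education": "tertiary", "cause": "stroke"},
    {"education": "primary", "cause": "stroke"},
]
ordinal_levels = {"education": ["primary", "secondary", "tertiary"]}
weights = entropy_weights(records, ["education", "cause"])
```

In a K-modes loop, a function like `mixed_distance` would replace the plain simple-matching dissimilarity when assigning each record to its nearest mode, so that "primary" is treated as closer to "secondary" than to "tertiary" instead of equally different from both.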
Keywords/Search Tags: Death cause registering data, Data mining, Time series analysis, K-modes algorithm, Birth, Death