Research And Application Of Knowledge Discovery Method For Tuberculosis Data

Posted on: 2022-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Sang
GTID: 2504306347473244
Subject: Computer technology
Abstract/Summary:
Tuberculosis remains a public health problem that seriously endangers people's physical and mental health. It spreads easily through droplets and is highly infectious. Tuberculosis data compiled from patients' medical records over the years are objective and reflect the real situation. Data mining and knowledge discovery on such data can uncover valuable information for disease diagnosis, treatment, and public health prevention and control, and can provide technical support for medical teams and organizations.

We obtained one year of tuberculosis treatment data for Shandong Province from Shandong Provincial Hospital. The desensitized data set of TB patients is about 70 megabytes and was collected from six cities in Shandong Province: Jinan, Dezhou, Jining, Linyi, Weifang and Yantai. The records for each city include basic personal information such as gender, age and home address, together with the necessary TB disease information such as the time of visit, the source of the patient, the place of diagnosis and the results. The data as a whole are worth mining, and the aim is to discover and visualize knowledge about tuberculosis from them.

However, because medical data are entered manually and the number of patient visits is large, the original data are non-standardized, incomplete, and a mixture of several data types, so they cannot be used directly for data mining and knowledge discovery. This paper therefore applies 0-1 coding, data cleaning, data integration and normalization. For the hospital geography data, the paper proposes a new method that calculates weights according to the highly correlated dimensions and finally yields values in the 0-1 range.

Existing distribution models have difficulty expressing the scientific meaning of the data accurately, and each clustering method has its own advantages and disadvantages. The density peak clustering algorithm is robust to interference and can handle data with complex shapes, which makes it suitable for noisy data. This paper therefore selects the density peak clustering algorithm to find cluster centers dynamically: it analyzes the decision graph on the processed data set, identifies the cluster centers, and then calculates the characteristics of the data in each cluster.

Because the clustered data are correlated to some degree, this paper uses a clustering-based Apriori association rule algorithm to analyze the data within and between clusters. By setting different confidence and support thresholds according to the actual situation and the hospital's needs, and by combining two or more valuable attributes for association rule analysis, different association rules can be obtained.

The ultimate goal of analyzing the tuberculosis data set is knowledge discovery and visualization, so this paper uses the SPSS Modeler tool for knowledge visualization and uses network diagrams to show the association rules clearly, providing technical support to the respiratory team. By integrating the whole algorithm and process of TB knowledge discovery, a simple tuberculosis knowledge visualization platform is designed for users. The platform contains five core functions, which cover data viewing and processing, density peak clustering analysis, association rule mining and result display. Users can view and process the data according to their own needs. Finally, the association rules are presented in tables; users can set different support and confidence thresholds and select different attributes to obtain different association rule results for reference and use.
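
The density peak clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function name `density_peak_clustering`, the cutoff distance `d_c`, and the choice to take the top `n_clusters` points by the product of density and distance (instead of reading the centers off the decision graph) are all assumptions made for illustration. The input points are assumed to be already normalized to the 0-1 range, and the sample points are made up.

```python
import math


def density_peak_clustering(points, d_c, n_clusters):
    """Basic density-peak clustering sketch (illustrative, not the thesis code).

    points: list of equal-length numeric tuples, assumed scaled to [0, 1].
    d_c: cutoff distance for the local density count (hypothetical value).
    n_clusters: number of centers to keep; stands in for inspecting the
        decision graph by eye.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: how many other points lie closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Rank points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta_i: distance to the nearest point that ranks higher in density.
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The top-ranked point has no denser neighbor; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j

    # Candidate centers have both high density and high delta.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: (-gamma[i], i))[:n_clusters]

    # Each remaining point inherits the label of its nearest denser neighbor.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers, rho, delta


# Toy, made-up points forming two well-separated groups.
points = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13), (0.09, 0.12),
          (0.90, 0.90), (0.88, 0.91), (0.91, 0.88), (0.89, 0.89)]
labels, centers, rho, delta = density_peak_clustering(points, d_c=0.1, n_clusters=2)
print(labels)
```

Plotting `rho` against `delta` gives the decision graph mentioned in the abstract; points far from both axes are the natural center candidates.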
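
The association rule step, with its support and confidence thresholds, can be sketched in the same hedged way. The code below is a plain textbook Apriori written for illustration, not the clustering-based variant or the SPSS Modeler workflow used in the thesis. The function name `apriori_rules`, the `attribute=value` item encoding, and the toy records are all hypothetical and are not drawn from the tuberculosis data.

```python
from itertools import combinations


def apriori_rules(transactions, min_support, min_confidence):
    """Return rules as (antecedent, consequent, support, confidence) tuples."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of records that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset must be frequent, then check support.
        current = {c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                   and support(c) >= min_support}

    # Split each frequent itemset into antecedent and consequent.
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, confidence))
    return rules


# Made-up records; each item is a hypothetical attribute=value pair.
records = [
    {"age=60+", "source=referral", "result=positive"},
    {"age=60+", "source=referral", "result=positive"},
    {"age=60+", "source=walk-in", "result=negative"},
    {"age=<60", "source=referral", "result=positive"},
]
for antecedent, consequent, sup, conf in apriori_rules(records, 0.5, 0.9):
    print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

Raising `min_support` or `min_confidence` shrinks the rule list, which mirrors how different thresholds yield different rule sets in the abstract. Running the function once per cluster would be one way to approximate the within-cluster analysis, though that is an assumption about the workflow.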
Keywords/Search Tags: tuberculosis data, density peak clustering, association rules, knowledge visualization