| Hypertension is a common and intractable cardiovascular and cerebrovascular disease. Although awareness and treatment rates have increased in recent years, they remain low overall, and the prevalence of hypertension is still rising. Data mining has been applied to hypertension in many studies and experiments at home and abroad, but this work mainly targets the prediction of hypertension; comprehensive analysis of the factors behind it is scarce. Taking the factors that cause hypertension as its starting point, this paper applies data mining to a health-record data set of patients with hypertension, analyzes the relationship between different factors and hypertension, and predicts the impact of each factor on the probability of hypertension. The data come from the physical examination management system and the health archives management system of a health center in Gaochun District, Nanjing. First, after data preprocessing, an association rule algorithm is used to study the association between hypertension and BMI, age, gender, and daily behavior. Second, with additional features describing daily behavioral habits, the health-record data set is analyzed visually to give an intuitive view of the a priori relationship between the various disease factors and hypertension, and the XGBoost algorithm is used to improve the prediction of hypertension. Third, because the behavioral habits of residents with sub-health symptoms resemble those of patients with hypertension, the model misclassifies more cases and the accuracy of hypertension classification suffers. A series hybrid prediction model of AdaBoost and XGBoost is therefore constructed to separate the sub-health population from the hypertensive population and to improve the accuracy of identifying patients with hypertension. Unlike previous research methods that address a single chronic disease, this paper considers hypertension together with a variety of other chronic diseases. The impact of other chronic diseases on hypertension is studied with a back-propagation (BP) neural network, random forest, XGBoost, and AdaBoost. A comparison of the performance metrics of these models shows that the XGBoost model performs best, and it is used to rank other chronic diseases by the importance of their impact on hypertension. The concept of relative risk (RR) is then introduced to analyze the relative risk of hypertension in people with other chronic diseases, together with the probability of suffering from a single disease and the joint probability of multiple diseases, in order to explore whether each chronic disease is more likely to accompany hypertension. Finally, a hypertension prevalence factor analysis system is designed to help the public health department of each community explore how residents' daily behavioral habits and other chronic diseases relate to hypertension. The prevalence factors of hypertension differ between regions. With this system, users can see at a glance how the prevalence factors in their region rank by strength of association with hypertension. Community public health departments can use this ranking to formulate tailored hypertension prevention and treatment strategies, and the same approach can be applied by analogy to other chronic diseases, which makes the system practical and widely applicable. |
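
As a rough illustration of the relative risk (RR) and joint probability analysis mentioned above, the following minimal sketch computes both from a 2x2 count table. It uses the standard epidemiological definition of RR (risk among people with another chronic disease divided by risk among people without it), which may differ in detail from the paper's procedure. The `Cohort` class, the function names, and all counts are hypothetical and are not taken from the paper's data.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Hypothetical 2x2 counts for one comorbidity (illustrative names)."""
    exposed_cases: int    # have the other chronic disease and hypertension
    exposed_total: int    # have the other chronic disease
    unexposed_cases: int  # do not have the other disease, have hypertension
    unexposed_total: int  # do not have the other chronic disease


def relative_risk(c: Cohort) -> float:
    """RR = P(hypertension | exposed) / P(hypertension | unexposed)."""
    risk_exposed = c.exposed_cases / c.exposed_total
    risk_unexposed = c.unexposed_cases / c.unexposed_total
    return risk_exposed / risk_unexposed


def joint_probability(both: int, population: int) -> float:
    """Empirical P(hypertension and the other disease) in the population."""
    return both / population


# Made-up counts for illustration only; these are not results from the paper.
example = Cohort(exposed_cases=50, exposed_total=100,
                 unexposed_cases=225, unexposed_total=900)
rr = relative_risk(example)
joint = joint_probability(example.exposed_cases,
                          example.exposed_total + example.unexposed_total)
print(f"RR = {rr:.2f}, joint probability = {joint:.3f}")
```

An RR above 1 would suggest that hypertension is more common among people with the other disease than among those without it; an RR near 1 would suggest no such association in the sample.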