Objective: To explore and analyze the association rules method and classical regression modeling for physical examination data, and to provide a reference for information mining of such data. Methods: Data came from the same medical institutions: physical examination data for the Han population from Urumqi and for the Uygur population from Hetian City. Taking the diagnosis of hypertension as the entry point, we used the association rule algorithms Apriori and FP-growth to mine both simple correlations and complex interactions among hypertension-related factors. Logistic regression modeling was then used for quantitative assessment, and the medical literature for qualitative assessment. Results: The prevalence of hypertension was 20.6% in the Han population and 11.9% in the Uygur population. In both populations, the prevalence of hypertension was higher in men than in women and increased with age. In the Han population, the strong association rules mined for hypertension-related factors differed by age group and sex. Simple association rules involved drinking, family history, triglycerides, BMI, waist circumference, blood uric acid, etc.; complex association rules involved triglycerides and alcohol, family history and liver imaging (fatty liver), age and surgical history, urine protein and sex, BMI and blood uric acid, etc. In the Uygur population, the strong association rules also differed by sex and age group. Simple association rules involved fasting blood glucose, low-density lipoprotein cholesterol, liver imaging (fatty liver), waist circumference, urinary protein, etc.; complex association rules mainly involved blood glucose and blood uric acid, waist circumference and BMI, age and liver imaging (fatty liver), fasting blood glucose and low-density lipoprotein cholesterol, waist circumference and fasting blood glucose, age and waist circumference, etc. Quantitative assessment by logistic regression modeling showed that 80% of the simple and complex (interaction) association rules were statistically significant; 25 rules were supported by the relevant original medical literature. Conclusion: Hypertension-related factors and their interactions differ by sex and age group in the Han and Uygur populations. Association rule mining combined with logistic regression modeling can extract the rich information contained in physical examination data, yield strong association rules between factors and diseases, and provide clues for the early detection, diagnosis, and prevention of disease.
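As an illustration only, the idea of mining simple (one-factor) and complex (two-factor) rules that point to hypertension can be sketched with an Apriori-style search. This is a minimal sketch under stated assumptions, not the study's implementation: the records, the item names (`high_bmi`, `high_triglycerides`, `drinking`), the function name `mine_rules`, and the thresholds are all hypothetical and were invented for the example.

```python
from itertools import combinations

# Hypothetical toy records (NOT data from the study): each set lists the
# binary findings present in one physical examination record.
records = [
    {"high_bmi", "high_triglycerides", "hypertension"},
    {"high_bmi", "hypertension"},
    {"high_bmi", "high_triglycerides", "hypertension"},
    {"high_triglycerides"},
    {"high_bmi"},
    set(),
    {"high_bmi", "high_triglycerides", "hypertension"},
    {"drinking"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def mine_rules(records, target, min_support, min_confidence, max_len=2):
    """Return rules `antecedent -> target` as (antecedent, support, confidence, lift).

    Candidates grow level by level, and a k-item antecedent is only scored
    if all of its (k-1)-item subsets met the support threshold
    (the Apriori downward-closure pruning).
    """
    items = sorted(set().union(*records) - {target})
    target_support = support({target}, records)
    rules = []
    frequent = {frozenset()}  # antecedents that passed at the previous level
    for k in range(1, max_len + 1):
        next_frequent = set()
        for combo in combinations(items, k):
            antecedent = frozenset(combo)
            # Prune: skip if any (k-1)-subset was not frequent.
            if any(antecedent - {item} not in frequent for item in antecedent):
                continue
            rule_support = support(antecedent | {target}, records)
            if rule_support < min_support:
                continue
            next_frequent.add(antecedent)
            confidence = rule_support / support(antecedent, records)
            if confidence >= min_confidence:
                lift = confidence / target_support
                rules.append((tuple(sorted(antecedent)), rule_support, confidence, lift))
        frequent = next_frequent
    return rules


for antecedent, sup, conf, lift in mine_rules(records, "hypertension", 0.25, 0.6):
    print(f"{' & '.join(antecedent)} -> hypertension: "
          f"support={sup:.3f}, confidence={conf:.2f}, lift={lift:.2f}")
```

Rules with a single antecedent correspond to the simple association rules described above, and rules with two antecedents to the complex (interaction) rules. In a workflow like the one described, each retained rule would then be checked with a logistic regression model, for example by testing the corresponding main effect or interaction term.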