| With the rapid development of the information society, big data has become a strategic resource as important as natural resources. The medical field generates large volumes of data every day, and this data contains valuable information that is of great significance for the diagnosis and treatment of related cases and for medical research. However, the polymorphism, incompleteness, redundancy, and privacy characteristics of medical big data greatly hinder mining work. At present, many large and medium-sized hospitals have established hospital information systems (HIS), but the use of HIS data is largely limited to simple queries, and the valuable information it contains is rarely exploited actively for data analysis or for assisting diagnosis and treatment. Starting from medical big data, this paper introduces the characteristics of medical data and data mining algorithms, and selects Apriori, the classic association rule algorithm, for in-depth study in medical scenarios. The Apriori algorithm mines relationships between the attributes of the target objects and consists of two main parts: finding frequent itemsets and generating strong association rules. To address two shortcomings of the traditional Apriori algorithm, this paper proposes corresponding improvements. First, the database is replaced with a one-dimensional array, which reduces the number of database scans the algorithm requires. Second, a pre-pruning strategy is introduced to address the problem of excessively large candidate sets. Based on the improved algorithm, an overall scheme for mining association rules in breast tumor disease data is designed to find valuable association rules. The improved and traditional algorithms are both used to mine desensitized clinical breast tumor data; the frequent itemsets generated by the two methods are consistent, which verifies the correctness of the improved algorithm. On the same data set, a performance comparison of the two algorithms shows that the improved algorithm is significantly more efficient than the traditional algorithm. |
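
The two parts of Apriori named above, finding frequent itemsets and generating strong association rules, can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes the transactions fit in memory, so a list built once stands in loosely for the idea of replacing repeated database scans with an array. The subset check before counting is the standard Apriori pruning step, and the paper's pre-pruning strategy may differ. The function names `apriori` and `strong_rules`, the thresholds, and the toy item labels are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for all frequent itemsets (hypothetical helper)."""
    # Read the data once into memory; later passes iterate over this list
    # rather than going back to the original data source.
    rows = [frozenset(t) for t in transactions]

    # Pass 1: count individual items.
    counts = {}
    for row in rows:
        for item in row:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune before counting: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the in-memory rows.
        counts = {c: 0 for c in candidates}
        for row in rows:
            for c in candidates:
                if c <= row:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for left in combinations(itemset, r):
                left = frozenset(left)
                # Every subset of a frequent itemset is frequent, so the lookup is safe.
                confidence = count / frequent[left]
                if confidence >= min_confidence:
                    rules.append((left, itemset - left, confidence))
    return rules


# Toy data with placeholder item labels (not clinical data).
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
freq = apriori(toy, min_support=3)
rules = strong_rules(freq, min_confidence=0.75)
```

With a minimum support count of 3, the toy data yields three frequent single items and three frequent pairs, while `{"a", "b", "c"}` is counted but rejected because it appears in only two transactions. Support is expressed here as an absolute count; a relative threshold would only require dividing by `len(rows)`.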