| With an aging population, environmental pollution, and other changes in the external environment, the morbidity, disability, and mortality associated with respiratory diseases have increased significantly. Common acute respiratory diseases include severe pneumonia, respiratory failure, and acute respiratory distress syndrome; these diseases have high mortality, many complications, and expensive treatment. At present, evaluation of the safety and effectiveness of medication for patients with respiratory diseases relies mainly on the clinical experience and domain knowledge of medical staff. Because respiratory diseases often involve multiple complications, evaluating drug efficacy and safety during treatment is very important. However, judging the relationship between drug treatment and clinical outcomes, and re-evaluating the effectiveness and safety of drugs after they are marketed, remain difficult problems for hospitals and medical insurance departments. With the gradual development of medical informatization, the volume of medical data has surged and medical databases have grown steadily larger. These data hold potentially great practical value, and data mining methods based on machine learning have matured in recent years. Together, these developments provide the necessary conditions for the approach taken in this paper: evaluating the effectiveness and safety of medication with a data mining method that combines big data statistics and machine learning. In this paper, based on data collected by the China Hospital Pharmacovigilance System (CHPS) over the past 5 years on patients with typical respiratory diseases at a typical hospital, the effectiveness of the drug was evaluated by applying relevant statistical methods. In addition, based on ulinastatin adverse drug reaction report data from Tianpu Pharmaceutical, three machine learning prediction models were used to conduct 
relevant research on drug safety evaluation. The main work of this paper is as follows: (1) Propensity score matching, the chi-square test, and survival analysis were used to evaluate the effectiveness of drugs used in patients with respiratory diseases in a hospital. After screening, the data of 950 patients were divided into a medication group and a control group and matched 1:1 by propensity score matching. Comparative analysis showed that the use of ulinastatin had a significant impact on clinical outcomes: the medication group had a greater proportion of good clinical outcomes. In addition, survival analysis was used to compare the medication group and the control group before and after matching; the medication had some effect on patient survival time, but the effect was not significant. (2) SPSS Modeler software was used to build predictive models based on machine learning methods, analyze patient data for drug safety, and identify the factors that lead to adverse drug reactions. Experiments with models built using three machine learning methods (logistic regression, support vector machine, and BP neural network) found that dosage, number of medications, usage, whether medications were combined, and whether there is a family history of adverse drug reactions are the main factors influencing adverse drug reactions. The logistic regression model established in this paper achieves 97.2% accuracy on the test set, the support vector machine model also achieves 97.2%, and the artificial neural network model is slightly inferior, with an accuracy of 93.68%. The models are also evaluated by AUC values and Gini coefficients. The AUC values and Gini coefficients of the logistic regression model, the artificial neural network model, and the support vector machine model evaluated on the 
test set are 0.997 and 0.993, 0.992 and 0.985, and 0.995 and 0.990, respectively, indicating that all three prediction models have good ROC curves, although the artificial neural network is slightly worse. These results show that the logistic regression and support vector machine algorithms are effective for drug safety analysis and have practical application value. |
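
The 1:1 propensity score matching step described in (1) can be illustrated with a minimal sketch. This is not the procedure used in the study, whose tooling and matching rule are not described here. It assumes that propensity scores have already been estimated, and that matching is greedy nearest-neighbor without replacement within a caliper. The function name `match_one_to_one`, the caliper of 0.05, and all patient IDs and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def match_one_to_one(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    Each control is used at most once (matching without replacement).
    A treated patient stays unmatched if no remaining control lies
    within `caliper` of its score.
    """
    available = dict(control)  # controls that have not been matched yet
    pairs: List[Tuple[str, str]] = []
    # Fixed processing order (ascending score) keeps the result reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores (illustrative values, not study data).
treated_scores = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
control_scores = {"C1": 0.60, "C2": 0.33, "C3": 0.10, "C4": 0.58}
matched = match_one_to_one(treated_scores, control_scores)
print(matched)
```

In this toy example `T3` remains unmatched because no control falls within the caliper. Outcome comparisons such as the chi-square test would then be run on the matched pairs only.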
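
The relationship between the AUC values and Gini coefficients reported in (2) can be sketched as follows, assuming the common definition Gini = 2 × AUC − 1. The abstract does not state which definition was used, but the reported pairs appear consistent with this one up to rounding (for example, 2 × 0.995 − 1 = 0.990). The labels and scores are hypothetical, and the function names are illustrative rather than part of any library.

```python
from typing import Sequence


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc: float) -> float:
    """Gini coefficient under the assumed definition Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Hypothetical test-set labels (1 = adverse reaction) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_from_scores(labels, scores)
gini = gini_from_auc(auc)
print(f"AUC = {auc:.3f}, Gini = {gini:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration. A rank-based computation would be preferable for large test sets.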