In the field of drug safety, increasing attention is being paid to the surveillance of adverse events caused by drugs. An adverse drug reaction (ADR) is an unintended injury caused by a drug used normally at a normal dose. Adverse events caused by drugs not only endanger public health but also impose heavy losses on the healthcare system; they are the fifth most common cause of death in hospitals. Existing signal detection methods focus mainly on the relationship between a single drug and a single adverse reaction, and research at the level of drug categories is lacking. Yet ADRs of whole drug categories have been reported many times in Chinese pharmacovigilance information. How to effectively explore the risk features of drug categories has therefore become a research topic of real value. This paper proposes methods based on concept generalization, which generalize each "drug-adverse reaction" pair to the corresponding "drug category-system organ involved" pair. First, the mainstream signal detection methods are applied to Chinese ADR monitoring data and, based on a comparative analysis of the resulting signals, IC is selected as the signal detection method for this research. In addition, a signal detection method named SDMI (signal detection based on mutual information for drug category) is proposed, and the DCG (Discounted Cumulative Gain) feature extraction technique is used to analyze the potential relationship between drug categories and the system organs involved.

The main work of this paper is as follows:

(1) The database of spontaneous ADR reports in Jiangsu Province from 2011 to 2018 is used as the data source. Five types of drugs are selected as the experimental data, involving 44 drugs and 699 adverse reactions. Following the WHO-ART adverse reaction terminology, the Chinese Drug Catalog, and the Chinese Pharmacopoeia, the raw data are preprocessed by splitting, standardization, classification, and concept generalization, and the key attributes (drug category, drug name, adverse reaction name, and system organ involved) are extracted.

(2) Adverse reaction notices issued by the US Food and Drug Administration (FDA) and the China Food and Drug Administration (CFDA) are collected from the Internet to build a result verification database, which is used to validate the results.

(3) Data from drug package inserts are collected to construct a known-signal database, which provides the basis for calculating signal evaluation indicators such as recall, accuracy, F1, AP, and MAP. The conventional signal detection methods (PRR, ROR, MHRA, IC, and χ²) are applied to the experimental data, and after each method is evaluated and analyzed, IC is chosen as the signal detection method for this paper. The DCG algorithm is then used to rank, score, and extract risk features from the IC results. Compared against the result verification database, the accuracy reaches 67%, which indicates that IC is a suitable signal detection method.

(4) A signal detection method based on mutual information, named SDMI, is proposed. Its threshold is determined by maximizing the F1 index, and the method is compared with the conventional signal detection methods. SDMI is applied to the experimental data to mine associated signals between drug categories and the system organs involved, and the DCG algorithm is used to extract risk features from the experimental results. The results show that, compared with IC, the proposed SDMI performs better in signal mining for drug categories.
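
To make the conventional disproportionality measures concrete, the following minimal sketch computes PRR, ROR, a point estimate of IC, and the χ² statistic from a single 2×2 contingency table. The function name `disproportionality`, the cell labels `a`, `b`, `c`, `d`, and the example counts are illustrative assumptions. IC is shown in its plain observed-over-expected form without Bayesian shrinkage, which may differ from the formulation used in this paper.

```python
import math

def disproportionality(a: int, b: int, c: int, d: int) -> dict:
    """Disproportionality measures for one (drug category, organ class) pair.

    Hypothetical cell layout of the 2x2 report table:
      a: target category, target reaction    b: target category, other reactions
      c: other categories, target reaction   d: other categories, other reactions
    Zero cells are not handled in this sketch.
    """
    n = a + b + c + d
    prr = (a / (a + b)) / (c / (c + d))           # proportional reporting ratio
    ror = (a * d) / (b * c)                       # reporting odds ratio
    ic = math.log2(a * n / ((a + b) * (a + c)))   # log2(observed / expected)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return {"PRR": prr, "ROR": ror, "IC": ic, "chi2": chi2}

# Made-up counts, for illustration only.
measures = disproportionality(a=20, b=80, c=10, d=890)
print(measures)
```

Each measure compares how often the reaction is reported with the target category against how often it is reported with everything else, so larger values indicate a stronger disproportionality.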
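
Choosing a threshold by maximizing F1 can be sketched as a sweep over candidate cut-offs, scoring each against the known-signal database. This sketch does not reproduce SDMI itself: the scores, the pair names, and the helpers `f1_at_threshold` and `best_threshold` are hypothetical, and any per-pair signal score could be plugged in.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (drug category, system organ involved)

def f1_at_threshold(scores: Dict[Pair, float], known: Set[Pair], threshold: float) -> float:
    """F1 of the pairs flagged at `threshold`, judged against the known signals."""
    flagged = {pair for pair, score in scores.items() if score >= threshold}
    true_pos = len(flagged & known)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(flagged)
    recall = true_pos / len(known)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores: Dict[Pair, float], known: Set[Pair]) -> Tuple[float, float]:
    """Try every observed score as a threshold; return (threshold, F1) with the highest F1."""
    candidates = sorted(set(scores.values()))
    return max(((t, f1_at_threshold(scores, known, t)) for t in candidates),
               key=lambda item: item[1])

# Toy scores and known signals, invented for illustration.
toy_scores = {("cat_A", "skin"): 0.9, ("cat_A", "liver"): 0.7,
              ("cat_B", "skin"): 0.4, ("cat_B", "heart"): 0.1}
toy_known = {("cat_A", "skin"), ("cat_A", "liver")}
print(best_threshold(toy_scores, toy_known))
```

Using the observed scores as the only candidate thresholds is sufficient, because F1 can change only when the set of flagged pairs changes.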
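
For reference, the standard DCG formula sums the relevance of each ranked item, discounted by the logarithm of its rank. The sketch below assumes the signals are already sorted by descending score and that a relevance value is available for each. How this paper defines relevance and turns DCG into risk features is not stated here, so the function `dcg` and the example relevances are illustrative only.

```python
import math
from typing import Sequence

def dcg(relevances: Sequence[float]) -> float:
    """Discounted Cumulative Gain: sum of rel_i / log2(i + 1) over ranks i = 1, 2, ..."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

# Hypothetical relevances of signals ranked by descending score
# (for example, 1 if the pair appears in a verification database, else 0).
ranked_relevances = [1, 1, 0, 1]
print(dcg(ranked_relevances))
```

Because the discount grows with rank, a confirmed signal near the top of the list contributes more to the total than the same signal further down.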