| Since H9N2 avian influenza virus was first found in the United States in 1966, it has spread widely in poultry and gradually become the major influenza virus infecting poultry. In China in particular, H9N2 avian influenza virus first broke out in Guangdong Province in 1992, became endemic nationwide over the next two decades, and eventually became the major subtype of avian influenza virus in the country. H9N2 avian influenza virus has also been reported to infect humans across the host species barrier. In addition, other subtypes of avian influenza virus reassort with H9N2 during co-infection, generating new reassortant subtypes that acquire the ability to infect humans. Studying the fitness evolution pattern of H9N2 avian influenza virus is therefore important for prevention, as it can help to monitor and control outbreaks. In this paper, we use data mining techniques and mathematical modeling to analyze the population frequency, the reassortment pattern, and the fitness evolution pattern of H9N2 avian influenza virus, based on large-scale genomic sequence data. The main research covers the following three aspects. First, we collect H9N2 genome sequence data from existing public data platforms and construct eight data sets in two categories, external segments and internal segments. We define calculation methods for strain frequency and branching, and apply them to the quality-controlled HA external segment data set. From the results, we summarize the changes in population frequency and the characteristic amino acid mutation patterns of H9N2 avian influenza virus. Second, we propose the concept of a core gene pool and establish the typical clusters that make up the core gene pool of the internal segments of H9N2 avian influenza virus, based on the observation that H9N2 avian influenza
virus provides internal segments for multiple reassortant subtypes of influenza viruses. Feature extraction and clustering are then used to verify and analyze the core gene pool. The results show that the typical clusters constituting the core gene pool differ clearly in geographical distribution and mutation sites. Third, we construct a fitness evolution model of H9N2 avian influenza virus based on the results of the population frequency analysis and the reassortment network analysis. The model is used to predict which branches of H9N2 might produce highly adaptive offspring in the future. The results show that some branches that emerged in 2013-2014 are highly adaptive and might become epidemic populations in the future. |