| The rapid development of high-throughput sequencing technology has driven a surge in genomics research and the generation of large amounts of omics data. Each type of omics data captures a specific level of biological information: genome, epigenome, transcriptome, proteome, or metabolome. How to integrate these multi-omics data so as to describe the dynamics of molecular systems more accurately and systematically is an important topic in the study of molecular mechanisms and a major problem facing genomics research. At present, the target gene is an important bridge connecting these different omics data. The target gene prediction methods commonly used in animal and plant research have obvious deficiencies in running time and prediction accuracy, and no method is specifically applicable to microbial target gene prediction. In this paper, we first construct a target gene prediction method that does not rely on species information and offers wide applicability, high accuracy, and fast operation. By combining a seed library of target gene sequences with exact Smith-Waterman local alignment, it runs more than 10 times faster than existing prediction methods. In addition, filter conditions for the prediction results are established, including the alignment results, the binding energy of the target region, and the seed-region match of the microRNA, which greatly reduces false positives among the predicted target genes. Using expression correlation, interaction databases, and similar evidence, target genes can integrate multiple omics data sets through pairwise relationships. However, when the internal interaction mechanism is analyzed systematically, especially with three or more omics data sets, integration based on target genes also faces great challenges. To meet the increasing demand for joint analysis of 
multi-omics data, this paper further establishes a highly compatible and scalable joint analysis method. First, co-expression modules are constructed for each omics data set separately; then, based on the overlap between different omics data sets, significantly related module pairs are screened with the hypergeometric test, and GO and pathway functional enrichment, expression trend, and interaction network analyses are performed on those pairs. This analysis strategy is limited neither by the number of omics dimensions nor by target-gene targeting relationships. It not only shows significant interactions from a statistical perspective but also suggests potential correlating factors from the perspective of gene function and metabolic processes. The method is species-independent by design and works well on both human colorectal cancer and Arabidopsis leaf data, providing a feasible solution and comprehensive data support for in-depth mining of the interaction mechanisms among multiple omics data. |
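
The module-pair screening step described above can be illustrated with a minimal sketch. It assumes that "overlap" means shared gene identifiers between two co-expression modules drawn from a common gene universe, and that a Bonferroni correction is applied across all module pairs; neither detail is specified in the abstract. The names `overlap_pvalue`, `significant_module_pairs`, and the `alpha` threshold are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import comb


def overlap_pvalue(universe_size, size_a, size_b, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    X is the number of genes shared by two modules of sizes size_a and
    size_b when both are drawn at random from universe_size genes.
    """
    total = comb(universe_size, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


def significant_module_pairs(modules_a, modules_b, universe, alpha=0.05):
    """Return (module_a, module_b, adjusted_p) for significantly overlapping pairs.

    modules_a / modules_b map module names to sets of gene IDs, one dict per
    omics layer. Bonferroni adjustment is an assumption made for this sketch.
    """
    n_tests = len(modules_a) * len(modules_b)
    hits = []
    for (name_a, genes_a), (name_b, genes_b) in product(
        modules_a.items(), modules_b.items()
    ):
        a = genes_a & universe  # restrict both modules to the shared universe
        b = genes_b & universe
        p = overlap_pvalue(len(universe), len(a), len(b), len(a & b))
        p_adj = min(1.0, p * n_tests)
        if p_adj < alpha:
            hits.append((name_a, name_b, p_adj))
    return sorted(hits, key=lambda h: h[2])
```

Because the test only needs module membership, the same function applies to any pair of omics layers, which is consistent with the claim that the strategy does not depend on target-gene relationships.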