| In the human body, post-translational modifications alter proteins to precisely regulate their activities and functions, and they are associated with a variety of biological processes. With advances in high-throughput technologies, more and more post-translational modification data are being generated. Among these modifications, protein phosphorylation is one of the most widely recorded. Complex diseases, meanwhile, harm human health, and it is widely recognized that dysfunctional phosphorylation is related to a variety of diseases, such as cancer. Specifically, some single amino acid variations can disrupt existing kinase-substrate relationships and create novel ones. To investigate the underlying molecular mechanisms of diseases, numerous network-based methods have been proposed to identify meaningful disease modules, which are locally dense subnetworks. However, no existing method incorporates protein phosphorylation data to identify disease biomarkers, including disease modules and disease proteins. This thesis therefore integrates phosphorylation modification data to identify disease biomarkers. The thesis is divided into two parts. In the first part, the associations between phosphorylation residues and disease-related mutations were systematically investigated, with the aim of establishing a theoretical foundation for further work. Analysis of phosphorylation residues and amino acid mutations showed that mutations co-occurring with phosphorylation residues and phosphorylation cross-talks tend to be deleterious in diseases. Specifically, the deleterious single amino acid variants associated with cancer and muscular diseases tend to co-occur with phosphorylation residues. The phosphorylation residues of proteins from the nuclear envelope, protein complexes and the lysosome were found to be more likely adjacent
to deleterious amino acid variations. These findings provide insights into the associations between phosphorylation and diseases and can help identify novel disease genes in the future. In the second part, a new network clustering method for uncovering disease modules was proposed, based on the significance of connections rather than local density. Specifically, a weighted tumor network of lung adenocarcinoma was built from kinase-substrate relationships, a tissue-specific gene regulatory network, pairwise gene expression data and mutation data. With appropriate parameters selected by a machine learning algorithm, the method identified nine disease modules and twenty candidate lung cancer-associated proteins. The analysis showed that these disease modules could effectively discriminate tumor samples from normal samples. Some significantly important genes in these modules have recently been identified as drug target genes, and previous studies have found that the candidate proteins are associated with lung cancer, especially lung adenocarcinoma. In summary, this thesis systematically investigated the associations between phosphorylation residues and disease-related mutations, proposed a network clustering method and a disease gene prediction method to uncover disease modules and proteins, and built a weighted tumor network of lung adenocarcinoma. The results provide insights into the underlying disease mechanisms and can help identify more drug target genes in the era of precision medicine. However, this thesis still has some shortcomings. The research is theoretical, and future work needs to consider how to combine the theory with medical applications. |