The Similarity Analysis Of Homology Protein In Microbes

Posted on: 2016-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: D Feng
Full Text: PDF
GTID: 2180330479999294
Subject: Biophysics
Abstract/Summary:
With the progress of large-scale genome sequencing projects, a large number of microbial genomes and protein sequences have been determined. Extracting biological significance from this variety of data is a challenge for bioinformatics.
Genomic GC content differs widely among microbes. The amino acid composition of proteins is influenced by several factors, such as genomic GC content, phylogenetic relationship and environment. Currently published data make it possible to analyze the influence of genomic GC content and phylogenetic relationship on the amino acid content of homologous proteins. Homologous proteins derive from a common ancestor and are functionally and structurally similar. To maintain their function in different species, the amino acid composition of some segments of homologous proteins must be adjusted so that conserved motifs remain invariant. Therefore, it is important to analyze the influence of genomic GC content and the genetic relationship between species on the amino acid composition of homologous proteins.
In this dissertation, a database is established to analyze the influence of genomic GC content on the amino acid composition of homologous proteins across species. It consists of seventy strains of seven microbes with different genomic GC content. Ten kinds of homologous proteins, all related to the transcription and translation processes, are studied. The protein sequences contain more than 300 amino acid residues. Sequence alignment is used to analyze the sequence similarity of homologous proteins from the microbial strains. The aligned segments of the protein sequences are divided into three kinds of sites: identical sites, similar sites and unmatched sites. The amino acids in the proteins are divided into three classes: the GARP class, the FYMINK class and the OTHERS class, which correspond to GC-rich codons, AT-rich codons and codons with no such preference, respectively.
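The classification described above can be illustrated with a minimal Python sketch. It is not the thesis's own code. It assumes two sequences that are already aligned to equal length with "-" as the gap character, it treats gapped columns as unmatched sites, and it uses a hypothetical set of "similar" residue groups (SIMILAR_GROUPS), because the abstract does not state which similarity criterion was applied. The function names are likewise illustrative.

```python
from collections import Counter

# Amino-acid classes named in the abstract (one-letter codes).
GARP = set("GARP")      # encoded by GC-rich codons
FYMINK = set("FYMINK")  # encoded by AT-rich codons

# ASSUMPTION: illustrative groups of mutually "similar" residues.
# The abstract does not specify the similarity criterion actually used.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("NQ"), set("ST"), set("AG")]

CLASSES = ("GARP", "FYMINK", "OTHERS")
SITE_KINDS = ("identical", "similar", "unmatched")


def residue_class(aa):
    """Map a residue to the GARP, FYMINK or OTHERS class."""
    if aa in GARP:
        return "GARP"
    if aa in FYMINK:
        return "FYMINK"
    return "OTHERS"


def site_kind(a, b):
    """Classify one alignment column as identical, similar or unmatched."""
    if a == "-" or b == "-":
        return "unmatched"  # ASSUMPTION: gapped columns count as unmatched
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "unmatched"


def class_content_by_site(aligned_a, aligned_b):
    """Return {site kind: {class: fraction}} for the residues of aligned_a."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {kind: Counter() for kind in SITE_KINDS}
    for a, b in zip(aligned_a, aligned_b):
        if a == "-":
            continue  # no residue of the first sequence in this column
        counts[site_kind(a, b)][residue_class(a)] += 1
    result = {}
    for kind, counter in counts.items():
        total = sum(counter.values())
        result[kind] = {cls: (counter[cls] / total if total else 0.0)
                        for cls in CLASSES}
    return result


if __name__ == "__main__":
    # Toy alignment, not real data.
    print(class_content_by_site("GAKML-", "GARQLT"))
```

Fractions are computed within each kind of site, so the GARP and FYMINK content at identical sites can be compared directly with the content at similar and unmatched sites.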
The sequence similarity of homologous proteins is studied based on the content of the three amino-acid classes at the three kinds of sites. The results show that the amino-acid content of the GARP and FYMINK classes changes less at identical sites and more at similar and unmatched sites. This means that, as the amino acids in proteins vary with genomic GC content, those at identical sites are more conserved than those at the other sites. A comparison of the amino-acid composition of homologous proteins across microbes shows that the composition is also influenced by phylogenetic relationship. Furthermore, the average amino-acid content of the three classes over the ten kinds of homologous proteins is analyzed. The amino-acid content of the GARP class increases with the genomic GC content of the microbes, whereas the amino-acid content of the FYMINK class decreases with it. When the genomic GC content is close to fifty percent, the amino-acid content of the GARP class is approximately equal to that of the FYMINK class in homologous proteins. However, in three of the ten kinds of homologous proteins, the amino-acid content of the three classes changes less, which means that these three homologous proteins are more conserved than the others. The GC content of the genes is also analyzed for all microbes: the GC content of the genes encoding these three proteins changes less than the genomic GC content of the microbes. This means that the amino-acid composition of homologous proteins is adjusted at the gene level to conserve their structure and function.
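The two quantities compared in this analysis, the GC content of a nucleotide sequence and the class content of a protein, can be computed as in the following minimal Python sketch. It is an illustration under stated assumptions, not the method used in the thesis: the function names are hypothetical, only unambiguous bases (A, C, G, T) are counted, and the input sequences are toy strings rather than real data.

```python
GARP = set("GARP")      # residues encoded by GC-rich codons
FYMINK = set("FYMINK")  # residues encoded by AT-rich codons


def gc_content(dna):
    """Fraction of G and C among the unambiguous bases of a DNA sequence."""
    # ASSUMPTION: ambiguous symbols such as N are ignored.
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "GC" for b in bases) / len(bases)


def class_fractions(protein):
    """Fractions of GARP, FYMINK and OTHERS residues in a protein sequence."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("protein sequence is empty")
    n = len(residues)
    garp = sum(aa in GARP for aa in residues)
    fymink = sum(aa in FYMINK for aa in residues)
    return {"GARP": garp / n,
            "FYMINK": fymink / n,
            "OTHERS": (n - garp - fymink) / n}


if __name__ == "__main__":
    # Toy inputs, not real sequences.
    print(gc_content("GGCCAATT"))
    print(class_fractions("GARPFYMINKLL"))
```

Applying gc_content to a whole genome and to an individual gene gives the genomic and gene-level GC content, and plotting class_fractions against either value is one way to examine the trends described above.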
Keywords/Search Tags: sequence alignment, homologous protein, sequence similarity, amino acid components