Twilight zone: Protein sequence search and classification

Posted on: 2011-06-04
Degree: M.S.
Type: Thesis
University: Michigan State University
Candidate: Guo, Jiarong
Full Text: PDF
GTID: 2448390002455798
Subject: Biology
Abstract/Summary:
Homology search is important for gene annotation and classification. The emergence of next-generation sequencing techniques and metagenomics poses new challenges for homology search. In this study, I evaluate two commonly used homology search tools, BLAST and HMMER, for finding new genes (nirK) in our metagenomic data and for classifying closely related genes (amoA and pmoA in UniProtKB). Comparing BLAST and HMMER search results on the metagenomic data reveals a false positive problem with BLAST. I also describe methods that use phylogenetic trees to refine the results of common homology search methods. Evaluating these methods on the above genes, I find that narrow phylogenetic sampling of a gene (nirK) limits sequence annotation and that some genes (amoA and pmoA) are probably misclassified.
Keywords/Search Tags:Search, Genes