Protein Subcellular Localization Prediction Based on Fusion Characteristics

Posted on: 2013-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Guo
Full Text: PDF
GTID: 2230330395984905
Subject: Computer application technology
Abstract/Summary:
With the successful completion of the Human Genome Project, more and more DNA and protein sequences have been determined. Purely experimental biology can no longer close the gap between the large amount of sequence information and the severe shortage of protein function annotation; moreover, experiments are time-consuming and expensive. There is therefore an urgent need for computational methods to predict protein function. In addition, biological studies show that there is an intrinsic relationship between a protein's function and its subcellular location, so subcellular location information can provide useful clues for research on protein function. Identifying the subcellular location of proteins has therefore become an important research area in proteomics.

Focusing on the prediction of protein subcellular location, this paper studies protein sequence encoding and the design of classification algorithms in depth. The main research achievements are as follows.

This paper presents a new protein sequence encoding method that combines three sequence features. The first is the traditional 20-dimensional amino acid composition (AAC). The second is amino acid position information, which captures the location of each amino acid residue in the sequence. The third is amino acid local order information: each amino acid residue is represented by a corresponding five-bit binary ASCII code, so a protein sequence of length L can be represented by a matrix with five rows and L columns. The frequency with which each quadruplet appears in each row of the matrix is then calculated. The nearest neighbor classification algorithm is used as the prediction tool. The method was tested on two different apoptosis protein datasets, with a re-substitution test and a jackknife test applied to the same datasets. The results show that the proposed method achieves better predictive performance and has a clear advantage over other methods.
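
The encoding and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give exact definitions, so several details here are assumptions rather than the thesis's actual method: the five-bit code is taken as the low five bits of each residue letter's ASCII value, the position feature is taken as the mean normalized position of each residue type, quadruplets are counted over overlapping windows of four bits, and Euclidean distance is used for the nearest neighbor search. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"          # the 20 standard residues
QUADS = list(product((0, 1), repeat=4))       # all 16 possible 4-bit patterns

def aac(seq):
    """20-dimensional amino acid composition: fraction of each residue type."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def position_features(seq):
    """ASSUMPTION: mean normalized position (i / L) of each residue type,
    0.0 when the residue type does not occur in the sequence."""
    n = len(seq)
    sums = {a: 0.0 for a in AMINO_ACIDS}
    counts = {a: 0 for a in AMINO_ACIDS}
    for i, residue in enumerate(seq, start=1):
        if residue in sums:
            sums[residue] += i / n
            counts[residue] += 1
    return [sums[a] / counts[a] if counts[a] else 0.0 for a in AMINO_ACIDS]

def bit_matrix(seq):
    """5 x L binary matrix. ASSUMPTION: column j is the low five bits of the
    ASCII code of residue j, most significant bit in row 0."""
    codes = [ord(residue) & 0b11111 for residue in seq]
    return [[(code >> (4 - row)) & 1 for code in codes] for row in range(5)]

def local_order_features(seq):
    """Relative frequency of each 4-bit pattern in each row of the bit matrix,
    counted over overlapping windows (5 rows x 16 patterns = 80 values)."""
    feats = []
    for row in bit_matrix(seq):
        windows = [tuple(row[i:i + 4]) for i in range(len(row) - 3)]
        counts = Counter(windows)
        total = max(len(windows), 1)          # avoid division by zero if L < 4
        feats.extend(counts[q] / total for q in QUADS)
    return feats

def encode(seq):
    """Concatenate the three feature groups into one vector (non-empty seq)."""
    return aac(seq) + position_features(seq) + local_order_features(seq)

def jackknife_accuracy(seqs, labels):
    """Leave-one-out (jackknife) test with a 1-nearest-neighbor classifier:
    each sequence is predicted from all the others."""
    vecs = [encode(s) for s in seqs]
    correct = 0
    for i, v in enumerate(vecs):
        nearest = min((j for j in range(len(vecs)) if j != i),
                      key=lambda j: math.dist(v, vecs[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(vecs)
```

Under these assumptions each sequence maps to a vector of 20 + 20 + 80 = 120 values. A re-substitution test would instead predict each sequence with itself left in the training set, which for a 1-nearest-neighbor classifier always returns the sequence's own label, so the jackknife test is the more informative of the two.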
Keywords/Search Tags: Subcellular location, Sequence encoding, Feature extraction, Amino acid composition, Nearest neighbor algorithm