
Clustering Research And Information Database Construction Of Affibody Proteins

Posted on: 2016-08-10  Degree: Master  Type: Thesis
Country: China  Candidate: M Li  Full Text: PDF
GTID: 2180330479993479  Subject: Computer system architecture
Abstract/Summary:
Affibody proteins are a class of artificial, high-affinity protein molecules composed of 58 amino acid residues. The amino acids at 13 specific sites in the first and second helices can be randomly mutated to produce an affibody protein library that can, in principle, bind any target molecule; theoretically the library contains 20^13 amino acid sequences (13 sites, each of which may hold any of the 20 amino acids). Affibody proteins are widely applied in many fields of biological science, such as molecular imaging, immunotherapy, drug development and clinical treatment. Affibody proteins that bind a given target protein can be obtained by experimental screening of the library. However, because the library is so large, biological experiments alone cannot meet the needs of practical application: they are costly and time-consuming. This thesis uses clustering analysis to explore a computational approach to affibody protein screening, so as to guide and accelerate the screening experiments.
The present thesis analyzes affibody proteins of different sizes by combining amino acid classification, protein sequence coding and clustering methods. Data set construction, sequence coding, data clustering, comprehensive clustering analysis and evaluation of the clustering effect are studied.
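The scale of the library and the effect of amino acid classification can be illustrated with a short sketch. The four-class grouping by side-chain property used here is a common textbook scheme and is only an assumption; the abstract does not say which classification the thesis uses, and the names `GROUPS`, `library_size` and `reduce_sequence` are hypothetical.

```python
# Hypothetical four-class grouping by side-chain property.
# This is an assumption for illustration, not the thesis's own classification.
GROUPS = {
    "N": "AVLIPFWMG",  # nonpolar
    "P": "STCYNQ",     # polar, uncharged
    "A": "DE",         # acidic
    "B": "KRH",        # basic
}

# Invert the grouping: amino acid letter -> class label.
CLASS_OF = {aa: label for label, members in GROUPS.items() for aa in members}


def library_size(variable_sites: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences when each variable site is chosen freely."""
    return alphabet_size ** variable_sites


def reduce_sequence(seq: str) -> str:
    """Rewrite a sequence in the reduced (class-label) alphabet."""
    return "".join(CLASS_OF[aa] for aa in seq.upper())


full = library_size(13)        # 13 randomized sites, 20 amino acids each
reduced = library_size(13, 4)  # the same sites seen through four classes
short = reduce_sequence("VDNKF")
```

With 13 free sites the full library holds 20^13 (about 8.2 × 10^16) sequences, while the four-class view of the same sites has only 4^13 (about 6.7 × 10^7) distinct patterns, which is one reason a coarse classification makes grouping tractable.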
The main work is as follows:
(1) express the affibody protein sequences using a particular amino acid classification method, construct different data sets, and display the sequences layer by layer as the amino acid classification is progressively refined;
(2) encode the affibody protein sequences using three coding methods: amino acid index, amino acid hydrophobicity value and amino acid composition;
(3) conduct clustering analysis on the data sets generated by the different encoding methods with five different clustering algorithms;
(4) integrate all clustering results to obtain a comprehensive clustering algorithm;
(5) evaluate and analyze the validity of the results against experimentally screened affibody proteins. The results show that, when the amino acids are divided into more than four categories, affibody proteins that bind the same protein are basically clustered into the same class;
(6) construct an information database of experimentally screened affibody proteins capable of binding to a specific protein, so as to provide a platform for the study of affibody proteins.
Based on clustering analysis of affibody protein sequences, this thesis argues that affibody proteins of the same class have similar functions. The results indicate that the method is workable and that the clustering results can be used to guide laboratory screening, improve efficiency and reduce blind trial and error in the experiments.
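Steps (2) and (4) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract names neither the five clustering algorithms nor the integration rule, so the sketch assumes the label lists are already available and merges two sequences when they share a cluster in a majority of the runs (a co-association vote). The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Amino acid composition coding: fraction of each of the 20 residues."""
    counts = Counter(seq.upper())
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def consensus_clusters(labelings, threshold=0.5):
    """Combine several clusterings of the same items by co-association voting.

    labelings: list of label lists, one per clustering run, all the same length.
    Two items end up together if more than `threshold` of the runs put them
    in the same cluster (assumed rule; merging is transitive).
    """
    n = len(labelings[0])
    parent = list(range(n))  # union-find forest over the items

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        agree = sum(lab[i] == lab[j] for lab in labelings)
        if agree / len(labelings) > threshold:
            parent[find(i)] = find(j)

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


vector = composition("AACC")
# Three hypothetical clustering runs over four sequences.
runs = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
merged = consensus_clusters(runs)
```

Voting on pairs rather than on raw labels avoids having to match cluster numbers between runs, since label 0 in one run need not correspond to label 0 in another.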
Keywords/Search Tags: affibody protein, clustering analysis, amino acid classification, comprehensive clustering