Subcellular Localization Bioinformatics Prediction And Verification Of TA Protein In Arabidopsis Thaliana

Posted on: 2020-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z S Li
GTID: 2480305735492774
Subject: Developmental Biology
Abstract/Summary:
TA (tail-anchored) proteins are a distinct class of integral membrane proteins found in the organelles of eukaryotic cells. They are anchored in the membrane by a single transmembrane domain (TMD) at the C-terminus, and their N-terminus is exposed to the cytoplasm. TA proteins mediate a variety of important cellular processes, including vesicle trafficking, protein transport and transcriptional regulation. Because the targeting signal lies at the C-terminus, targeting of a TA protein occurs after translation, which makes the TA protein targeting system a useful model for studying post-translational protein targeting. Potential TA proteins in yeast and humans have been identified using bioinformatics methods, but little is known about TA proteins in plants. Predicting and identifying the localization of plant TA proteins is of great significance for extending the post-translational transport model to more organelles and for revealing the functions of TA proteins.

In this study, bioinformatics methods were used to screen and identify all possible TA proteins in Arabidopsis thaliana and to analyze and predict their transmembrane structure. Molecular biological methods were used to characterize TA proteins that have not been fully studied, such as AT2G05310, AT3G09035, AT1G26690 and AT5G42370. The effect of mutating the TMD and Tail (the sequence downstream of the TMD) regions of TA proteins on their subcellular localization was also investigated. The work mainly involves the following aspects:

1. Establishing a TA protein training data set
Starting from dozens of well-studied Arabidopsis TA proteins, homologous proteins in different plants were identified by BLAST alignment at NCBI. The TMHMM v.2.0 program was used to predict the position and length of the TMD of all retrieved proteins. In total, 406 TA protein sequences from different plant species were obtained in this way as candidate TA proteins, and these serve as the training data set for our prediction programs. For each protein in the training data set, the TMD GRAVY (grand average of hydropathy of the residues in the TMD) and the Tail charge were calculated, and the TMD GRAVY, Tail charge and localization information were collected as annotations in the training data.

2. Establishing a model to predict the subcellular localization of TA proteins
Five algorithms (GBDT, SVM, KNN, RF and NB) were used to analyze the targeting data of the TA proteins in the training data set, and the five were combined into a machine-learning model for predicting the subcellular localization of TA proteins. Five-fold cross-validation of the model gave a prediction accuracy of 88.05%.

3. Screening and identification of TA proteins in Arabidopsis thaliana
The complete Arabidopsis proteome was downloaded from The Arabidopsis Information Resource (TAIR) website, and transmembrane helices were predicted with TMHMM v.2.0 for all 48359 sequences in the downloaded proteome. Based on the structural characteristics of TA proteins, proteins containing exactly one TMD, located within 50 amino acids of the C-terminus, were selected, yielding 1037 protein sequences. Of these, only the first transcript of each gene locus was retained, leaving 576 protein sequences. The 576 sequences were analyzed with nearly twenty programs for predicting protein localization signal sequences, and 194 proteins that carry N-terminal signal peptides and are involved in the secretory pathway were removed. Finally, 358 eligible Arabidopsis thaliana TA proteins were obtained.

4. Subcellular localization prediction of TA proteins in Arabidopsis thaliana
The subcellular localization of the 358 Arabidopsis thaliana TA proteins was predicted with the prediction model established above. The results indicated that 221 TA proteins localize to the endoplasmic reticulum, 50 to mitochondria and 87 to peroxisomes.

5. Experimental verification of the subcellular localization of TA proteins

5.1 Construction of TA protein expression vectors
Thirty Arabidopsis thaliana TA proteins were randomly selected as validation targets. The gene sequence corresponding to the C-terminus (including the TMD and Tail) of each protein was cloned and ligated into the vector EYFP-DECR-pCAT, and expression of the fusion protein was observed to determine the subcellular localization of the TA protein.

5.2 Observing subcellular localization in onion epidermal cells
The constructed fusion protein expression vectors were transformed into onion epidermal cells, and the subcellular localization of the TA proteins was observed under a fluorescence microscope. The localization of 23 TA proteins was consistent with the prediction, and 4 of these showed dual localization; the localization of the other 7 TA proteins differed from the prediction. The rate of agreement observed in onion epidermal cells (23 of 30) is slightly lower than the accuracy of the subcellular localization prediction program.

5.3 Effect of site-directed mutation of the TMD and Tail sequences on the subcellular targeting of TA proteins
The Tail sequences of the TA proteins AT1G54400, AT5G48810, AT4G35000, AT5G46850, AT2G27300 and AT1G20970 were mutated to change their Tail charge. The TMD sequences of the TA proteins AT4G35000 and AT5G46850 were mutated to alter their TMD GRAVY. As TMD GRAVY increases, TA proteins originally localized to peroxisomes become more likely to localize to the endoplasmic reticulum and mitochondria. As the Tail charge decreases, TA proteins originally localized to peroxisomes tend to localize to the endoplasmic reticulum. As the Tail charge increases, TA proteins originally localized to the endoplasmic reticulum shift toward mitochondrial targeting. It is concluded that TA proteins targeted to the endoplasmic reticulum and mitochondria may have a relatively high TMD GRAVY; those targeting the endoplasmic reticulum have a relatively low Tail charge, whereas those localized to mitochondria have a relatively high Tail charge. TA proteins with a high Tail charge and a lower TMD GRAVY tend to target peroxisomes.

Innovations of the topic:
1. A strict screening procedure was used to select 358 protein sequences with TA protein structure from the Arabidopsis thaliana proteome, and a subcellular localization prediction model was established by machine learning and used to predict the subcellular localization of these Arabidopsis thaliana TA proteins.
2. The prediction results for the TA proteins were verified by molecular biology methods. The experimental results show that the established subcellular localization prediction model has a high prediction accuracy for TA proteins in Arabidopsis.
3. By changing the amino acid composition of the TMD and Tail sequences, the TMD GRAVY and Tail charge were altered, and the influence of TMD GRAVY and Tail charge on the targeting of TA proteins in cells was examined.
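The screening rule described in aspect 3 (exactly one predicted TMD near the C-terminus, then one transcript per gene locus) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's actual code: the `Prediction` record, both function names and the locus identifiers are hypothetical, and "within 50 amino acids of the C-terminus" is read here as "the whole TMD lies in the last 50 residues", which is only one possible reading.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    # Hypothetical record holding a TMHMM-style result for one protein.
    protein_id: str   # "locus.transcript", e.g. "AT0G00010.1" (made-up ID)
    length: int       # protein length in residues
    helices: list     # predicted TM helices as (start, end), 1-based inclusive

def is_ta_candidate(p, window=50):
    """Assumed rule: exactly one TMD, lying entirely in the last `window` residues."""
    if len(p.helices) != 1:
        return False
    start, _end = p.helices[0]
    return start > p.length - window

def first_transcript_per_locus(preds):
    """Keep the lowest-numbered transcript model for each gene locus."""
    best = {}
    for p in preds:
        locus, _, model = p.protein_id.partition(".")
        n = int(model) if model.isdigit() else 0
        if locus not in best or n < best[locus][0]:
            best[locus] = (n, p)
    return [p for _, p in best.values()]

# Made-up examples; none of these are real Arabidopsis entries.
preds = [
    Prediction("AT0G00010.1", 200, [(170, 192)]),            # single C-terminal TMD
    Prediction("AT0G00010.2", 180, [(150, 172)]),            # second transcript, same locus
    Prediction("AT0G00020.1", 300, [(20, 42)]),              # TMD near the N-terminus
    Prediction("AT0G00030.1", 250, [(10, 30), (220, 242)]),  # two TMDs
    Prediction("AT0G00040.1", 120, []),                      # no TMD
]
candidates = first_transcript_per_locus([p for p in preds if is_ta_candidate(p)])
candidate_ids = [p.protein_id for p in candidates]
print(candidate_ids)
```

The later step that removes proteins with N-terminal signal peptides is not sketched, since it relies on external prediction programs that the abstract does not name.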
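The two features used throughout the work, TMD GRAVY and Tail charge, can be computed along the following lines. This is a minimal sketch: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and a simple net-charge rule (K and R count +1, D and E count -1), neither of which is confirmed by the abstract. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values, the scale conventionally used for GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(segment):
    """Mean Kyte-Doolittle hydropathy of a peptide segment."""
    if not segment:
        raise ValueError("segment is empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in segment.upper()) / len(segment)

def net_charge(segment):
    """Assumed charge rule: K/R count +1, D/E count -1, all other residues 0."""
    s = segment.upper()
    return sum(s.count(aa) for aa in "KR") - sum(s.count(aa) for aa in "DE")

def ta_features(sequence, tmd_start, tmd_end):
    """Return (TMD GRAVY, Tail charge) given 1-based inclusive TMD coordinates."""
    tmd = sequence[tmd_start - 1:tmd_end]
    tail = sequence[tmd_end:]          # everything downstream of the TMD
    return gravy(tmd), net_charge(tail)

# Toy sequence invented for illustration: 5-residue head, 20-residue TMD, 4-residue tail.
toy_sequence = "MSDEK" + "LLLLIIIIVVVVAAAALLLL" + "RKKD"
tmd_gravy, tail_charge = ta_features(toy_sequence, 6, 25)
print(round(tmd_gravy, 2), tail_charge)
```

In practice the TMD coordinates would come from a predictor such as TMHMM, and a different charge convention (for example, one that counts histidine) would change the Tail charge values.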
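The 5-fold cross-validation mentioned in aspect 2 can be illustrated with a dependency-free sketch. It does not reproduce the thesis model: only a hand-written KNN classifier (one of the five algorithms named) is used rather than the five-algorithm combination, and the data points are invented clusters whose placement merely echoes the qualitative trends reported in 5.3. All names and numbers below are assumptions.

```python
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training points nearest to `query`."""
    def sq_dist(row):
        (g, c), _label = row
        return (g - query[0]) ** 2 + (c - query[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def k_fold_accuracy(data, folds=5, k=3, seed=0):
    """Shuffle once, hold out every `folds`-th row in turn, pool the accuracy."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(rows)

def toy_data(seed=1):
    """Invented (TMD GRAVY, Tail charge) clusters; NOT data from the thesis."""
    rng = random.Random(seed)
    centers = {
        "endoplasmic reticulum": (3.0, 0.0),  # high GRAVY, low charge
        "mitochondrion": (3.0, 6.0),          # high GRAVY, high charge
        "peroxisome": (1.0, 6.0),             # lower GRAVY, high charge
    }
    return [
        ((g + rng.uniform(-0.3, 0.3), c + rng.uniform(-0.5, 0.5)), label)
        for label, (g, c) in centers.items()
        for _ in range(20)
    ]

accuracy = k_fold_accuracy(toy_data())
print(f"5-fold accuracy on toy data: {accuracy:.2%}")
```

Because the toy clusters are deliberately well separated, the score here says nothing about real performance; the 88.05% figure in the abstract refers to the thesis's own model and training set.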
Keywords/Search Tags: TA protein, gene gun, subcellular localization prediction, machine learning