LEA Gene Classification and Bioinformatics Analysis of LEA Gene Codon Bias and Drought Resistance

Posted on: 2016-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: X J Zhang
GTID: 2180330479475691
Subject: Biochemistry and Molecular Biology

Abstract/Summary:
Oases account for only about 5% of Xinjiang's total area, so suitable land is very scarce. Annual precipitation is low and the climate is dry, so plants growing there face a variety of abiotic stresses. Scientists have identified many stress-resistance genes. Some encode substances that directly protect cells against adverse conditions, while others encode substances that take part indirectly in the stress response. Among these genes, the LEA (late embryogenesis abundant) genes are among the most diverse in their resistance roles and have attracted considerable attention from researchers, so LEA gene research is important both for the Xinjiang region and for the world. In this study, Perl scripts based on BioPerl were written to download LEA gene sequences remotely from NCBI and build a complete and accurate data set. Further Perl scripts were then written on this data set to count k-mer frequencies and classify the LEA genes, and bioinformatics software was used to study LEA gene codon bias and resistance. The aim is to provide a new basis for understanding the deeper mechanism of LEA gene resistance and its application, and to contribute to agricultural development under extreme conditions. The research covers the following aspects:

Part Ⅰ. Downloading LEA gene data from NCBI with BioPerl
In biological research, data are usually obtained by experiment or screened manually from databases, but this only works for small amounts of data. When large quantities of biological data are needed, collection has to rely on computer technology. Constructing the LEA gene data set requires a large amount of data, so bioinformatics methods were needed to collect it. Using BioPerl, this study designed two methods that successfully download LEA gene data remotely. The first method uses LEA keywords (i.e. aliases) as the retrieval condition; it retrieves the most comprehensive set of LEA genes with high accuracy. The second method uses the conserved domain of the LEA protein sequence as the search condition; because an exact match is required, the resulting LEA gene data are accurate but not comprehensive. The data set was constructed by weighing the advantages and disadvantages of the two methods.

Part Ⅱ. LEA gene classification based on k-mer frequency
With the rapid development of bioinformatics and the life sciences, more and more LEA genes have been found, and the number of records in NCBI now exceeds 40,000. This suggests that the existing classification method is no longer applicable. The literature also shows that the functional characteristics within each class are not uniform, so a new classification method for LEA genes is needed. In this part, k-mer frequencies were counted with bioinformatics methods and used to reclassify the LEA genes: the LEA7 and LEA2 families are merged into one family and the other families are unchanged, giving six families in total, and each family is further divided into subfamilies according to its frequency differences. A comparison experiment using a guide tree built with the Vector software was carried out to check the accuracy of the results.

Part Ⅲ. Analysis of LEA gene codon bias
Codon bias is one of the main factors influencing the expression of exogenous genes, but it has been little studied for LEA genes. The literature contains only analyses of particular LEA gene sequences, and no study has examined the codon bias pattern of LEA genes as a whole. This study used the CHIPS and CUSP programs in EMBOSS to analyze LEA gene codon bias. Based on the ENc value, GC content and related measures, it is concluded that codon bias in LEA genes is not strong and that the most frequently used codons end in A/T.

Part Ⅳ. Analysis of the relationship between LEA gene resistance and the number of repeats of its conserved domain sequence
Because of the limitations of biological experiments and of current technology, the mechanism behind the resistance function of LEA genes is still unclear, although the literature offers some clues. This study proposes the hypothesis that LEA gene resistance is related to the number of repeats of its conserved fragments. To test it, the LEA3 group sequence data were taken as the research object and analyzed with three online tools: BLAST, ProtParam (http://web.expasy.org/protparam/) and SOPMA (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?Page=npsa_sopma). The analysis concluded that hydrophilicity in the LEA3 group is related to the repeated fragments and to the proportion of alpha helix. That is, the resistance of LEA3 family proteins is indirectly correlated with the repeats of their conserved domain sequence and with the proportion of alpha helix.
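The k-mer counting step in Part Ⅱ and the A/T-ending observation in Part Ⅲ can be illustrated with a small sketch. The thesis used Perl with BioPerl; the version below is written in Python instead. The function names, the default k = 3, the Euclidean distance between frequency vectors and the skipping of ambiguous bases are all assumptions made for illustration, not the author's scripts or classification procedure, and the A/T share is a much simpler statistic than the ENc value reported by EMBOSS.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def kmer_frequencies(seq, k=3):
    """Relative frequency of every overlapping k-mer in a DNA sequence.

    Windows containing symbols other than A/C/G/T are skipped
    (an assumption; the thesis does not say how they were handled).
    """
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(w for w in windows if set(w) <= set(BASES))
    total = sum(counts.values())
    # Emit all 4**k k-mers so that vectors from different genes line up.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product(BASES, repeat=k)
    }


def frequency_distance(freq_a, freq_b):
    """Euclidean distance between two k-mer frequency vectors (same k)."""
    return sum((freq_a[m] - freq_b[m]) ** 2 for m in freq_a) ** 0.5


def at_ending_share(cds):
    """Fraction of in-frame codons whose third base is A or T."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0
    return sum(c[2] in "AT" for c in codons) / len(codons)
```

Under these assumptions, sequences whose frequency vectors lie close together would be candidates for the same family or subfamily, and a high `at_ending_share` across a data set would be consistent with the reported preference for A/T-ending codons.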
Keywords/Search Tags: LEA gene, k-mer frequency, codon bias, bioinformatics, resistance