
Research On Plant LncRNA Identification Based On Structure Characters

Posted on: 2017-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Wang
GTID: 2180330482995642
Subject: Computer application technology
Abstract/Summary:
As a potentially crucial layer of biological regulation, long noncoding RNA (lncRNA) plays an important role in organisms. LncRNAs are noncoding RNA molecules longer than 200 nucleotides that contain either no Open Reading Frame (ORF) or only a short one. LncRNAs of many kinds have been implicated in a range of developmental processes and in the occurrence and development of some diseases: mutation and abnormal regulation of lncRNAs can influence a variety of complex human diseases, and in plants lncRNAs can regulate vernalization and disease resistance. These emerging regulatory mechanisms suggest that lncRNA is a general regulator. At present, two important questions in lncRNA research are whether all lncRNAs are functional and how they exert their functions. However, plant lncRNAs have been studied far less than lncRNA regulation in human disease, so their regulatory mechanisms remain largely unknown.

To identify plant lncRNA sequences efficiently, this thesis builds a plant lncRNA identification model based on the structure characters of known plant lncRNAs and on plant full-length cDNA (fl-cDNA) sequences, using statistics and computational biology. First, known lncRNAs of the target plant are collected, their secondary structures are predicted, and the results are analyzed to find the structure characters these lncRNAs have in common; these characters later serve as a condition for filtering candidate sequences. Next, fl-cDNA sequences of the target plant are collected and filtered to keep only those longer than 200 nucleotides. Second, the ORF lengths of known lncRNAs are collected and analyzed, and the candidates are filtered again according to the average ORF length. Because homology with known proteins is another indication of protein-coding potential, the remaining sequences are then aligned against annotated non-redundant protein sequences from authoritative databases and filtered by e-value to avoid false positives. Third, the remaining sequences are aligned against known small RNAs to remove candidates that are potential small RNA precursors. Finally, the sequences are filtered one last time using the structure characters obtained in the first step. The sequences that pass all of these steps are identified as plant lncRNAs.

Starting from known plant lncRNA sequences and known plant fl-cDNA sequences, the identification model can find additional candidate plant lncRNAs, and so offers an efficient way to enlarge the set of known plant lncRNAs. In future research, the results may support a reasonable hypothesis about, and explanation of, the regulatory mechanism of lncRNA in plants, and this thesis hopes that the work will help reveal the regulatory mechanisms of plant lncRNAs in plant growth and development. It may also provide a theoretical basis for improving the disease resistance and the cold and drought tolerance of plants.
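The filtering steps described in the abstract can be sketched as a single pass over the candidate sequences. This is only an illustrative outline, not the thesis's implementation: the function names, the e-value cutoff of 1e-5, the forward-strand-only ORF scan, and the rule that a candidate is dropped when its longest ORF exceeds the threshold are all assumptions. The alignment results (best protein e-value per sequence, small RNA hits) and the structure check are taken as precomputed inputs, since the abstract does not name the tools used.

```python
from typing import Callable, Dict, List, Set

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length in nt of the longest ATG..stop ORF in the three forward frames.

    Assumption: only the forward strand is scanned; the abstract does not
    say how ORFs were defined.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def identify_lncrna_candidates(
    cdna: Dict[str, str],                       # sequence id -> fl-cDNA sequence
    max_orf_len: int,                           # e.g. average ORF length of known lncRNAs
    best_protein_evalue: Dict[str, float],      # precomputed protein alignment results
    small_rna_hits: Set[str],                   # ids that align to known small RNAs
    has_lncrna_structure: Callable[[str], bool],  # hypothetical structure check
    min_len: int = 200,
    evalue_cutoff: float = 1e-5,                # hypothetical cutoff
) -> List[str]:
    """Return ids of sequences that survive every filter, in input order."""
    kept = []
    for seq_id, seq in cdna.items():
        if len(seq) <= min_len:
            continue  # length filter: keep only sequences longer than min_len
        if longest_orf_length(seq) > max_orf_len:
            continue  # ORF filter: long ORF suggests coding potential
        if best_protein_evalue.get(seq_id, float("inf")) <= evalue_cutoff:
            continue  # homology filter: significant hit to a known protein
        if seq_id in small_rna_hits:
            continue  # small RNA filter: possible small RNA precursor
        if not has_lncrna_structure(seq):
            continue  # structure filter: lacks the shared structure characters
        kept.append(seq_id)
    return kept
```

In practice the two alignment inputs would come from running an alignment tool beforehand and parsing its output, and `has_lncrna_structure` would wrap whatever structure characters were derived from the known lncRNAs. Ordering the cheap length and ORF checks first keeps the more expensive checks to a smaller candidate set.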
Keywords/Search Tags: lncRNA, lncRNA identification, sequence features, structure features