Design And Implementation Of Open Data Analysis System Based On OpenMP

Posted on:2016-12-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y Shi

GTID:2208330461987648

Subject:Software engineering

Abstract/Summary:

This thesis analyzes and processes promoter gene data, and designs and implements a promoter data analysis system based on OpenMP. The system is then used to analyze soybean promoter sequences.

First, the 1 kb upstream promoter regions of genes from different plants are extracted as the experimental data. Each sequence is matched against 469 fixed motifs, and the matching positions and match counts of the 469 regulatory elements within the 1 kb promoter sequence are identified. Duplicate matches are removed during matching, as required, and the matching results are then passed to frequent-pattern mining software. Second, P-values are calculated for the frequent-pattern mining results, and a prime splitting algorithm is adopted to increase the precision of the P-value calculation. The value 0.05 / C(469, m) is then used as the filtering threshold, and the frequent combinations of regulatory elements that pass it are selected as the effective combinations in the frequent data. Finally, the effective combinations from the promoter sequences of different plants are compared to obtain the frequent regulatory element combinations that the plants share. These results are given GO functional annotations, from which the relationships between the promoter sequences of different plants are determined. At the same time, it can be determined whether particular sequences have an effect on gene regulation, so the results provide reference data for subsequent gene research.

Because the volume of genetic data is large, the implementation analyzes the dependencies among the task modules that operate on the promoter sequences and among the nested loops, and the independent tasks are processed in parallel with OpenMP. Finally, the execution times of the serial and parallel algorithms are compared. The experimental results show that the parallel algorithm noticeably improves the efficiency of the promoter data analysis system. This efficient processing method has practical significance for subsequent gene research.
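
The abstract gives no code, so the following is only a minimal sketch in C of how independent motif matches might be distributed with OpenMP, and of how the 0.05 / C(469, m) threshold could be evaluated. The function names (`count_motif`, `count_all_motifs`, `significance_threshold`) are hypothetical, not the thesis's. The sketch assumes exact string matching, counts overlapping matches once per start position, and takes m to be the number of regulatory elements in a combination. The thesis's actual matching rules, duplicate removal, and P-value computation may differ.

```c
#include <stddef.h>
#include <string.h>

/* Count the start positions at which `motif` occurs in `seq`.
   Assumption: exact matching; overlapping occurrences are counted. */
static int count_motif(const char *seq, const char *motif)
{
    int hits = 0;
    if (motif[0] == '\0')
        return 0; /* an empty motif matches nothing in this sketch */
    for (const char *p = strstr(seq, motif); p != NULL;
         p = strstr(p + 1, motif))
        hits++;
    return hits;
}

/* Fill counts[i] with the number of matches of motifs[i] in seq.
   Each iteration reads shared input and writes only its own slot,
   so the iterations are independent and can run in parallel. */
void count_all_motifs(const char *seq, const char *const *motifs,
                      int n_motifs, int *counts)
{
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n_motifs; i++)
        counts[i] = count_motif(seq, motifs[i]);
}

/* Threshold 0.05 / C(n, m), with C(n, m) built up multiplicatively
   in floating point so that large binomials do not overflow an int. */
double significance_threshold(int n, int m)
{
    double comb = 1.0;
    for (int k = 1; k <= m; k++)
        comb = comb * (double)(n - m + k) / (double)k;
    return 0.05 / comb;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), the loop over motifs is split across threads. Without that flag the pragma is ignored and the same code runs serially, which is convenient for the kind of serial-versus-parallel timing comparison the abstract describes. For combinations of two elements, `significance_threshold(469, 2)` evaluates 0.05 / 109746, roughly 4.56e-7.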

Keywords/Search Tags:

Promoter, GO Annotation, Motif, P-value, OpenMP

Related items

1	Construction Of The Genomic Promoter Sequence Database And Web Application System
2	Study On Algorithms For DNA Sequence Motif Discovery Based On Gibbs Sampling
3	Research On Compilation And Optimization For OpenMP Programs
4	Research On The Algorithms And Applications For The Motif Discovery Problem
5	Based On Latent Semantic Analysis Of Eukaryotic Promoter Recognition
6	Promoter Recognition System Research From Gene Sequence Data
7	Subtle Motif Discovery Algorithms
8	Research Of Promoter Recognition Using The Rough Set Theory
9	Optimizing Genetic Algorithm For Motif Discovery
10	The Textural Research On Citation Of The Guo Yu And Its Annotation Of Jia Kui And Wei Zhao In Li Shan's Annotation Of Wenxuan