In this paper, promoter gene data are analyzed and processed, and a promoter data analysis system based on OpenMP is designed and implemented. Soybean promoter sequences are analyzed with this system. First, the 1 kb regions upstream of the genes of different plants are extracted as promoter sequences to serve as the experimental data. The sequences are then matched against 469 fixed motifs, and the matching positions and match counts of the 469 regulatory elements in each 1 kb promoter sequence are identified. Duplicate matches are removed during matching as required, and the matching results are then processed by the frequent mining software. Second, P values are calculated for the frequent mining results, and a prime splitting algorithm is adopted to increase the precision of the P value calculation. The threshold 0.05 / C(469, m) is then used as the filtering standard to select the effective combinations of regulatory elements from the frequent data. Finally, the effective combinations from the promoter sequences of heterogeneous plants are compared to obtain the frequent regulatory element combinations shared between them. The shared combinations are processed by GO functional annotation, and the relationships between the promoter sequences of heterogeneous plants are determined from the annotation results. At the same time, it can be determined whether particular sequences affect gene regulation, so the results can provide reference data for subsequent gene research. In implementing the system, because of the large amount of genetic data, the relationships among the task modules for promoter sequences and the interconnected multilayer loops are analyzed, and the uncorrelated tasks are processed in parallel with OpenMP.
Finally, the execution times of the serial and parallel algorithms are compared. The experimental results show that the parallel algorithm significantly improves the efficiency of the promoter data analysis system. This efficient processing method has important practical significance for subsequent gene research.
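As an illustration only, the motif matching step and the filtering threshold might be sketched in C with OpenMP as follows. This is a minimal sketch under stated assumptions, not the system's actual implementation: the function names (`count_matches`, `motif_support`, `binomial`, `p_threshold`) are hypothetical, motifs are treated as plain strings matched exactly, and m is assumed to be the number of regulatory elements in a combination. Each motif is handled by an independent loop iteration, which is the kind of uncorrelated task that can be distributed across threads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the (possibly overlapping) exact occurrences of motif in seq. */
static int count_matches(const char *seq, const char *motif)
{
    int count = 0;
    if (strlen(motif) == 0)
        return 0;
    for (const char *p = strstr(seq, motif); p != NULL;
         p = strstr(p + 1, motif))
        count++;
    return count;
}

/* For each motif, count the sequences that contain it at least once,
 * so repeated matches within one sequence are counted only once.
 * Iterations over motifs are independent, so they can run in parallel.
 * Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
void motif_support(const char *const *seqs, int n_seqs,
                   const char *const *motifs, int n_motifs,
                   int *support)
{
    #pragma omp parallel for schedule(dynamic)
    for (int j = 0; j < n_motifs; j++) {
        int hits = 0;
        for (int i = 0; i < n_seqs; i++) {
            if (count_matches(seqs[i], motifs[j]) > 0)
                hits++;
        }
        support[j] = hits; /* each thread writes a distinct element */
    }
}

/* Binomial coefficient C(n, m), computed in floating point. */
double binomial(int n, int m)
{
    assert(m >= 0 && m <= n);
    double c = 1.0;
    for (int k = 1; k <= m; k++)
        c = c * (n - m + k) / k;
    return c;
}

/* Filtering threshold 0.05 / C(n_motifs, m) for combinations of size m
 * (assumed meaning of m; hypothetical helper). */
double p_threshold(int n_motifs, int m)
{
    return 0.05 / binomial(n_motifs, m);
}
```

With 469 motifs and combinations of two elements, `binomial(469, 2)` is 109746, so `p_threshold(469, 2)` is 0.05 / 109746. Compiling with an OpenMP flag such as `-fopenmp` in GCC enables the parallel loop.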