
Research On Key Algorithm Design And Optimization Technology In Molecular Co-Evolution Analysis

Posted on: 2018-04-15  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Zhang  Full Text: PDF
GTID: 2370330623950562  Subject: Computer Science and Technology
Abstract/Summary:
Co-evolution is a biological phenomenon that occurs widely in nature and inside living organisms. Co-evolution at the molecular level is called molecular co-evolution: different sites in sequences evolve synchronously in order to maintain the structural or functional interactions between or within molecules. Molecular co-evolution analysis algorithms identify co-evolving regions between molecules, which are then used to predict intermolecular structural and functional binding domains. However, traditional molecular co-evolution analysis tools rely on a mathematical model with known limitations, cannot handle different kinds of biological macromolecule sequences, and cannot process large-scale data. To address these problems, this thesis designs a new molecular co-evolution analysis algorithm that improves on the accuracy and compatibility of traditional methods, and designs and implements a multi-level parallel acceleration model based on CPU-MIC heterogeneous cooperation to meet the needs of large-scale molecular co-evolution analysis. The work covers the following three aspects:

1. Traditional protein-protein co-evolution analysis algorithms have two shortcomings: their detection results are susceptible to single-column noise, and they ignore molecular co-evolution between co-conserved sites. To address these problems, a new protein-protein molecular co-evolution analysis algorithm is proposed. To reduce single-column noise, the concept of the site unit is introduced and a sliding-window strategy is adopted in the search process. The algorithm abandons the traditional co-variation model in favor of a new mathematical model that takes both co-variation and co-conservation information into account. Its detection capability was tested and analyzed on simulated protein sequences generated with typical evolutionary characteristics. The experiments show that the new algorithm can detect co-variation signals and co-conservation signals at the same time, and that its detection ability has certain advantages over the traditional algorithm.

2. Studying the interaction between lncRNA and protein through molecular co-evolution is of great significance for revealing the regulation mechanism of histone modification. However, traditional molecular co-evolution detection algorithms generally target one specific kind of biological sequence, are not compatible with both nucleotide and amino acid sequences, and cannot be applied to different types of input sequence data. In this thesis, the process of calculating the average substitution rate between sequences is abstracted as a distance calculation function. A corresponding distance calculation function is designed for each type of sequence and integrated into the new algorithm. As a result, the algorithm not only supports RNA-protein molecular co-evolution analysis but can also be applied to DNA, RNA and protein sequences. Based on the new algorithm, the molecular co-evolution analysis tool COPCOP is designed and developed as a practical tool that researchers can use to study molecular interactions between various biological sequences.

3. Molecular co-evolution analysis is computationally intensive. When both the sequence length and the number of aligned sequences increase significantly, the computational cost of large-scale molecular co-evolution analysis becomes huge. However, no existing molecular co-evolution analysis tool supports multi-node heterogeneous parallel processing of large-scale data sets. This thesis presents a multi-level parallel optimization of the newly developed tool COPCOP on the Tianhe-2 supercomputer system to meet the requirements of large-scale molecular co-evolution analysis and detection. Based on the OpenMP and MPI parallel programming models and the CPU-MIC heterogeneous collaborative model, it implements mCOPCOP, a large-scale multi-level parallel molecular co-evolution analysis and detection tool. In the multi-node test, mCOPCOP achieved a maximum parallel speedup of 197.14-fold and near-linear scalability, providing an effective solution for large-scale molecular co-evolution analysis.
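The site-unit, sliding-window and distance-function ideas in aspects 1 and 2 can be illustrated with a minimal Python sketch. It is not the COPCOP implementation: the abstract does not describe the thesis's scoring model, so a plain Pearson correlation between pairwise distance vectors (a conventional co-variation score) stands in for it. All function names, the window width, the amino-acid groups and the 0.5 cost are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt

def p_distance(a, b):
    """Fraction of differing positions; a simple stand-in for nucleotide fragments."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical amino-acid groups, used only to show a type-specific distance.
AA_GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
AA_GROUP_OF = {aa: i for i, group in enumerate(AA_GROUPS) for aa in group}

def grouped_protein_distance(a, b):
    """Like p_distance, but a substitution within one group costs 0.5 (assumed)."""
    total = 0.0
    for x, y in zip(a, b):
        if x == y:
            continue
        same_group = x in AA_GROUP_OF and AA_GROUP_OF.get(x) == AA_GROUP_OF.get(y)
        total += 0.5 if same_group else 1.0
    return total / len(a)

def unit_distances(alignment, start, width, distance):
    """Distances between every pair of sequences, restricted to one site unit."""
    fragments = [seq[start:start + width] for seq in alignment]
    return [distance(a, b) for a, b in combinations(fragments, 2)]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def scan(aln_a, dist_a, aln_b, dist_b, width=3):
    """Slide a window over both alignments and score every pair of site units.

    Rows of aln_a and aln_b must refer to the same species in the same order.
    """
    scores = {}
    for i in range(len(aln_a[0]) - width + 1):
        da = unit_distances(aln_a, i, width, dist_a)
        for j in range(len(aln_b[0]) - width + 1):
            db = unit_distances(aln_b, j, width, dist_b)
            scores[(i, j)] = pearson(da, db)
    return scores

# Toy alignments (invented data): four species, one RNA and one protein each.
rna = ["ACGUAC", "ACGUAC", "AGGUAC", "AGCUAC"]
protein = ["MKVLA", "MKVLA", "MRVLA", "MRILA"]
scores = scan(rna, p_distance, protein, grouped_protein_distance, width=3)
```

Passing the distance function as a parameter is what lets one scan loop handle nucleotide and amino acid alignments together. In this stand-in, a fully conserved site unit has zero variance and scores 0.0, which shows why a pure co-variation score cannot see co-conservation.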
Keywords/Search Tags:Molecular Co-evolution, Histone modification, Co-variation Selection, Co-conservation Selection, Algorithm Design, Parallel Optimization, Tianhe-2