| Multiple sequence alignment is a fundamental information-processing method in bioinformatics, and sequence similarity is one of its main reference bases. It helps researchers analyze the specific functions corresponding to different segments of a sequence and deepens the understanding of sequence structure and evolutionary information, which is of great significance for the study of human evolution. The main idea of sequence alignment is to use mathematical models or algorithms to find the maximum number of base matches between two or more sequences. The matching results reveal the similarity between sequences and their biological characteristics. With the completion of the Human Genome Project, a large amount of raw biological data has become available. To better analyze and use these data, sequence alignment has become one of the most important and commonly used research methods in bioinformatics. How to obtain sequence alignment algorithms with better alignment quality, shorter running time, and lower memory use, together with easy-to-use alignment analysis programs, is a hot and difficult issue in current bioinformatics research. This paper focuses on how to use the fragment information shared between sequences effectively to improve the accuracy of heuristic alignment algorithms. The main work is as follows. First, this paper introduces and analyzes the star alignment algorithm commonly used for highly similar sequences and points out its defects: it is prone to local optima, its range of application is narrow, and the accuracy of its alignments drops significantly when the sequences differ greatly. To address these problems, this paper proposes an improved star alignment algorithm that generates a consensus sequence from a partial order graph, combining partial order alignment with guide tree construction. A guide tree is first constructed according to the similarity between the sequences; pairwise alignments are then carried out in the order given by the guide tree, and a consensus sequence is generated from the final partial order graph. Finally, the consensus sequence is used as the center sequence and aligned with each input sequence to obtain the final alignment. Tests show that the improved star alignment algorithm is more accurate than the traditional star alignment algorithm, which indicates that the local optimum problem is alleviated to a certain extent. Second, on this basis, a SIMD parallel computing strategy is used to parallelize the sequence alignment step of the improved algorithm, which effectively reduces its running time. The test results show that the algorithm performs well. |
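
For orientation, the sketch below illustrates the center-selection step of the traditional star alignment that the abstract takes as its baseline. It is not the improved method proposed here: it contains no guide tree, partial order graph, or SIMD code. The function names `nw_score` and `pick_center` and the scoring values (match 1, mismatch -1, linear gap -2) are assumptions chosen for illustration only.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score, keeping two DP rows.

    The scoring parameters are illustrative assumptions, not values
    taken from the paper.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: prefix of `a` against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def pick_center(seqs):
    """Index of the sequence with the highest summed pairwise score.

    The traditional star alignment uses this sequence as the center and
    aligns every other sequence against it.
    """
    totals = [0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        s = nw_score(seqs[i], seqs[j])
        totals[i] += s
        totals[j] += s
    return max(range(len(seqs)), key=totals.__getitem__)


seqs = ["ACGT", "ACGA", "TCGT"]
center = pick_center(seqs)
print(center, seqs[center])
```

Because the center must be one of the input sequences, a poor choice propagates to every pairwise alignment, which is one way to read the local optimum problem described above; the proposed method replaces this center with a consensus sequence instead.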