
Research On Reconstruction Of Ancestral Genome Based On Maximum Likelihood Criteria

Posted on: 2021-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Qi
Full Text: PDF
GTID: 2480306455973939
Subject: Computer Science and Technology
Abstract/Summary:
Genome reconstruction is an important research field in bioinformatics that supports genome comparison and analysis. Reconstructing an ancestral genome is a phylogenetic study of a group of species, and it yields information about the ancestral species such as gene content, the order and orientation of the genes in the genome, and the nucleotide sequence. This information helps researchers understand the history of species formation and evolution, as well as the evolutionary relationships between species. It is therefore of great significance to study methods for reconstructing ancestral genomes. This thesis proposes an ancestral genome reconstruction algorithm based on the maximum likelihood method and an algorithm for obtaining candidate ancestral genomes.

Main research content:

(1) An algorithm for obtaining candidate ancestral genomes. The algorithm takes the phylogenetic tree and the genomes of known species as input, and obtains candidate ancestral genomes through weight calculation based on the principle of maximum parsimony. It can process a large amount of input data and effectively reduces gene loss in the predicted ancestral genomes. This thesis also studies in depth InferCARs, an ancestral genome reconstruction algorithm based on the maximum parsimony method. InferCARs uses a greedy algorithm to calculate the parsimony value of the target ancestral genome; when the amount of input data is too large, it predicts gene loss. In this thesis, the algorithm for obtaining candidate ancestral genomes is used to improve InferCARs, which effectively reduces the amount of predicted gene loss and improves the accuracy of the predicted ancestral genomes.

(2) An ancestral genome reconstruction algorithm based on the maximum likelihood method. This thesis proposes the RAGMLC (Reconstruction of Ancestral Genomes based on Maximum Likelihood Criteria) algorithm, which predicts the ancestral genome by calculating the maximum likelihood value. The maximum likelihood method is a statistical method based on an evolutionary model. It is statistically consistent, robust, and makes full use of the original data, which helps minimize the prediction error rate. RAGMLC adopts the algorithm for obtaining candidate ancestral genomes to reduce the amount of gene loss in the predicted ancestral genomes and improve prediction accuracy. RAGMLC takes the phylogenetic tree and the genomes of known species as input, and reflects the accuracy of its prediction by calculating the likelihood values of the genomes at all nodes of the phylogenetic tree. InferCARs can only predict the target ancestral genome, whereas RAGMLC predicts the ancestral genomes of all nodes in the phylogenetic tree. Following the biological rules of species evolution, RAGMLC calculates the likelihood of the entire phylogenetic tree to make the prediction results more accurate.

This thesis tests both RAGMLC and InferCARs in experiments on simulated data and on real data. The results show that the ancestral genome predicted by RAGMLC is closer to the real ancestral genome, and that the amount of gene loss in the ancestral genomes predicted by RAGMLC is significantly lower than with InferCARs. RAGMLC's predictions also score better on other evaluation criteria, such as the DCJ (double cut and join) distance, the shortest rearrangement distance between two genomes.
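To make the idea of calculating the likelihood of the entire phylogenetic tree concrete, the following is a minimal sketch of a standard pruning-style likelihood computation for a single binary character (for example, whether a given gene adjacency is present or absent in each genome). It is not the RAGMLC algorithm: the abstract does not specify the evolutionary model, so the symmetric two-state model, the `rate` parameter, the uniform root prior, the nested-list tree encoding, and all function names are assumptions made for illustration only.

```python
import math


def transition_prob(same, branch_length, rate=1.0):
    """Probability that a binary state stays the same (or flips) along a branch.

    Assumed model: symmetric two-state Markov process (hypothetical choice).
    """
    decay = math.exp(-2.0 * rate * branch_length)
    return 0.5 * (1.0 + decay) if same else 0.5 * (1.0 - decay)


def partial_likelihoods(node, leaf_states, rate=1.0):
    """Return [L(absent), L(present)] for the subtree rooted at `node`.

    A leaf is a species name (str); an internal node is a list of
    (child, branch_length) pairs. This encoding is an assumption.
    """
    if isinstance(node, str):
        observed = leaf_states[node]
        return [1.0 if observed == 0 else 0.0, 1.0 if observed == 1 else 0.0]

    result = [1.0, 1.0]
    for child, branch_length in node:
        child_l = partial_likelihoods(child, leaf_states, rate)
        for parent_state in (0, 1):
            # Sum over the unknown child state, weighted by transition probability.
            result[parent_state] *= sum(
                transition_prob(parent_state == child_state, branch_length, rate)
                * child_l[child_state]
                for child_state in (0, 1)
            )
    return result


def tree_likelihood(root, leaf_states, rate=1.0):
    """Likelihood of the observed leaf states, with a uniform prior at the root."""
    absent, present = partial_likelihoods(root, leaf_states, rate)
    return 0.5 * absent + 0.5 * present


# Hypothetical three-species tree: ((A:0.1, B:0.1):0.2, C:0.3)
example_tree = [([("A", 0.1), ("B", 0.1)], 0.2), ("C", 0.3)]
example_states = {"A": 1, "B": 1, "C": 0}  # adjacency present in A and B only
print(tree_likelihood(example_tree, example_states))
```

In a sketch like this, the partial likelihoods at an internal node indicate which ancestral state is better supported at that node, and the product of such per-character likelihoods over all characters would give a score for the whole tree. How RAGMLC actually defines its characters and combines them is not stated in the abstract.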
Keywords/Search Tags: maximum likelihood method, maximum parsimony method, ancestral genome, phylogenetic tree, reconstruction