Font Size: a A A

Hierarchical Evolutionary Tree Construction Basic On Generalized Tight Clusters

Posted on:2015-01-06Degree:MasterType:Thesis
Country:ChinaCandidate:X JinFull Text:PDF
GTID:2250330431953558Subject:Operational Research and Cybernetics
Abstract/Summary:PDF Full Text Request
Phylogenetics is a discipline which concentrates on the regularity of biologi-cal evolution. It describes the relationship between different species by using Phylogenetic trees and has attracted much attention in recent years. A pop-ular methodology in Phylogenetics research is to extract the biosystems as a digraph, of which nodes represent individual species and edges represent the genetic relationships between them. Therefore, Phylogenetic trees with hier-archical structures could be obtained by classifying the nodes in digraphs.Most of the related works are about finite direct graphs, which are limited to the classification of the subsets (i.e. the leaf nodes) only. In this sense, non-leaf nodes are always ignored, which might lead to the situation where a great number of nodes are left out and excluded from all classifications. Meanwhile, existing works on infinite digraphs failed to produce satisfactory results, especially in terms of ideal hierarchical structure, compared with finite graphs. The main reason for this drawback is that these works tend to seek clusters in infinite graphs which are infinite and hierarchical.In this study, we extend the methods of node classification of finite graphs and propose a novel method of hierarchical classification which could adapt to both finite and infinite graphs. Our contributions are mainly three folds. Firstly, this paper defines Generalized Tight Clusters(GTC) which are consist-ed of nodes with satisfactory descendant closure and ancestral closure in finite graph.’Descendant closure’means one set containing its all descendants and ’Ancestral Semi-closure’indicates that presents of any node are also in this set or in some specify subsets which are called limited sets of GTC. All nodes of GTC are children of some nodes in limited sets. But all descendants of nodes in limited sets are in GTC. Since GTC and limited sets are one-to-one cor-respondence, we can find GTC by way of finding limited sets. Secondly, this paper defines Generalized Complete Tight Clusters(GCTC) without boundary points.’Boundary points’are the nodes which their children are in the limited sets of GTC. Finally, we present an algorithm which can find GTC in given graphs, and get GCTC by boundary expansion. We improve our algorithm by adding weights on closed and connecting checking so that the improved algorithm is capable to deal with the Big Data with noise and data missing.We define time-limited generalized tight clusters for infinite graphs, also called Conditional Tight Clusters(CTC). By constraining the birth time of individ-uals, we could get clusters which all of their elements were born before the limited time T. The CTC are related to the limit time T and they would change in line with T. However, for some CTC, their limit sets might stay unchanged while we change T. Such clusters are defined as Uniform Tight Clusters(UTC). Since the CTC and UTC could be certain after their limit sets was fixed, we could use GTC searching algorithm to process the limit sets of CTC and UTC, and calculate the clusters.Compared with the existing methods, our method could deal with more nodes, both leaf nodes and non-leaf nodes, and is easier to find out clusters. Meantime, we have designed an algorithm for searching clusters in graphics and it could be accelerated by Graphics Programming Unit (GPU). Besides, we also describe a weighted classification algorithm so that the clusters could be more robust.
Keywords/Search Tags:Species, Evolutionary Trees, Hierarchies, Clusters
PDF Full Text Request
Related items