Research On Visualization And Clustering Of Standard Synthetic Biology Parts Based On Nonlinear Dimensionality Reduction

Posted on: 2017-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: R C Li
Full Text: PDF
GTID: 2308330485462224
Subject: Computer application technology
Abstract/Summary:
Visualization is an efficient tool for expressing data and has been applied successfully in many areas, e.g. information, biology, and synthetic biology. The rapid development of these new areas poses great challenges to visualization. In synthetic biology, for example, there are numerous standard parts, which makes it hard to choose a part when constructing devices. Visualizing these parts could simplify part selection. To achieve this goal, we carried out the following research:

1) Dimension reduction and visualization. The first step of dimension reduction analysis for biological components is to calculate the similarity between two components. The choice of similarity calculation method significantly affects the clustering. Because synthetic biology parts are DNA segments of various lengths, the similarity of these parts was evaluated by combining edit distance with a Gaussian kernel. Based on this similarity, Laplacian Eigenmaps is employed to reduce the data to 2 or 3 dimensions. Visualizing the reduced data demonstrates its discriminative power: parts with different functionality are separated efficiently, which could significantly improve the efficiency of part selection.

2) Clustering. Clustering is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar, in some sense, to each other than to those in other groups. The biological component sequences were clustered with the k-means algorithm. In the visualization of the reduced data, parts with similar functionality cluster together and parts with different functionality are separated efficiently. The clustering accuracies for two kinds and three kinds of parts reach 91.6% and 82.4%, respectively.

Through the above research, we present a method that can improve the efficiency of building synthetic biological devices, and we demonstrate its effectiveness using cluster analysis. This method makes it easy to choose a part when constructing devices.
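
The similarity step described in the abstract (edit distance combined with a Gaussian kernel) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the default `sigma`, and the choice to normalize the edit distance by the longer sequence length are all assumptions made here for illustration, since the abstract does not state how the two are combined.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def gaussian_similarity(a: str, b: str, sigma: float = 0.5) -> float:
    """Map edit distance to a similarity in (0, 1] with a Gaussian kernel.

    Assumption: the distance is divided by the longer length so that
    sequences of different lengths are comparable. sigma is hypothetical.
    """
    longest = max(len(a), len(b)) or 1  # avoid division by zero
    d = edit_distance(a, b) / longest
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def similarity_matrix(seqs, sigma: float = 0.5):
    """Pairwise similarity matrix for a list of DNA sequences."""
    return [[gaussian_similarity(s, t, sigma) for t in seqs] for s in seqs]


if __name__ == "__main__":
    # Made-up toy sequences, not real registry parts.
    parts = ["ACGTACGT", "ACGTTCGT", "TTTTGGGGCC"]
    for row in similarity_matrix(parts):
        print(["%.3f" % v for v in row])
```

A matrix of this kind could then serve as the precomputed affinity for a spectral embedding (for example scikit-learn's `SpectralEmbedding` with `affinity="precomputed"`), followed by k-means on the 2- or 3-dimensional coordinates. Whether the thesis used that library is not stated.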
Keywords/Search Tags: Visualization, Synthetic Biology, Nonlinear Dimensionality Reduction, Edit Distance, Clustering