Biological Data Storage Based On Hbase And Analysis Of DNA Sequence

Posted on: 2016-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2180330464970724
Subject: Software engineering
Abstract/Summary:
With the exponential growth of biological data, the problems of storing and processing that data during the construction of biological databases have become increasingly prominent. Building such databases calls for the Hadoop platform and an HBase data model, implemented with cloud computing, together with knowledge from other disciplines for sequence data analysis.

To address these large-scale data processing problems, this thesis uses the HBase database of the Hadoop platform. First, UML class diagrams are used to represent the genomic data model and the GenBank file data model, and HBase storage schemes are designed for genomic data and GenBank data files, with particular attention to the storage scheme for sequence data. Based on the DNA sequence storage patterns in HBase, the most suitable scheme for short-sequence alignment is selected, and corresponding functions are proposed that improve efficiency to some extent.

Second, the thesis applies phase space, a concept from nonlinear science, to construct distinct structures for unique sequences. In this construction, K-words and a proposed index are used to calculate the minimum K value, which yields the shortest subsequences that distinguish each sequence from the others. The sequences are then mapped to graphs in phase space so that the differences between them can be observed visually.

Third, using random walks and fractals from nonlinear science, the thesis constructs random walks for DNA numerical sequences and compares different sequences. The Hurst index is calculated in two stages, which are mapped to 2D space, and is used to compare differences between species.
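
The abstract does not give the thesis's exact definitions, so the following Python sketch is only one possible reading of the K-word and random-walk steps, not the author's implementation. It assumes that a sequence is "distinguished" once it has at least one K-word found in no other sequence, that the walk steps +1 for purines (A, G) and -1 for pyrimidines (C, T), and that the Hurst index is estimated by classical rescaled-range (R/S) analysis. All function names are hypothetical.

```python
import math
from itertools import accumulate

# Assumed numeric mapping: purines step up, pyrimidines step down.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}

def kwords(seq, k):
    """Return the set of all length-k substrings (K-words) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def min_distinguishing_k(seqs):
    """Smallest k at which every sequence has a K-word no other sequence has.

    Returns None if no such k exists up to the shortest sequence length.
    """
    for k in range(1, min(map(len, seqs)) + 1):
        words = [kwords(s, k) for s in seqs]
        if all(w - set().union(*(words[:i] + words[i + 1:]))
               for i, w in enumerate(words)):
            return k
    return None

def dna_steps(seq):
    """Convert a DNA string to +1/-1 steps (raises KeyError on non-ACGT)."""
    return [STEP[base] for base in seq.upper()]

def dna_walk(seq):
    """Cumulative sum of the steps: the DNA random walk."""
    return list(accumulate(dna_steps(seq)))

def rescaled_range(window):
    """R/S statistic of one window, or None if the window has zero variance."""
    n = len(window)
    mean = sum(window) / n
    cum = lo = hi = 0.0
    for x in window:
        cum += x - mean
        lo, hi = min(lo, cum), max(hi, cum)
    sd = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    return (hi - lo) / sd if sd > 0 else None

def hurst_rs(steps, min_window=8):
    """Estimate the Hurst index as the slope of log(R/S) against log(window size)."""
    points = []
    size = min_window
    while size <= len(steps) // 2:
        vals = [rescaled_range(steps[i:i + size])
                for i in range(0, len(steps) - size + 1, size)]
        vals = [v for v in vals if v is not None]
        if vals:
            points.append((math.log(size), math.log(sum(vals) / len(vals))))
        size *= 2
    if len(points) < 2:
        raise ValueError("sequence too short for an R/S estimate")
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den
```

For a long sequence, `hurst_rs(dna_steps(seq))` gives a single number per sequence that can be compared across species. Values near 0.5 indicate an uncorrelated walk, values above 0.5 indicate persistence, and values below 0.5 indicate anti-persistence. The R/S estimator is biased for short windows, so small differences should be interpreted with care.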
Keywords/Search Tags: Biological Database, Hadoop, HBase, Phase Space, Hurst parameter