Design And Implementation Of Big Data Management Platform For Ruminant Genome Based On MongoDB

Posted on: 2020-07-15; Degree: Master; Type: Thesis
Country: China; Candidate: X M Wu
GTID: 2393330578956458; Subject: Agriculture
Abstract/Summary:
With the development of genomic research, many public databases of molecular biological information have become available to researchers. However, the genomic data of ruminants such as cattle, sheep and camels are scattered across different public databases, so researchers must repeatedly retrieve and download data from several platforms, which costs them time and effort. It is therefore imperative to build an integrated genome storage platform for ruminant livestock. Meanwhile, with the rapid development of sequencing technology and the substantial reduction in sequencing cost, genome sequencing data are growing explosively, and the storage and management of genome data face unprecedented challenges. Because traditional storage technology can no longer meet the needs of the big data era, it is of great significance to establish an integrated management platform for ruminant genomes based on big data technology. The specific work of this thesis is as follows:

(1) First, this thesis surveyed large-scale public databases and the related literature, and designed a data model for the genome database according to the research needs of the Applied Information Biology Laboratory. High-throughput sequencing data, whole-genome data and variation data for cattle, sheep, camels and other species were downloaded from public databases such as NCBI and ENSEMBL, amounting to more than 23,000 items and about 30 TB, which provides sufficient data support for building the platform.

(2) Second, based on the storage and retrieval requirements of massive genomic data in bioinformatics, this thesis proposed a storage scheme for ruminant livestock genomic big data built on a MongoDB distributed cluster. With the Spring MVC framework as the core technology, combined with Ajax and Vue, a fully functional platform was implemented in Java. It provides users with a series of operations on genome data, such as uploading and downloading sequence files, batch-importing genome data from Excel tables, and adding, deleting or modifying genome description documents. Together these form the ruminant livestock genome data management platform.

(3) Finally, the platform was deployed on a five-node MongoDB distributed cluster, and its core functions were tested using the black-box testing method. The results showed that every function worked normally and that the platform offers a good user experience. The platform provides a sound data foundation for the study of ruminant genomics and offers convenience to researchers working on ruminant genomes.
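The batch-import step described in (2) can be illustrated with a minimal Java sketch. It is not the thesis's implementation: the class name, the field names (species, accession, dataType, source), the column order and the sample values are all hypothetical, and plain maps stand in for the documents that would be written to MongoDB. It only shows the general idea of turning spreadsheet rows into document-shaped records and rejecting incomplete rows before storage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the thesis.
final class GenomeBatchImportSketch {

    // Assumed column order of one spreadsheet row (illustrative only).
    static final String[] FIELDS = {"species", "accession", "dataType", "source"};

    // Turn one row into a document-shaped map, or return null if the row is unusable.
    static Map<String, Object> toDocument(String[] row) {
        if (row == null || row.length != FIELDS.length) {
            return null;
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            String value = row[i] == null ? "" : row[i].trim();
            if (value.isEmpty()) {
                return null; // reject rows with a blank required field
            }
            doc.put(FIELDS[i], value);
        }
        return doc;
    }

    // Convert all rows, skipping the ones that fail validation.
    static List<Map<String, Object>> importRows(List<String[]> rows) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (String[] row : rows) {
            Map<String, Object> doc = toDocument(row);
            if (doc != null) {
                docs.add(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        // Placeholder values, not real accession numbers.
        rows.add(new String[] {"cattle", "ACC-0001", "whole genome", "NCBI"});
        rows.add(new String[] {"sheep", "", "variation", "ENSEMBL"}); // blank accession
        System.out.println(importRows(rows));
    }
}
```

In a real deployment the resulting maps would be handed to a database driver for a bulk insert, and rejected rows would more likely be reported back to the user than dropped silently.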
Keywords: Big data, MongoDB, Ruminant genome, Spring MVC