Genetic Variant Data Processing Based On JSON

Posted on: 2019-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: DILINI DULANGIKA JATHUNGA DAHA
GTID: 2428330566497333
Subject: Computer Science and Technology
Abstract/Summary:
Next-generation sequencing is now used in a wide range of studies in fields such as health genetics, phylogenetics, and microbiology. The technology has also created serious difficulties: large genomic data sets are hard to generate, store, and visualize, and researchers in biomedical fields constantly need the help of expert bioinformaticians. At present, VCF (Variant Call Format) data are normally stored and handled on high-performance computing clusters. VCF is a standardized format for saving and reporting genomic variations such as SNPs, indels, and larger structural variants, together with rich annotations. One of the biggest challenges is how to manage the huge amount of VCF data effectively and efficiently, since processing and analyzing VCF data usually requires large storage space and high computational speed. In response, several formats have been introduced for variant storage and representation, and large VCF files can now be processed through database mapping conversions. A number of tools for extracting data from VCF files have appeared, but JSON-based conversion tools are rarely among them.

Tools that can deal with huge VCF files are an essential requirement for the bioinformatics industry. Current efforts are directed at developing methods such as conversion tools that give more effective and efficient results, and at user-friendly graphical user interfaces (GUIs) that extend access to non-bioinformaticians. In this thesis, a mapping from VCF to JSON is designed, and on that basis a conversion tool (VCF parser) that transforms VCF data into JSON data is designed and developed. VCF parser is a web application in which the user can input a VCF file or a zipped batch of VCF files and retrieve the JSON output. It is developed in Python, with a user interface built using HTML5 and the jQuery library. The final experimental results show that the conversion from VCF to JSON achieves zero data loss and that large files are reduced to smaller sizes, which could effectively improve storage performance.
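To make the idea of a VCF-to-JSON mapping concrete, the following is a minimal Python sketch. It is not the thesis's actual mapping or the VCF parser tool, neither of which is described in this abstract. The function names (`parse_info`, `vcf_to_document`), the JSON layout (`meta`, `records`, `samples`), and the sample record are all assumptions chosen for illustration.

```python
import json


def parse_info(info):
    """Turn an INFO string such as 'DP=14;AF=0.5;DB' into a dict.

    Keys without '=' are flags and are stored as True.
    Values are kept as strings so nothing is lost in conversion.
    """
    if info in (".", ""):
        return {}
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def vcf_to_document(lines):
    """Map VCF lines to a JSON-serializable dict (hypothetical layout)."""
    meta, header, records = [], None, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # Meta-information lines are kept verbatim, minus the '##'.
            meta.append(line[2:])
        elif line.startswith("#"):
            # Column header line: '#CHROM  POS  ID ...'
            header = line[1:].split("\t")
        else:
            if header is None:
                raise ValueError("data line found before the #CHROM header")
            fields = line.split("\t")
            # The first eight columns are the fixed VCF fields.
            record = dict(zip(header[:8], fields[:8]))
            record["POS"] = int(record["POS"])
            record["ALT"] = record["ALT"].split(",")
            record["INFO"] = parse_info(record["INFO"])
            if len(header) > 9:
                # Column 9 is FORMAT; later columns hold one sample each.
                keys = fields[8].split(":")
                record["samples"] = {
                    name: dict(zip(keys, value.split(":")))
                    for name, value in zip(header[9:], fields[9:])
                }
            records.append(record)
    return {"meta": meta, "records": records}


# A tiny made-up VCF with one variant and one sample.
SAMPLE = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "NA00001"]),
    "\t".join(["20", "14370", "rs6054257", "G", "A", "29", "PASS",
               "DP=14;AF=0.5;DB", "GT:DP", "0|1:8"]),
])

document = vcf_to_document(SAMPLE.splitlines())
print(json.dumps(document, indent=2))
```

In this sketch, field values are left as strings wherever the type is not obvious, which is one simple way to avoid losing information during conversion. A fuller tool would presumably use the types declared in the `##INFO` and `##FORMAT` header lines.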
Keywords/Search Tags:Variant Call Format, Storage, Query, Parser, Conversion Tool