Visualization Of The Quality Control Of The Whole Genome Sequencing Data Of Drug-resistant Bacteria

Posted on: 2021-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhou
GTID: 2434330605463098
Subject: Applied Statistics
Abstract/Summary:
The emergence and spread of drug-resistant bacteria increasingly threaten public health and pose serious challenges for clinical and laboratory testing and for the surveillance of bacterial resistance. Rapid drug susceptibility testing and analysis of resistance mechanisms are urgently needed. Single-cell whole-genome sequencing is well suited to discovering new drug resistance genes because it is fast, requires no culturing, and can reveal heterogeneity. However, single-cell sequencing data are easily contaminated and subject to strong sequencing bias, so they require comprehensive quality control and in-depth analysis. At present, the tools, integrated systems, and analysis methods available for genomic data from drug-resistant bacteria all need improvement.

In this thesis, we first analyze drug-resistant bacteria sequencing data and carry out comprehensive quality control. We collect public data and conduct a basic analysis, which shows that the public data come from a relatively narrow range of sources; we then select a subset as a comparative reference for gene annotation. Next, we design and produce a series of visualizations of the high-throughput processing results, including statistical analysis and visualization of single nucleotide polymorphisms and genomic contamination. We use the t-SNE algorithm to visualize the taxonomic family to which each sequence belongs and find, for example, that sample B3 contains many sequences from Enterobacteriaceae, Corynebacteriaceae and Corynebacterium. Finally, we apply latent class analysis to the genomic sequence data of single samples and latent class regression analysis to those of multiple samples in order to characterize the gene sequences of drug-resistant bacteria, which supports genetic analysis of drug resistance.

Second, we build an integrated visual analysis system. Using R's Shiny framework, we construct a platform for statistical analysis and visualization of single-cell whole-genome sequencing data. The platform presents the analysis as a systematic, step-by-step workflow; it is mainly used to integrate the results of high-throughput data analysis and to analyze and visualize them in depth. Together with the high-throughput analysis platform, it forms a complete visualization system. The system performs well, is convenient to operate, and is practical. It can analyze any number of samples simultaneously, and after Docker deployment it can be used through multiple ports at the same time. Users only need to load their data and click to run complex data analysis and visualization functions. Whole genome sequencing data from a group of drug-resistant bacteria were imported into the platform for evaluation, and the results showed that the platform performed well. The system can improve the efficiency and quality of research and can play an important role in the study of whole genome sequencing data of drug-resistant bacteria.
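
The latent class analysis mentioned above can be illustrated with a small sketch. It is not the thesis code: the keywords point to the R package poLCA, whereas the Python below is a minimal, self-written EM fit for a latent class model, restricted (as an assumption) to binary indicators such as "resistance gene present/absent". The function name `fit_latent_classes` and the toy rows are hypothetical and invented purely for illustration.

```python
import math
import random


def fit_latent_classes(data, n_classes, n_iter=200, seed=0):
    """Fit a latent class model to binary indicators with EM.

    data: list of rows, each a list of 0/1 values (hypothetical
          presence/absence indicators, one row per sequence or sample).
    Returns (weights, probs, resp, log_lik), where
      weights[k]  = estimated share of class k,
      probs[k][j] = estimated P(item j == 1 | class k),
      resp[i][k]  = posterior P(class k | row i) from the last E-step,
      log_lik     = log-likelihood evaluated in the last E-step.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    eps = 1e-9  # keeps probabilities away from exactly 0 or 1
    weights = [1.0 / n_classes] * n_classes
    # Random start so that the classes are not identical at iteration 0.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    resp, log_lik = [], 0.0
    for _ in range(n_iter):
        # E-step: posterior class membership of every row.
        resp, log_lik = [], 0.0
        for row in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for j, x in enumerate(row):
                    p *= probs[k][j] if x else 1.0 - probs[k][j]
                joint.append(p)
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        # M-step: re-estimate class shares and item probabilities.
        for k in range(n_classes):
            mass = max(sum(r[k] for r in resp), eps)
            weights[k] = mass / len(data)
            for j in range(n_items):
                hits = sum(r[k] * row[j] for r, row in zip(resp, data))
                probs[k][j] = min(max(hits / mass, eps), 1.0 - eps)
    return weights, probs, resp, log_lik


if __name__ == "__main__":
    # Hypothetical toy data: two clearly different indicator profiles.
    toy = [[1, 1, 1, 0, 0]] * 10 + [[0, 0, 0, 1, 1]] * 10
    weights, probs, resp, log_lik = fit_latent_classes(toy, n_classes=2)
    print([round(w, 3) for w in weights])
```

Each row is assigned to the class with the largest posterior probability in `resp`. A full analysis along the lines of the abstract would also need polytomous indicators, covariates for the regression variant, and a criterion for choosing the number of classes, none of which this sketch covers.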
Keywords/Search Tags: whole genome sequencing, drug-resistant bacteria, poLCA, visualization, t-SNE