
Research On The Processing Method Of Insertion And Deletion Errors In DNA Sequencing

Posted on: 2021-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: L X Wang
Full Text: PDF
GTID: 2480306548982899
Subject: Information and Communication Engineering
Abstract/Summary:
Multiplexed sequencing is a key strategy for sharing the rising capacity of next-generation sequencing. Its robustness generally relies on the error-correction capability of the barcodes. However, barcodes are frequently corrupted by insertion, deletion and substitution errors introduced during DNA synthesis, amplification and sequencing, which results in sample misassignment. To handle insertions, deletions and substitutions in multiplexed sequencing, this thesis proposes new construction methods for sequencing barcodes together with highly robust barcode identification methods.

First, to address the high rate of insertions, deletions and substitutions in multiplexed sequencing and the high complexity of decoding sequencing barcodes, a new barcode construction scheme based on a cyclic block code and a predetermined pseudorandom sequence is proposed. A low-complexity barcode identification scheme is also presented. It uses a combination of cyclic shifting and dynamic programming to mark the insertion and deletion positions, then modifies the corrupted codeword and applies erasure-and-error-correcting decoding to it. Simulations verify the robustness of the barcodes against multiple errors and evaluate their reliability in the DNA context.

Second, to further improve the performance of sequencing barcodes against insertions, deletions and substitutions, a barcode construction method based on a general block code and a pseudorandom sequence is proposed, together with a soft-decision identification method. A hidden Markov model for base insertion/deletion estimation is first established using the known pseudorandom sequence. The forward-backward algorithm is then adapted to output soft information for each bit of the block code. Finally, soft-decision decoding is performed to correct multiple errors. Simulation results show that the proposed method is robust to a high number of insertions, deletions and substitutions in the barcodes. In addition, the proposed barcodes show superior performance under two different error scenarios in the DNA context.
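
As a point of reference for the dynamic-programming step mentioned in the abstract, the sketch below shows a generic baseline only: plain Levenshtein (edit) distance computed by dynamic programming, used to assign a corrupted read to the nearest barcode. It is not the thesis's scheme, which combines cyclic shifting, a pseudorandom sequence and erasure-and-error-correcting decoding. The function names `edit_distance` and `identify_barcode`, the `max_errors` threshold and the example barcodes are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row.

    Counts the minimum number of insertions, deletions and
    substitutions needed to turn a into b.
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def identify_barcode(read: str, barcodes: Iterable[str],
                     max_errors: int = 2) -> Optional[str]:
    """Return the unique closest barcode within max_errors edits, else None.

    Hypothetical helper: a tie between two barcodes or a best distance
    above max_errors is treated as "unassigned".
    """
    scored = sorted((edit_distance(read, bc), bc) for bc in barcodes)
    if not scored:
        return None
    best_dist, best = scored[0]
    if best_dist > max_errors:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # ambiguous assignment
    return best


# Illustrative barcodes (assumed, not taken from the thesis).
BARCODES = ["ACGTACGT", "TTGGCCAA", "GATCGATC"]
assigned = identify_barcode("ACGACGT", BARCODES)  # read with one deleted base
```

This brute-force comparison costs one full alignment per candidate barcode, which is the kind of decoding complexity the abstract says the proposed low-complexity scheme aims to reduce.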
Keywords/Search Tags:DNA sequencing barcode, Insertion/deletion, Dynamic programming, Hidden Markov model, Forward-backward algorithm
Related items