
Analysis Of Microsatellites Landscapes In Human Genome At 1 Kbp Resolution

Posted on: 2022-07-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Li
GTID: 1520306731968219
Subject: Biology
Abstract/Summary:
Microsatellites are short tandem repeats (STRs) with repeat units of 1-6 base pairs (bp) that are widespread throughout eukaryotic, prokaryotic, and viral genomes. They were regarded as junk sequence when first discovered, but a growing number of studies in recent decades have reported that microsatellites play important structural and functional roles. Studies of human microsatellites nevertheless remain scarce, so a systematic and comprehensive study of how microsatellites are distributed across the 22 autosomes and 2 sex chromosomes of the human genome is needed.

Here, the human reference genome (latest version, GRCh38.p13), comprising 22 autosomes and 2 sex chromosomes with a genome size of about 3.1 billion bp and a sequenced size of about 2.9 billion bp, was used to analyze microsatellite distributions systematically. The reference genome contains over 19 million microsatellites with a total size of 152 million bp, giving a relative density (RD) of 51.70 bp/Kbp (Kbp = 1,000 bp). Across the 24 reference chromosomes analyzed individually, microsatellite numbers range from 190,048 to 1,590,279, total microsatellite sizes from 1,528,466 to 12,686,411 bp, and relative densities from 49.70 to 57.86 bp/Kbp.

However, relative density is a very simple measure and cannot reveal exactly how the microsatellite landscape varies from position to position in the human genome. Building on extensive previous analysis and exploration, an in-house Microsatellites Differential Calculating algorithm was developed, together with the corresponding program Differential Calculator of Microsatellites (DCM), the corresponding analytical system, and the corresponding metric Position-related Differential_n Relative Density (pD_nRD). Together these comprehensively and clearly reveal the distribution of microsatellite density at different positions in large genomes such as the human genome. Because the differential resolution has to be adjusted repeatedly to find the scale that properly reveals the landscape at each position, a systematic analysis was set up to compare microsatellite distributions at many resolutions. This comparison showed that results at 1 Kbp resolution reveal the distributional features more clearly and precisely, and these results were visualized as a total of 58830 microsatellite landscape maps of the human reference genome at 1 Kbp resolution.

These maps show clear microsatellite landscapes at different positions of the human genome. At some positions microsatellites accumulate at high or middle density levels, with density values 6 times and 3~6 times higher than those of their neighborhood, respectively, and with notable statistical significance. Many regions harbor microsatellites with low density variation, reflecting the average microsatellite distribution. In some regions microsatellites occur at very low densities, and at some positions no microsatellites appear at all. The landscape maps are grouped into 3294 HM (High & Middle-high) maps, 18437 M (Middle-high) maps, 36607 Normal maps, 455 EL (Extremely low) maps, and 38 Penta (Pentamicrosatellites) maps, accounting for 5.60%, 31.34%, 62.23%, 0.77%, and 0.06%, respectively. Every investigated chromosome contains HM, M, and Normal maps, but some chromosomes contain no EL or Penta maps.

Drawing on the many published studies reporting the ubiquitous distribution of STRs in genomes, this study revised the previously proposed linear replication slippage model and proposes a folded replication slippage model. The model takes into account the geometric space of nucleotides and the relationship between the phosphodiester bonds and hydrogen bonds in the DNA double strand, and was built from exact calculations in CAD software. It can reasonably explain the mechanism by which STRs arise in genomes.

In conclusion, using massive microsatellite data from the human genome and new analytical methods and systems to produce visualized maps, this study revealed the exact microsatellite landscapes at different positions of the human genome and established the folded replication slippage model to explain the mechanism of STR occurrence. These results contribute to a deeper and more exact understanding of the distribution and origin of human STRs, and provide guidance and reference for further studies of how STRs relate to human genome structure and function.
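
The density measures described in the abstract can be illustrated with a minimal Python sketch. It is not the DCM program or the pD_nRD metric, whose definitions the abstract does not give. It only shows relative density in bp/Kbp, the same quantity computed per fixed-size window (1 Kbp by default), and a fold-over-neighborhood ratio. The interval format, the function names, the `flank` neighborhood size, and the handling of the 3x and 6x boundaries are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical input format: (start, end) per microsatellite,
# 0-based coordinates with an exclusive end.
Interval = Tuple[int, int]


def relative_density(intervals: List[Interval], sequenced_bp: int) -> float:
    """Microsatellite bp per 1,000 sequenced bp (bp/Kbp)."""
    total = sum(end - start for start, end in intervals)
    return 1000.0 * total / sequenced_bp


def window_densities(intervals: List[Interval], chrom_len: int,
                     window: int = 1000) -> List[float]:
    """Density (bp/Kbp) inside each consecutive window of `window` bp."""
    n = (chrom_len + window - 1) // window
    covered = [0] * n
    for start, end in intervals:
        pos = start
        while pos < end:
            w = pos // window
            # Split a repeat that crosses a window boundary.
            stop = min(end, (w + 1) * window)
            covered[w] += stop - pos
            pos = stop
    densities = []
    for w in range(n):
        size = min(window, chrom_len - w * window)  # last window may be short
        densities.append(1000.0 * covered[w] / size)
    return densities


def fold_over_neighbors(densities: List[float], flank: int = 5) -> List[float]:
    """Ratio of each window's density to the mean of its flanking windows.

    `flank` is an assumed neighborhood size; the abstract does not define one.
    """
    folds = []
    for i, d in enumerate(densities):
        nb = densities[max(0, i - flank):i] + densities[i + 1:i + 1 + flank]
        mean = sum(nb) / len(nb) if nb else 0.0
        if mean > 0:
            folds.append(d / mean)
        else:
            folds.append(float("inf") if d > 0 else 0.0)
    return folds


def classify(fold: float) -> str:
    """Label a window with the 6x and 3~6x levels quoted in the abstract.

    Treating exactly 3x and 6x as inclusive lower bounds is an assumption.
    """
    if fold >= 6:
        return "high"
    if fold >= 3:
        return "middle"
    return "normal"
```

As a check on the reported figures, 152 million bp of microsatellites at 51.70 bp/Kbp corresponds to roughly 2.94 billion sequenced bp.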
Keywords/Search Tags:Human reference genome, Microsatellites (SSRs), Differential calculating algorithm, 1 Kbp resolution, Landscape maps, Folded replication slippage model