Populus simonii is an important tree in the genus Populus, widely distributed in the Northern Hemisphere and with a long cultivation history. Although this species has considerable ecological and economic value, information on the genetic resources of P. simonii is lacking, which hinders the development of new varieties with wider adaptive and commercial traits.

1. Here, we report a chromosome-level genome assembly of P. simonii using PacBio long-read sequencing data aided by Illumina paired-end reads and related genetic linkage maps. The assembly is 441.38 Mb in length and contains 686 contigs with a contig N50 of 1.94 Mb. With the linkage maps, 336 contigs were successfully anchored into 19 pseudochromosomes, accounting for 90.2% of the assembled genome size. Owing to a lack of markers or to location conflicts, 350 contigs and 126 markers were not anchored to chromosomes.

2. To further improve the integrity and accuracy of the genome assembly, this study also used Hi-C technology to assist the genome assembly of P. simonii. Using the ALLHiC software, 653 contigs (435 Mb) were successfully anchored to 19 chromosomes, accounting for 98.6% of the total genome size, leaving only 33 contigs unanchored. Assessment of genome completeness showed that 1,347 (97.9%) of the 1,375 genes conserved among all embryophytes can be found in the P. simonii assembly, indicating that the completeness of the genome assembly reached 97.9%. The whole-genome paired-end short reads from the Illumina platform, the two batches of long reads from the PacBio system, and the transcriptome reads have been deposited in the SRA database at the National Center for Biotechnology Information (NCBI) under accession numbers SRP071167, SRR9112943, SRR9887262, and SRR9113443, respectively. The genome assembly of P. simonii is available under the GenBank assembly accession number GCA_007827005.2.

3. Analysis of genomic repeat sequences revealed that 41.47% of the P. simonii genome is composed of repetitive elements, with interspersed repeats accounting for 40.17%. A total of 45,459 genes were predicted from the P. simonii genome sequence, and 39,833 (87.6%) of these genes were annotated with one or more related functions.

4. Clustering analysis of gene families from P. simonii and three related species (P. trichocarpa, P. deltoides, and P. euphratica) identified 24,955 gene families containing sequences from the four Populus species, of which 15,556 (62.3%), 4,451 (17.8%), 4,237 (17.0%), and 711 (2.8%) were shared by four, three, two, and only one of these species, respectively. Excluding the gene families shared by all four Populus species, we found that the desert tree species P. euphratica shared 996 gene families with any one or two of the three species P. trichocarpa, P. deltoides, and P. simonii, while 7,692 gene families were shared among the remaining three Populus species. This indicates that P. simonii is more closely related to P. trichocarpa and P. deltoides than to P. euphratica. Phylogenetic analysis indicated that P. simonii and P. trichocarpa should be placed in different sections, contrary to the previous classification based on morphology. Collinearity analysis showed that the genome structure of P. simonii is globally similar to that of P. trichocarpa, without large rearrangements, inversions, or translocations. Additionally, the analysis of gene family expansion and contraction showed that 2,356 gene families were expanded and 5,224 families were contracted in the P. simonii genome compared with the other plant species.

The genome assembly not only provides an important genetic resource for comparative and functional genomics of different Populus species, but also furnishes one of the closest reference sequences for identifying genomic variants in F1 hybrid populations derived by crossing P. simonii with other Populus species.
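
For readers unfamiliar with the summary statistics quoted above, the following is a minimal Python sketch of how a contig N50 and a percentage share are conventionally computed. It is an illustration only, not the pipeline used in this study: the function names n50 and percent are hypothetical, and the contig lengths in the usage note are toy values, not data from the P. simonii assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)
```

As a usage example, n50([8, 5, 4, 3, 2, 1]) walks through the toy lengths 8 and 5 before reaching half of the total of 23, so the N50 is 5. Applied to the gene family counts reported above, percent(15556, 24955) reproduces the 62.3% share of families common to all four species.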