| Soybean (Glycine max) is one of the most important oil crops. Soybean was domesticated in China, but the exact place of its domestication remains controversial. The genetic diversity of cultivated soybean is narrow because of the domestication bottleneck and subsequent selection in modern breeding programs. Its wild relative (Glycine soja) has the potential to provide additional genetic resources for soybean breeding. With the rapid development of next-generation sequencing (NGS) technologies, de novo sequencing and resequencing of genomes provide a faster and more convenient way to investigate genetic diversity. Recently, more than 10 wild soybean genomes have been resequenced; however, none of them came from southern China. In this study, a wild accession (Lanxi1) from Zhejiang Province, in the lower Yangtze River region, was collected and deeply sequenced on the Illumina HiSeq 2000 platform. To aid gene annotation, the transcriptome of Lanxi1 leaf tissue was also sequenced by RNA-Seq. A total of 55 Gb of raw reads were generated, and 53.4 Gb of clean data remained after filtering out low-quality reads, corresponding to an average depth of 54.9X and coverage of more than 95.75% of the Williams 82 reference genome; these clean data were used for assembly. After de novo assembly, a scaffold sequence set of the Lanxi1 genome was obtained with an N50 of 51,008 bp. In total, 92,163 protein-coding genes were predicted with an ab initio gene-finding method (AUGUSTUS), of which 77,426 (84%) and 74,292 (80.6%) showed significant similarity to the public reference gene set and to the Lanxi1 RNA-Seq transcripts, respectively. Approximately 4.2 million SNPs and 0.7 million InDels were identified between Lanxi1 and the Williams 82 reference. Relative to 17 other wild and 14 cultivated soybean genomes, 10 Mb and 22.4 kb of Lanxi1-specific sequences were detected, respectively. Conversely, 3.1 Mb of sequences specific to the Williams 82 reference were identified by mapping genomic data from Lanxi1 and the 17 other wild accessions. 
The 3.1 Mb of cultivated-soybean-specific sequences, containing 127 genes, were further tested for neutrality using Tajima’s D test; 67 regions (hosting 20 genes) showed significant signals of positive selection, suggesting potential roles in soybean domestication and agronomic traits. Finally, SNP densities and phylogenetic trees of Lanxi1 and other accessions relative to the reference indicate that Lanxi1 is genetically more divergent from cultivated soybeans than all other wild lines. This result favors the hypothesis that soybean originated in northern China. Taken together, our research provides a valuable wild genome for future soybean breeding and evolutionary research. |