First Complete Genome of a Bacterial Strain from Soil

Released: September 22, 2016
Cutting-edge sequencing overcomes grand challenge in soil metagenomics
Using data from native prairie soil in Kansas and computational resources at EMSL, researchers produced the highest-quality and most extensive metagenome assembly of a soil microbial community to date.

The Science

Soil metagenomics has been termed a grand challenge because of the complexity and diversity of microbial communities in soil ecosystems. A recent study addressed this challenge by using long-read genome sequencing technology to improve sequence assembly and to enable the reconstruction of hundreds of individual microbial genomes from complex soil ecosystems.

The Impact

The findings demonstrate the promise of long-read genome sequencing technology for overcoming one of the biggest challenges facing metagenomics: sequence assembly of highly complex soil microbial communities. This approach could improve molecular-level understanding of the functions of different soil microbes in stabilizing soil ecosystems and in the response of soil systems to environmental disturbances.

Summary

Soil microbes carry out key processes for life on our planet, including nutrient cycling and supporting plant growth. However, there is poor molecular-level understanding of their functional roles in ecosystem stability and responses to environmental disturbances, largely because most soil microbes are difficult to culture. Culture-independent approaches such as metagenomics could allow scientists to directly assess the functional roles of soil microbiomes. But the high microbial diversity in soils poses an enormous challenge to metagenomics because of the low coverage obtained for individual populations, uneven sampling of microbes, the large amount of sequence data acquired, and the typical sequencing of short DNA fragments, which can cause artifacts and decrease the reliability of analysis. Despite increasingly large soil metagenome data volumes, the majority of the data have not been amenable to assembly, the process of aligning and merging fragments from a longer DNA sequence in order to reconstruct the original sequence.
To overcome these challenges, researchers from Pacific Northwest National Laboratory (PNNL) turned to the cutting-edge approach of Moleculo synthetic long-read sequencing technology, available from Illumina, Inc., to assemble soil metagenome sequence data into long contigs, which are contiguous sequences built from overlapping sequence reads by reassembling the small DNA fragments.
Through the Great Prairie Soil Metagenome Grand Challenge Initiative spearheaded by the U.S. Department of Energy’s (DOE) Joint Genome Institute, the researchers obtained ~110 gigabase pairs (Gbp) of raw short-read sequence data (~100 bp) from a Kansas native prairie soil. In addition, they sequenced 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp of long-read data (>1.5 kbp) from Moleculo sequencing.
To assemble sequence data, researchers used computational resources at EMSL, the Environmental Molecular Sciences Laboratory, an Office of Science user facility. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. Moreover, the assemblies enabled the reconstruction of the first complete genome of a bacterial strain from a native soil metagenome, as well as hundreds of genomes for both common soil microbes and those with few cultured representatives. This represents the highest quality and most extensive metagenome assembly of a soil microbial community to date, and the first successful reconstruction of hundreds of genomes from a highly complex soil type. Taken together, the findings demonstrate the promise of Moleculo sequencing technology for improving both the accuracy and cost-effectiveness of sequence assembly for highly complex soil microbial communities.
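As a toy illustration of what "assembly" means in the summary above (aligning and merging overlapping fragments to reconstruct a longer sequence, or contig), the following Python sketch greedily merges the pair of reads with the longest suffix-prefix overlap until no overlaps remain. It is not the pipeline used in the study: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all hypothetical, and real metagenome assemblers must also handle sequencing errors, repeats, reverse complements, and far larger data volumes.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy greedy assembly: repeatedly merge the best-overlapping pair."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        # Join the two sequences, writing the shared overlap only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical short reads sampled from the sequence ATGGCGTACGTTAGC.
reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

With short reads, overlaps like these are often too brief or too ambiguous to merge reliably in a highly diverse community, which is one reason longer reads can help assembly.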

Read related PNNL news release.

PI Contact

Janet K. Jansson
Earth and Environmental Sciences Directorate, PNNL
janet.jansson@pnnl.gov

Funding

This work was supported by the U.S. Department of Energy’s Office of Science (Office of Biological and Environmental Research), including support of EMSL, a DOE Office of Science User Facility, and the Microbiomes in Transition Initiative, a Laboratory-Directed Research and Development Program at PNNL.

Publication

R.A. White, III, E.M. Bottos, T. Roy Chowdhury, J.D. Zucker, C.J. Brislawn, C.D. Nicora, S.J. Fansler, K.R. Glaesemann, K. Glass, J.K. Jansson, “Moleculo long-read sequencing facilitates assembly and genomic binning from complex soil metagenomes.” mSystems 1(3), e00045-16 (2016). [DOI: 10.1128/mSystems.00045-16]