Ocean DNA Study Maps 317 Million Gene Groups

Transmission electron micrograph of Wolbachia within an insect cell. Wolbachia belongs to the Alphaproteobacteria, a class of bacteria that the researchers found in the upper ocean, mesopelagic ocean, dark ocean and benthic realm. Credit: Scott O'Neill, PLoS Biol, doi:10.1371/journal.pbio.0020076

The largest-ever study of ocean DNA has led to the most comprehensive database of marine microbes to date—317 million gene groups from marine organisms matched with biological function, location and habitat type. The so-called KMAP Ocean Gene Catalog 1.0 is a first step toward developing an atlas of the global ocean genome, which will allow scientists to investigate how different ocean ecosystems work, track the impact of pollution and global warming, and search for a myriad of biotechnology applications, among other possibilities.

Although researchers routinely map marine biodiversity, creating a full atlas of ocean life has been challenging because most marine organisms cannot be studied in the laboratory. The advent of DNA sequencing has helped researchers overcome this limitation, especially as the technology has become faster and cheaper over time.

“Since each species has its own set of genes, we can identify which organisms are in an ocean sample by analyzing its genetic material,” said lead author Elisa Laiolo of the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia. “The development of massive computational power and AI technologies have made it possible to analyze these millions of sequences.”

In the current study, published in Frontiers in Science, Laiolo and team used the KAUST Metagenomic Analysis Platform (KMAP) to scan DNA sequences from 2,102 ocean samples taken at different depths and locations around the world. This advanced computing infrastructure identified 317.5 million gene groups, of which more than half could be classified according to organism type and gene function. The team then matched this information with sample location and habitat type, creating a database of unprecedented information on where microbes live and what they do.

Researchers say the catalog has already revealed a difference in microbial activity between the water column and the ocean floor, as well as a surprising number of fungi living in the “twilight” mesopelagic zone. Notably, fungi represented over 50% of the distinct gene clusters identified in the mesopelagic zone. This highlights the contribution of fungi to microbial diversity and has implications for the role of fungi in elemental cycling in the ocean.

Additionally, researchers discovered that the ocean’s biomass ratios do not reflect the corresponding distribution of unique genes: the eukaryotic component of ocean biomass outweighs its contribution to marine genetic diversity, which remains dominated by bacteria. The researchers say this means bacterial genomes contain far more innovation than previously realized.

These and other insights will help scientists understand how microbes living in different habitats shape ecosystems, contribute to ocean health, and influence the climate. The data also serves as a baseline for tracking the effect of human-induced pollution and global warming on marine life.

“This achievement reflects the critical importance of open science,” said the study’s senior author, Carlos Duarte, a faculty member at KAUST. “Building the catalog was only possible thanks to ambitious global sailing expeditions where the samples were collected and the sharing of the samples’ DNA in the open-access European Nucleotide Archive. We are continuing these collaborative efforts by making the catalog freely available.”

Cataloging the genome of the global ocean is a work in progress and will remain so for decades to come. A remaining bottleneck, according to the KAUST researchers, is the need for greater computational power to handle the growing metagenomics datasets, including ongoing access to high-performance supercomputers. For example, 48% of the gene clusters identified in the current study could not be annotated, so the remaining roughly 150 million gene sequences must be compared continually against newly deposited sequences in order to characterize them. Hence, improvements in gene prediction frameworks are needed, together with continuous re-analysis.

“Our analysis highlights the need to continue sampling the oceans, focusing on areas that are under-studied, such as the deep sea and the ocean floor. Also, since the ocean is forever changing—both due to human activity and to natural processes—the catalog will need continual updating,” said Laiolo. 

 
