Microbiome catalog: Researchers index over 200,000 human gut genomes
30 Jul 2020 --- A unified catalog of over 200,000 non-redundant reference genomes from the human gut microbiome has been created by an international research team. The Unified Human Gastrointestinal Genome (UHGG) collection compiles 171 million protein sequences from more than 4,600 bacterial species in the human gut, revealing its tremendous diversity. Combined with the team’s Unified Human Gastrointestinal Protein (UHGP) catalog, the collections will facilitate the exploration of the links between bacterial genes and proteins, and consequently, their effects on human health.
“This immense catalog is a landmark in microbiome research and will be an invaluable resource for scientists to start studying and hopefully understanding the role of each bacterial species in the human gut ecosystem,” explains principal investigator Nicola Segata from the University of Trento, Italy.
The UHGG project was led by European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) and includes collaborators from the Wellcome Sanger Institute, the University of Trento, the Gladstone Institutes in the US and the US Department of Energy Joint Genome Institute.
Previously, researchers at EMBL’s European Bioinformatics Institute and the Wellcome Sanger Institute identified almost 2,000 gut bacterial species using a range of computational methods. Because those findings were based mostly on European and North American samples, the researchers flagged the need for more data from other regions of the world.
The new work, published in Nature Biotechnology, revealed that 40 percent of the UHGP lacks functional annotations. Notably, 3,312 (71 percent) of the detected bacterial species had never been cultured in the researchers’ lab – their activity in the body remains unknown and has yet to be experimentally characterized. The largest group of bacteria that falls into that category is the Comantemales.
“It was a real surprise to see how widespread the Comantemales are. This highlights how little we know about the bacteria in our gut,” explains Dr. Alexandre Almeida, EMBL-EBI/Sanger Postdoctoral Fellow in the Finn Team. “We hope our catalog will help bioinformaticians and microbiologists bridge that knowledge gap in the coming years.”
More microbes than human cells
Bacteria produce proteins that affect human digestion and, consequently, our susceptibility to diseases. They are so prevalent that the body is estimated to contain more cells in its microbiome – the bacteria, fungi, and other microbes – than it has human cells.
To understand the role that bacterial species play in human biology, scientists usually isolate and culture them in the lab before they sequence their DNA. However, many bacteria thrive in conditions that are not yet reproducible in a laboratory setting.
To obtain information on such species, researchers take another approach: they collect a single sample from the environment – in this case, the human gut – and sequence the DNA from the whole sample. They then use computational methods to reconstruct the individual genomes of thousands of species from that single sample. This method, called metagenomics, offers a powerful alternative to isolating and sequencing individual species’ DNA.
Where did you get your genes from?
To standardize the genome quality across all sets, the researchers used thresholds of at least 50 percent genome completeness and no more than 5 percent contamination. A total of 286,997 genome sequences met these criteria.
These represented 204,938 non-redundant genomes, only considering one genome per species per sample to account for the fact that the three large metagenome-assembled genome (MAG) studies analyzed many samples in common.
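The filtering and de-duplication described above can be illustrated with a minimal Python sketch. It is not the researchers' pipeline: the `Genome` record, its field names and the tie-breaking score (completeness minus contamination) are assumptions made for illustration. Only the two thresholds and the "one genome per species per sample" rule come from the article.

```python
from dataclasses import dataclass

# Thresholds quoted in the article.
MIN_COMPLETENESS = 50.0   # percent, "at least"
MAX_CONTAMINATION = 5.0   # percent, "no more than"


@dataclass(frozen=True)
class Genome:
    """Hypothetical record; the field names are illustrative assumptions."""
    genome_id: str
    species: str
    sample: str
    completeness: float   # percent
    contamination: float  # percent


def passes_quality(genome: Genome) -> bool:
    """Apply the completeness and contamination thresholds."""
    return (genome.completeness >= MIN_COMPLETENESS
            and genome.contamination <= MAX_CONTAMINATION)


def quality_score(genome: Genome) -> float:
    """Assumed tie-breaker for choosing among duplicates (not from the article)."""
    return genome.completeness - genome.contamination


def dereplicate(genomes: list[Genome]) -> list[Genome]:
    """Keep one quality-passing genome per (species, sample) pair."""
    best: dict[tuple[str, str], Genome] = {}
    for genome in genomes:
        if not passes_quality(genome):
            continue
        key = (genome.species, genome.sample)
        current = best.get(key)
        if current is None or quality_score(genome) > quality_score(current):
            best[key] = genome
    return list(best.values())
```

In this sketch, two genomes of the same species recovered from the same sample by different studies collapse into a single entry, which is how overlapping sample sets would shrink a raw count into a smaller non-redundant one.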
The genomes were recovered from samples collected in a total of 31 countries across six continents (Africa, Asia, Europe, North America, South America and Oceania), but the majority originated from samples collected in China, Denmark, Spain and the US.
Although this catalog provides a “very rich source of information” for microbiologists and clinicians, Dr. Almeida flags that much work is yet to be done. “We will likely discover many more novel bacterial species in under-represented geographical areas like South America, Asia and Africa. We still don’t know much about the variation in bacterial diversity across different human populations.”
Digging deep in the digestion
Genome sequencing and categorization have taken off in the scientific community. Recently, NutritionInsight reported on NutriGenomix growing its gene testing panel of validated markers from 45 to 70 genes. Microbiome discovery company Eagle Genomics and Cargill collaborated to enable the digital transformation of microbiome across the former’s global locations.
In the same sphere, a collaboration between Danone Nutricia Research and the University of California, San Diego’s Microsetta Initiative is recruiting hundreds of US citizens to map their gut microbiomes. The project offers participants an opportunity to get their microbiome sequenced and tested, free of personal charge.
“Using the UHGG or UHGP catalogs, the community can now screen for the prevalence and abundance of species or genes in a large panel of intestinal samples and specific clinical contexts. By pinpointing particular taxonomic groups with biomedical relevance, more targeted approaches could be developed to improve understanding of their role in the human gut,” the study concludes.
By Anni Schleicher