NIST 2020 Spectral Library Update Adds 14,000 Metabolites


Scientists routinely rely on the NIST Mass Spectral Library, one of the largest commercially available databases, to identify compounds of interest. That just got a little easier with the NIST20 update, which adds 6,000 human metabolites, 8,000 plant metabolites, 2,000 drugs, 1,000 pesticides and 1,000 lipids.

With nearly 1.5 million spectra in its database already, one of the biggest challenges of updating the NIST Mass Spectral Library every year is adding compounds that will actually help scientists make a difference. So, of the countless compounds out there, which ones should be included in the update?

The answer to that question lies partially in Tytus Mak’s court. Mak, a NIST biostatistician, scours the catalogs of chemical manufacturers and lists of important compounds published by private companies, government agencies and scientific researchers. He then prioritizes the compounds based on their relative importance and the cost of purchasing samples for analysis.
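
The article does not describe how Mak actually weighs importance against cost, so the following is only a minimal sketch of one way such a trade-off could be made: rank candidates by importance per dollar and select them greedily until a budget is spent. The `Candidate` fields, the scores, the prices and the budget are all hypothetical values chosen for illustration, not NIST data or NIST's method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    importance: float  # hypothetical relevance score; higher means more important
    cost_usd: float    # hypothetical price of a sample; assumed to be positive


def prioritize(candidates, budget_usd):
    """Greedy selection: rank by importance per dollar, then buy while budget allows."""
    ranked = sorted(candidates, key=lambda c: c.importance / c.cost_usd, reverse=True)
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c.cost_usd <= budget_usd:
            chosen.append(c)
            spent += c.cost_usd
    return chosen


# Illustrative, made-up candidates
candidates = [
    Candidate("human metabolite A", importance=9.0, cost_usd=300.0),
    Candidate("screening compound B", importance=1.0, cost_usd=50.0),
    Candidate("plant metabolite C", importance=7.0, cost_usd=100.0),
]
selected = [c.name for c in prioritize(candidates, budget_usd=400.0)]
print(selected)
```

With these made-up numbers, the two metabolites fit the budget and the low-importance screening compound is left out. A greedy ranking like this is simple but not guaranteed to be optimal; it is meant only to make the importance-versus-cost idea concrete.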

“One of the main drivers behind our new release is focusing on biologically relevant compounds (i.e. metabolites),” Mak explained to Laboratory Equipment. “Simply buying tens of thousands of compounds is easy; there are 10+ million to choose from on the market, but almost all of them aren’t actually found in nature and are synthesized via combinatoric chemistry for drug screening assays. We did a lot of work in selecting for compounds that people actually care about, with a particular emphasis on metabolomics applications.”

In fact, one of NIST’s goals was to incorporate all the metabolites in the Human Metabolome Database into the 2020 library update. The Human Metabolome Database, funded by the Canadian government, is an electronic database containing detailed information about small molecule metabolites found in the human body. The database contains 114,224 metabolite entries, including both water-soluble and lipid-soluble metabolites, as well as metabolites that would be regarded as either abundant (> 1 µM) or relatively rare (< 1 nM).

“The cutting edge of biomedical research is driven by big-data-driven ‘-omics’ platforms, which include genomics, proteomics, and metabolomics. Metabolomics in particular has emerged in the last decade as one of the most promising fields for revealing biomarkers for disease with very broad clinical applications,” said Mak.

As Mak points out, the spectral library would be useless if it contained compounds that were of no use to scientists. Thus, Mak and his colleague Sara Yang, a NIST computational biologist, relied on direct communication with stakeholders to help them separate the most important compounds from the less important ones.

For example, the 2020 library update includes 246 extractable and leachable compounds, which are contaminants that can enter pharmaceutical products from packaging and processing.

“I was initially unfamiliar with this class of compounds but learned about them from a pharmaceutical industry chemist who emailed me about their importance,” Yang said.

Once NIST scientists decided on the compounds to be included in the update, Yang, who worked on quality control of the library, developed computer algorithms and software tools that mine the useful spectra from gigabyte-scale datasets, process the data, annotate spectra and assist manual inspection. She also developed a quality control pipeline with more than 40 steps that uses raw data statistics, chemical structures and spectra of related compounds to reject contaminants and confirm spectrum quality.

The NIST Mass Spectral Library comes pre-installed on many instruments, and users can purchase the update from their instrument manufacturer or other distributors. Collections of mass spectra used in specialized areas of research can be downloaded for free from the NIST website.

The job of updating the NIST Mass Spectral Library with new compounds will continue, but for now, chemists can identify tens of thousands of compounds more easily.

Photo: In this 1948 photo, a NIST staff member operates an early mass spectrometer. Credit: NIST