
If you’ve read some of my editorials on scientific publishing in the past, you may have gotten the (correct) impression that I’m an avid proponent of open access. I strongly believe it is one of the keys to the future of scientific development and innovation. That’s why a recent study from Penn State researchers caught my eye.

According to Rick Gilmore, associate professor of psychology at the university, data sharing may actually play a significant role in resolving science’s reproducibility crisis.

For years, the popular criticism of scientific researchers has been their inability to reproduce certain studies, punctuated by high-profile retractions such as the memorable STAP stem cell paper out of the RIKEN Institute a few years ago. Couple that with the life sciences industry, where some studies estimate that 50 percent of published data is irreproducible, and all of a sudden it’s not a problem; it’s a crisis.


But Gilmore may have a relatively simple solution.

According to his study, in psychological and brain sciences at least, irreproducibility has more to do with the complexity of managing data than with incorrect or hidden methods and results.

Gilmore uses cognitive neuroscience as his example. It is a computationally intensive field that produces data files in a variety of sizes and formats. There’s data from EEGs, fMRIs, MRIs and CT and PET scans. Then there are video and audio recordings, surveys and computer-based tasks. Yet there are relatively few organized initiatives to encourage the sharing of these different file types, nor is sharing widespread.

“Right now, data sharing is still largely unfunded and unrewarded and is only rarely required,” said Gilmore.

To that point, Gilmore and his co-authors suggest making data sharing a requirement for federal grant funding. Publishers of scientific journals could also mandate data accessibility as a condition of publication. On an encouraging note, some journals have already begun to do this.

We’ve also recently seen other positive trends in creating a more open, and subsequently reproducible, environment. For example, more and more researchers are sharing not only their data but the computer software they used to analyze it. At the same time, technology is continuing to improve, with developers creating new web-based management tools and software to help scientists work with and share their data. 

Gilmore is also the founding co-director of the Databrary Project, which is a web-based digital library for storing, managing, preserving, analyzing and sharing video. The project, funded by the National Science Foundation and the National Institutes of Health, aims to promote data sharing, archiving and reuse among researchers who study human development.

Another example is the website protocols.io, an up-to-date, open-access, collaborative repository of scientific methods and protocols. Founded almost two years ago by MIT postdoc Lenny Teytelman, protocols.io is the result of his own horrific research experience. He spent a year and a half conducting research before discovering that a single step of the FISH (fluorescence in situ hybridization) microscopy method he was using, which was published in Nature Methods, was faulty, and there was nothing he could do to warn others.

Similar tools include CiteAb, which helps scientists identify antibodies; ChemSpider, for chemical structures; and Access Innovations, which scans and flags published papers that have accidentally used misidentified cell lines. 

Still, the number of open-access articles, journals and tools pales in comparison to their restricted counterparts.  

Although Gilmore and his co-authors were speaking specifically about the field of neuroscience when they said the following, I think it applies to the broader scientific conversation:

“We think that investments in…infrastructure will generate big payoffs,” the researchers said. “Fostering the widespread adoption of open, transparent and reproducible research practices coupled with innovations in technology that enable the large-scale analysis of ‘big data’ will accelerate the discovery of generalizable, robust and meaningful findings.”