Tackling the reproducibility crisis requires universal standards


Reproducibility is the conduit through which scientific discoveries become future innovations, cures and commercial opportunities. It is also the central pillar of scientific integrity: the yardstick by which future generations can independently validate all the progress that came before them.

But, as is now widely acknowledged, reproducibility is in crisis. A 2016 Nature survey, for instance, found that 52 per cent of researchers believe that there is a “significant reproducibility crisis”, while more than 70 per cent have tried and failed to reproduce another scientist’s experiments.

The crisis could even impede economic growth, preventing industry from exploiting the social and economic potential of research partnerships with universities. From satellites to supercomputers, we are now developing new ways to exploit and gather vast pools of data. Yet the entire endeavour rests on trust in the information being collected. The goal of democratising science through “open data” similarly rests on our ability to guarantee the validity of the information we are sharing with the world.

Many solutions are being touted. Some commentators argue that the peer-review process, currently limited to seeking the opinions of a small number of other scientists, should be widened to crowdsource more expertise and ensure more transparent and rigorous research.

Others blame the tendency of research managers, funding bodies and research journals to measure success in citations. Because datasets and code are not typically cited elsewhere, both journals and authors prefer papers that focus on the juiciest headlines and dispense with the kinds of detailed methodology sections that would benefit those seeking to reproduce the findings.

Some people even argue that the peer-review process should be extended to cover the entire discovery process, not just the end result, creating “open research”.

Yet without a benchmark against which scientists can measure their discovery process, it is difficult for others to check its accuracy. For instance, in a more recent Nature article, scientists from the UK’s National Physical Laboratory (NPL) noted that there is no regulation of the doses used in radiotherapy research outside clinical settings, making it impossible to reproduce the findings of many vital cancer treatment studies.

For research methods to be consistent, accessible and reproducible, we need universal, widely understood research standards to which all scientists adhere. NPL has been responsible for maintaining fundamental standards and units for more than 100 years. It is now engaged in pioneering work to create a set of “gold standards” for all scientific methodologies, materials, analyses and protocols, based on exhaustive testing at a large number of laboratories and developed in tandem with industry and with national and international standardisation organisations.

We are working to create international reference standards against which to isolate, extract and measure precise quantities for any type of experiment. Instruments will have to be continuously re-calibrated in line with these standards, ensuring the validity of the results and guaranteeing that the precise measurements needed to repeat the experiment are known. NPL is even creating quality-assured methods of online data processing, storage and analysis, including standards for how sensors should collect and encrypt data and a quality test for AI algorithms. We have to trust the accuracy and integrity of digital data since they will increasingly fuel everything from personalised health treatments to business decisions.

Our aim is for best-practice standards across all fields of science to become part of a formal accreditation system, such as ISO (International Organization for Standardization) standards. To get this accreditation, researchers would have to undergo regular peer review. Crucially, this would help to guarantee the reproducibility of innovations at the cutting edge of research, where no measurement standards or accreditations currently exist.

For example, graphene is one of the most promising discoveries of recent times, but there is currently no standard for characterising it; we are collaborating with the UK’s National Graphene Institute to create the first ISO standard for graphene with applications in the life sciences.

We are incentivising voluntary compliance by engaging front-line scientists in the creation of the standards, to ensure that they will aid the research process without becoming a burden.

Looking to the future, ISO-style accreditation could become part of promotion criteria or national audits such as the UK’s Research Excellence Framework. This would give governments and industry a way to ensure that the money they invest in research delivers the promised social and economic benefits. Ultimately, excellent research methods should be as important to funding allocation as excellent research outputs.

Treaties and regulations could further incentivise adoption of the new standards. For example, the Paris Agreement on climate change contains no mechanism for independently verifying whether signatories are actually meeting their carbon targets. NPL’s new standards for measuring air quality and emissions will offer a way for countries to prove their measurements conform to internationally agreed methodologies, boosting trust in climate data.

We should go further and make accreditation in research methods a mandatory requirement for publication in journals, helping to enforce a culture of best practice and giving research findings a trusted stamp of quality assurance. Only then can we guarantee that the most promising discoveries can be translated into real impact.

Author Bio: Pete Thompson is chief executive officer of the UK’s National Physical Laboratory.