The outcomes here are much less clear. The extensive supplementary materials the replication team handed out helpfully distinguish between “reproducibility” (do the results of an experiment turn out the same if you do it again with the same data and approach?) and “replicability” (can a new, overlapping experiment with new data yield reliably similar results?).
The COS team has tried to be explicit about how messy this all is. If an experiment fails to replicate, that doesn’t mean it’s unreplicable. It could have been a problem with the replication, not the original work. Conversely, an experiment that someone can reproduce or replicate perfectly isn’t necessarily right, and it isn’t necessarily useful or novel.
But the truth is, 100 percent pure replication isn’t really possible. Even with the same cell lines or the same strain of genetically tweaked mice, different people do experiments differently. Maybe the experiments the replication team didn’t have the materials to complete would have done better. Maybe the “high-impact” articles from the most prestigious journals were bolder, risk-taking work that’d be less likely to replicate.
Cancer biology has high stakes. It’s supposed to lead to life-saving drugs, after all. The work that didn’t replicate for Errington’s team probably didn’t lead to any dangerous drugs or harm any patients, because Phase 2 and Phase 3 trials tend to sift out the bad seeds. According to the Biotechnology Industry Organization, only 30 percent of drug candidates make it past Phase 2 trials, and just 58 percent make it past Phase 3. (Good for determining safety and efficacy; bad for all the research money blown and the drug costs inflated along the way.) But drug researchers acknowledge, quietly, that most approved drugs don’t work all that well, especially cancer drugs.
Science obviously works, broadly. So why is it so hard to replicate an experiment? “One answer is: Science is hard,” Errington says. “That’s why we fund research and invest billions of dollars just to make sure cancer research can have an impact on people’s lives. Which it does.”
The point of less-than-great outcomes like the cancer project’s is to distinguish between what’s good for science internally and what’s good for science when it reaches civilians. “There are two orthogonal concepts here. One is transparency, and one is validity,” says Shirley Wang, an epidemiologist at Brigham and Women’s Hospital. She’s co-director of the Reproducible Evidence: Practices to Enhance and Achieve Transparency—“Repeat”—Initiative, which has done replication work on 150 studies that used electronic health records as their data. (Wang’s Repeat paper hasn’t been published yet.) “I think the issue is that we want that convergence of both,” she says. “You can’t tell if it’s good quality science unless you can be clear about the methods and reproducibility. But even if you can, that doesn’t mean it was good science.”
The point, then, isn’t to critique specific results. It’s to make science more transparent, which should in turn make the results more replicable, more understandable, maybe even more likely to translate to the clinic. Right now, academic researchers don’t have an incentive to publish work that other researchers can replicate. The incentive is just to publish. “The metric of success in academic research is getting a paper published in a top-tier journal and the number of citations the paper has,” Begley says. “For industry, the metric of success is a drug on the market that works and helps patients. So we at Amgen couldn’t invest in a program that we knew from the beginning didn’t really have legs.”