Another day, another New York Times report on bad practice in biomedical science. The growing problems with scientific research are by now well known: Many results in the top journals are cherry-picked, methodological weaknesses and other important caveats are often swept under the rug, and a large fraction of findings cannot be replicated. In some rare cases, there is even outright fraud. This waste of resources is unfair to the general public that pays for most of the research.
The Times article places the blame for this trend on the sharp competition for grant money and on the increasing pressure to publish in high-impact journals. While both of these factors certainly play contributing roles, the Times article misses the root cause of the problem. The cause is not simply that the competition is too steep. The cause is that the competition is shaped to point scientists in the wrong direction.
As many other observers have already noted, scientific journals favor surprising, interesting, and statistically significant experimental results. When journal editors give preference to these types of results, it is not surprising that simple selection effects will cause more false positives to be published, and sadly it is not surprising that unscrupulous scientists will manipulate their data to show these types of results. These manipulations include selection from multiple analyses, selection from multiple experiments (the “file drawer” problem), and the formulation of ‘a priori’ hypotheses after the results are known. While the vast majority of scientists are honest individuals, these biases still emerge in subtle and often subconscious ways.
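The selection effect described above can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and every number in it (the share of tested hypotheses that are true, the statistical power, the significance threshold) are assumptions chosen for the example, not estimates from any real literature. It compares a journal that publishes only significant results with a hypothetical journal that publishes every result regardless of outcome.

```python
def false_positive_share(prior_true, power, alpha, p_publish_nonsig=0.0):
    """Expected fraction of published papers that are false positives.

    prior_true       -- assumed fraction of tested hypotheses that are true
    power            -- assumed probability of detecting a true effect
    alpha            -- significance threshold (false positive rate per test)
    p_publish_nonsig -- assumed probability that a non-significant result
                        gets published (0 = only significant results appear)
    """
    sig_true = prior_true * power          # true effects that reach significance
    sig_false = (1 - prior_true) * alpha   # null effects that reach significance
    nonsig = 1 - sig_true - sig_false      # everything that misses significance
    published = sig_true + sig_false + p_publish_nonsig * nonsig
    return sig_false / published


# Illustrative (assumed) inputs: 10% of hypotheses true, 50% power, alpha = 0.05.
selective = false_positive_share(0.1, 0.5, 0.05, p_publish_nonsig=0.0)
unbiased = false_positive_share(0.1, 0.5, 0.05, p_publish_nonsig=1.0)

print(f"Only significant results published: {selective:.1%} false positives")
print(f"All results published:              {unbiased:.1%} false positives")
```

Under these assumed inputs, roughly 47 percent of the papers in the selective journal are false positives, compared with 4.5 percent in the journal that publishes everything. No one has to manipulate anything for this to happen; the filter alone does it.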
Scientists have known about these problems for decades, and there have been several well-intentioned efforts to fix them. The Journal of Articles in Support of the Null Hypothesis (JASNH) is specifically dedicated to null results. The Psych File Drawer is a nicely designed online archive for failed replications. PLoS ONE publishes papers based on the quality of the methods, and allows post-publication commenting so that readers may be alerted about study flaws. Finally, Simmons and colleagues (2011) have proposed lists of regulations for other journals to enforce, including minimum sample sizes and requirements for the disclosure of all variables and analyses.
As well-intentioned as these important (and necessary) initiatives may be, they have all failed to catch on. JASNH publishes a handful of papers a year, the Psych File Drawer has only nine submissions, and hardly anyone comments on PLoS ONE papers. To my knowledge, no journals have begun enforcing the lists of regulations proposed by Simmons et al.
What is most frustrating is that all of these outcomes were completely predictable. As any economist will tell you, it’s the incentive structure, people! The reason nobody publishes in JASNH is that the rewards for publishing in high-impact journals are larger. The reason nobody puts their failed replications on Psych File Drawer or comments on PLoS ONE is that online archive posts can’t be put on CVs. And the reason individual journals don’t tighten their standards is that scientists can just submit their papers elsewhere. Even if the journals did manage to impose the regulations, wouldn’t it be better if the career incentives of scientists were aligned with the interests of good science? Wouldn’t a more sensible incentive structure make the list of regulations unnecessary?
This is where the funding agencies need to come in. Or, more to the point, where we as scientists need to ask the funding agencies to come in. Granting agencies should reward scientists who publish in journals that have acceptance criteria that are aligned with good science. In particular, the agencies should favor journals that devote special sections to replications, including failures to replicate. More directly, the agencies should devote more grant money to submissions that specifically propose replications. And finally, I would like to see some preference given to fully “outcome-unbiased” journals that make decisions based on the quality of the experimental design and the importance of the scientific question, not the outcome of the experiment. This type of policy naturally eliminates the temptation to manipulate data towards desired outcomes.
The mechanism could start with granting agencies making modest adjustments to grant scores for scientists who submit to good-practice journals. Over time, as scientists compete to submit to these journals, more of these journals will emerge by market forces. Journals that currently encourage bad practices may adjust their policies if they wish. Under the current system, there is simply no incentive for journals to adjust their policies. Will this transition be easy? No. Will the granting agencies manage this perfectly? Probably not. But it is obvious to me that scientists alone cannot solve the problem of publication bias, and that a push from the outside is needed. The proposed system may not be perfect, but it will be vastly better than the dysfunctional system we are working in now.
If you agree that the cause of bad science is a perverse incentive structure, and if you agree that reform attempts can only work if there is pressure from granting agencies, please pass this article around and contact your funding agency. Within each agency, reform will require coordination among several sub-agencies, so it might make most sense to contact the director. Also, please see the FAQ, above, for continuously updated answers to questions.