How much evidence is enough to retract a paper?

A paper published in May this year by Coulter et al. (2018) was followed by two letters to the editor (Gibson et al. 2018, O’Keeffe et al. 2018). The original paper was a systematic review with meta-analyses of the effectiveness of manipulation and mobilization for pain and disability in people with chronic non-specific low back pain. The paper concluded there was moderate quality evidence that thrust manipulation may offer small to moderate treatment effects compared to other active comparators, while there was moderate quality evidence that non-thrust mobilization techniques offered a minimal treatment effect compared to other active comparators. For manipulation and pain, this conclusion was based on the results of the meta-analysis reproduced in Figure 1.

Figure 1 Replication of manipulation forest plot from Coulter et al (2018). Note the highlighted very small standard deviation values.

The first letter to the editor about this paper was published in July by O’Keeffe et al. (2018). They highlighted concerns with the methods of analysis and study inclusion/exclusion, the lack of prospective registration, the failure to use the GRADE approach to rate the quality of evidence, and the fact that the studies included in the meta-analyses delivered a combination of treatments, making it impossible to determine the unique effect of thrust manipulation. In their response, Coulter et al. rejected, refuted or explained away these concerns and, predictably, the dust settled and things moved on.

The second letter, published in October (Gibson et al. 2018), raised two issues. First, it reiterated the concern that a unique effect of manipulation cannot be delineated because a number of papers in the manipulation meta-analysis used combined treatment interventions. Second, and critically, the standard deviations reported in one of the papers in the meta-analysis (Ritvanen et al. 2007), and used by the review, were implausibly small (see Figure 1 above). Correspondence with an author of the Ritvanen et al. paper confirmed our suspicion that what were reported as standard deviations were in fact standard errors, an error not picked up by the review authors.

This is critically important because correction of this error fundamentally changes the conclusions of the review.  Figure 1 highlights how these very small standard deviations led to an exaggerated treatment effect which in turn influenced the overall pooled estimate of effect. We back-calculated the standard deviations for this trial and used the corrected values to reproduce the meta-analysis. Figure 2 shows this with the now non-significant pooled estimate of effect. Using the authors’ initial classification of effect sizes, this result would be more correctly reported as “moderate quality evidence that manipulation offers a minimal treatment effect compared to other active comparators for reduction in pain”.
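To illustrate why mistaking a standard error for a standard deviation matters, here is a minimal Python sketch of the back-calculation, which relies on the standard relationship SD = SE × √n. The numbers are hypothetical and are not the values from Ritvanen et al. (2007); the function names are our own for illustration. The second function shows how an SD that is too small inflates a standardized mean difference.

```python
import math


def se_to_sd(se: float, n: int) -> float:
    """Back-calculate a standard deviation from a standard error of the mean."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return se * math.sqrt(n)


def standardized_mean_difference(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Hypothetical values for illustration only (NOT the trial's data).
n = 25            # participants per group
reported = 2.0    # printed as an "SD" but actually a standard error
corrected = se_to_sd(reported, n)  # the SD the trial should have reported

# Same group means, first with the mislabelled value, then with the true SD.
d_wrong = standardized_mean_difference(30.0, 40.0, reported, reported, n, n)
d_fixed = standardized_mean_difference(30.0, 40.0, corrected, corrected, n, n)

print(f"corrected SD: {corrected}")
print(f"effect size using the mislabelled SE: {d_wrong}")
print(f"effect size using the corrected SD:   {d_fixed}")
```

In this made-up example the mislabelled value exaggerates the effect size by a factor of √n, which is the kind of distortion visible in Figure 1.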

Figure 2 Corrected manipulation forest plot with notes highlighting changes

In their reply to our letter, the authors stated that “by standard procedure, we used the published authors’ own data in our analyses, without any revision of these by ourselves” (Coulter et al. 2018a) and did not deny or refute the standard error/standard deviation mistake. Of course research is a human endeavour and mistakes happen, but we contend that once an error has been identified, it is important to correct it, lodge an erratum and in doing so inform other researchers, clinicians, policymakers and the public that the first result was wrong.

Coulter et al. argue against this course of action on the grounds that the study with erroneous data carried only a small weight in the pooled estimate of effect (a study's weight describes how much influence it has on the final result of a meta-analysis). Best practice would suggest that accuracy matters even if the error had little influence, but our analysis clearly shows that its influence was not small: correcting the data completely altered the conclusions drawn from the meta-analysis and essentially reverses the clinical conclusions that may be derived from this review.
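The idea of a study's weight, and how a low-weight study can still tip a pooled result, can be sketched in Python. This is a simplified fixed-effect, inverse-variance model chosen for illustration; the review's actual model may differ, and every number below is hypothetical rather than taken from the included trials.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling.

    Returns the pooled effect, its 95% confidence interval, and each
    study's percentage weight.
    """
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    pct_weights = [100.0 * w / total for w in weights]
    return pooled, ci, pct_weights


# Hypothetical effect sizes and standard errors for three studies.
# Study 3 carries an exaggerated effect caused by a data error.
wrong, ci_wrong, pct_wrong = pool_fixed_effect([0.0, -0.1, -5.0], [0.2, 0.2, 0.6])

# The same studies after study 3's effect (and its standard error) is corrected.
fixed, ci_fixed, pct_fixed = pool_fixed_effect([0.0, -0.1, -1.0], [0.2, 0.2, 0.3])

print(f"weight of study 3 before correction: {pct_wrong[2]:.1f}%")
print(f"pooled effect before: {wrong:.2f}, 95% CI {ci_wrong[0]:.2f} to {ci_wrong[1]:.2f}")
print(f"pooled effect after:  {fixed:.2f}, 95% CI {ci_fixed[0]:.2f} to {ci_fixed[1]:.2f}")
```

In this made-up example study 3 has a weight of roughly 5%, yet correcting it moves the confidence interval from excluding zero to including zero. A small weight does not guarantee a small influence when the erroneous effect is extreme.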

This is particularly important when the impact of the paper is substantial. This paper appears in the ‘most read’ section of the journal website. It has an Altmetric attention score in the 99th percentile for research of this age, with 165 tweets from 131 users reaching an upper bound of 123,428 followers, and its conclusions and main findings have been reported by 49 news outlets. It is quite conceivable that it has influenced healthcare providers and people with chronic non-specific low back pain when considering treatment options. This is a most unfortunate situation because the reported conclusions regarding reduction in pain, which are based on incorrect trial data, are wrong.

At the time of writing this post, neither the authors nor the journal has given any public indication of retracting or amending this paper. The Committee on Publication Ethics (COPE) offers guidelines on retraction and clearly states that journal editors should consider retracting a publication if “they have clear evidence that the findings are unreliable, either as a result of misconduct (e.g. data fabrication) or honest error (e.g. miscalculation or experimental error)”.

Errata seldom get the coverage that the papers they correct received, but publishing one is still important in order to uphold the integrity of the scientific process. We hope that the record will be corrected and the paper withdrawn or an erratum published.

About William Gibson

Willie is an Associate Professor in the School of Physiotherapy at The University of Notre Dame Australia. He is interested in curriculum design, teaches pain and evidence based practice and enjoys working with the small Notre Dame research crew on primarily pain-related topics. Outside of work, time spent in the garden, cooking and playing golf are always a bonus. He’s a confirmed foodie and bread-maker who is very happy to talk to people (in detail and at length) about the wood-fired pizza oven he built this year!!

The following people also contributed to this post: Thorvaldur Palsson, Evan Coopes, Benedict Wand, Mervyn Travers, Mary O’Keeffe, Kieran O’Sullivan and Derek Griffin.

References

Coulter, I. D., C. Crawford, E. L. Hurwitz, H. Vernon, R. Khorsan, M. S. Booth and P. M. Herman (2018). “Response to letter to the editor entitled “thrust manipulation may not decrease the intensity of chronic low back pain” concerning “manipulation and mobilization for treating chronic low back pain: a systematic review and meta-analysis” by Coulter et al. TSJ; doi: 10.1016/j.spinee.2018.01.013.” Spine J 18(10): 1964.

Coulter, I. D., C. Crawford, E. L. Hurwitz, H. Vernon, R. Khorsan, M. Suttorp Booth and P. M. Herman (2018). “Manipulation and mobilization for treating chronic low back pain: a systematic review and meta-analysis.” Spine J 18(5): 866-879.

Gibson, W., T. S. Palsson, E. Coopes, B. M. Wand and M. J. Travers (2018). “Thrust manipulation may not decrease the intensity of chronic low back pain. Letter to the editor regarding “Manipulation and mobilization for treating chronic low back pain: a systematic review and meta-analysis” by Coulter et al.” Spine J 18(10): 1961-1963.

O’Keeffe, M., D. Griffin and K. O’Sullivan (2018). “Spinal manipulation for chronic low back pain: is it all it is cracked up to be?” Spine J 18(7): 1298-1299.

Ritvanen, T., N. Zaproudina, M. Nissen, V. Leinonen and O. Hanninen (2007). “Dynamic surface electromyographic responses in chronic low back pain treated by traditional bone setting and conventional physical therapy.” J Manipulative Physiol Ther 30(1): 31-37.

COPE (2009). “Retraction Guidelines”.