Maintenance spinal manipulation: The cherry-picker’s quandary

The email from the industry was effusive. In a cock-a-hoop, caps lock-happy frenzy it bellowed “ALL MANUAL MEDICINE PROVIDERS SHOULD BE AWARE OF THIS STUDY”. The study in question, soon to be published in the journal “Spine”, is an RCT that specifically looks at whether patients with chronic back pain benefit from a sustained period of “maintenance spinal manipulation” following their initial treatment period. It concludes that SMT is indeed effective and that “maintenance manipulations” add benefit after the initial intensive therapy has concluded. No wonder the mailer was excited; here was a study that purportedly demonstrates a real benefit to spinal manipulation in chronic back pain and that seems to validate the common but controversial practice of regularly seeing patients between flare-ups for a “quick click” to keep the spine tip-top.

This was a small study of 60 patients divided into 3 groups: one received 12 sessions of “sham” manipulation over one month, one received 12 sessions of real manipulation over a month (one of those general “give the back a good click” manips), and the third group received the same plus an additional “maintenance spinal manipulation” every 2 weeks for nine months. The authors report that both groups that received real manipulation did better than the sham group but the group that got the maintenance manipulation did the best of all.

Now 60 people in 3 groups should not really provide sufficient statistical power to demonstrate a difference, even if one is there. So finding one would suggest that the effect of this manipulation is either uncharacteristically (for a manual therapy) consistent, or large when it is present. To put this into context, the recently updated Cochrane review of all trials of manipulation for chronic back pain suggests a tiny effect size for manipulation that doesn’t really tickle the undercarriage of clinical significance.
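
To make the power worry concrete, here is a rough sketch using a normal approximation to a two-group comparison. It assumes the 60 patients were split evenly into groups of 20, and the standardised effect sizes (0.2, 0.5, 0.8) are conventional small, medium and large benchmarks rather than figures from the trial. The function name `approx_power` is my own.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Uses the normal approximation and ignores the negligible chance of a
    significant result in the wrong direction. effect_size is the
    standardised mean difference (difference / standard deviation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


# ASSUMPTION: 60 patients split evenly into 3 groups of 20.
N_PER_GROUP = 20

# ASSUMPTION: illustrative benchmark effect sizes, not taken from the study.
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: power ~ {approx_power(d, N_PER_GROUP):.2f}")
```

Under these assumptions the estimated power stays below the conventional 80% target even for the large benchmark, and it is far lower for small effects of the kind the Cochrane review suggests.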

Closer inspection of the paper revealed a few devils lurking amidst the details. There are serious challenges to devising a true sham manipulation. In fact, depending on what one thinks the “active ingredient” of a manual therapy is, it may well be considered impossible. The authors did not investigate whether or not the sham was convincing to the folk who took part, but there is a possible clue in the fact that half of those in the sham manipulation group dropped out during the study, compared with fewer dropouts in the other groups. The authors have tried, fairly, to account for this in the analysis, but this level of uneven dropout makes any comparisons with the sham group unreliable. We cannot be clear that manipulation was truly better than the sham.

In terms of the effects of maintenance manipulation there is a further issue. Given that this group is basically compared to a “no maintenance treatment” group, there is no control for non-specific effects of care such as placebo, attention, etc. Perhaps the manipulation is a red herring; maybe, if the observed effects are real, it just helps patients to be seen regularly. There is not much data about, but another new trial, this time for chronic neck pain, might shed some light. Here, maintenance manipulation was compared with manipulation plus exercise and with a control group that met with their chiropractor to discuss their symptoms but received no manipulation or exercise, thereby controlling for the effects of attention. They found no benefit to maintenance manipulation or manipulation plus exercise for any outcome, although this study also suffers from its small size.

Beyond these issues a few other concerns spring out at me that question not just the interpretation of the effect but whether it is really there at all. The level of variability in this sample of patients seems remarkably low. In fact, when I calculate the standard deviations and compare them with those of the studies included in the recent Cochrane meta-analysis of manipulation for chronic back pain, they are around half the size and sometimes less. The criteria for including patients don’t seem remarkable, so what else could explain such a consistent population?

The results at first appear statistically super significant, with a p-value of p<0.001 for many comparisons. Simply put, this indicates that the chances of us getting such a positive result by chance, despite the treatment actually being useless, are less than 1 in 1000. Unfortunately the authors decided to use a statistical test called the t-test for every comparison. If we include all of the outcomes and all of the follow-up points that means 72 of them! This is something of a methodological no-no because each additional test we run increases the chances of finding a false positive. A quick, dirty and imperfect correction for this (multiplying the p-value by the number of tests) suggests that for this study p<0.001 actually equates to p<0.072, which we would not accept as statistically significant. This choice of statistical approach seems a touch bizarre, and I can’t really think of a good reason that it might have been chosen.
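
The “quick, dirty and imperfect correction” reads like a Bonferroni-style adjustment: multiply the p-value by the number of tests. A minimal sketch of that arithmetic follows. The function names are my own, and the familywise error calculation assumes the 72 tests are independent, which will not hold in a trial that measures the same patients repeatedly, so treat it as an illustration only.

```python
def bonferroni_adjusted_p(p_value: float, n_tests: int) -> float:
    """Bonferroni-style adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests.

    ASSUMPTION: the tests are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


N_TESTS = 72  # number of t-tests counted in the post

adjusted = bonferroni_adjusted_p(0.001, N_TESTS)
fwer = familywise_error_rate(0.05, N_TESTS)

print(f"p = 0.001 adjusted for {N_TESTS} tests: {adjusted:.3f}")
print(f"chance of at least one false positive at alpha = 0.05: {fwer:.3f}")
```

The first figure reproduces the 0.072 quoted above. The second shows why uncorrected multiple testing matters: with this many tests at the usual 0.05 threshold, at least one false positive is close to certain under the independence assumption.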

It is of course possible that the results of this study are accurate and maintenance manipulations are effective, but these problems make it difficult to judge. The message from this one back pain trial might seem appealing and I can see why the email was so enthusiastic.  But by focusing on one particular cherry that seems so ripe and juicy we might miss the bigger picture from the rest of the tree.  And there is always the chance that the tastiest cherries contain a few artificial sweeteners. Personally I would lay off the caps lock for now.

About Neil

Neil O’Connell is a researcher in the Centre for Research in Rehabilitation, Brunel University, West London, UK. He divides his time between research and training new physiotherapists and previously worked extensively as a musculoskeletal physiotherapist. He also tweets! @NeilOConnell

Neil is currently fighting his way through a PhD investigating chronic low back pain and cortically directed treatment approaches. He is particularly interested in low back pain, pain generally and the rigorous testing of treatments. He also tends to get all geeky over controlled trials.
Senna MK, & Machaly SA (2011). Does maintained Spinal manipulation therapy for chronic non-specific low back pain result in better long term outcome? Spine PMID: 21245790

Rubinstein SM, van Middelkoop M, Assendelft WJ, de Boer MR, & van Tulder MW (2011). Spinal manipulative therapy for chronic low-back pain. Cochrane database of systematic reviews (Online), 2 PMID: 21328304

Martel J, Dugas C, Dubois JD, & Descarreaux M (2011). A randomised controlled trial of preventive spinal manipulation with and without a home exercise program for patients with chronic neck pain. BMC musculoskeletal disorders, 12 PMID: 21303529


  1. Arco:

    I am not sure of your conclusion regarding cause and effect with the case study you mentioned in which a patient did not respond to manipulation but did respond to flexion exercises.

    Most of the criteria which indicate that manipulation may be effective, to me, also indicate that patients are highly likely to improve with no intervention whatsoever, e.g. pain has been present for a short amount of time.

    Therefore, to say that someone “responded” to flexion exercises for a patient who had improved after 6 sessions of McKenzie therapy is hardly a conclusion I would draw from the case presentation. We have no evidence that the patient would not have achieved just as good results without intervention.

    I often cringe when people tell me they get really good results using X to treat condition Y. It is often highly unlikely that that person has actually seen enough patients with an identical condition fail due to other means and then succeed due to their chosen treatment to accurately make that claim. If I had a treatment that got 8/10 people better, whereas without intervention 6/10 people would get better, within my clinic I don’t think I would be able to recognise that my intervention was better than no intervention unless I started to take records.

    I would much rather hear their reasoning for why and how they think their treatment works than have them tell me that it works because they think they get great results.

    Paul Ingraham Reply:

    Great to see that kind of thinking, Lachlan. Very nicely said. I’m afraid too many manual therapists are fond of paying lip service to humility while basing everything they do on grandiose claims of efficacy and special knowledge. What you just expressed is the real thing, genuine humility, and that perspective and attitude is desperately needed for many reasons. In particular, it is a pre-requisite for trying to understand what treatments work and which ones don’t.

    arco Reply:

    Hi Lachlan

    My conclusion, as I said before, and within the thread we’re discussing about manipulation issues, is that currently we have no firm evidence to identify a priori the people who will benefit from them. And people who fit the current CPR can respond to other approaches or, as you said, can also improve over time.

    Unfortunately, the belief that people will recover from an episode of acute LBP regardless of the treatment received, and that just a few will have chronic symptoms, is far from being true. If you read the evidence from Croft and Von Korff, you’ll see that up to 45-65% of people will still complain of persistent symptoms after a year. Acute episodes can subside but recurrences are common.

    So, as Gordon Waddell said, despite all the advances in new technological diagnostic tools and all the treatments, LBP still remains an enigma, and we are not changing its prevalence. But he goes further and says that we as clinicians should ask ourselves whether we are a key factor in the failure to improve those figures. And I would add, with our own beliefs, diagnostic terminologies and so on.

    My conclusion was also that manips work, but not always, so the best thing we can do, in the absence of contraindications, is do it, but if it doesn’t work, try another thing.

    You’ve mixed up the 2 case studies I gave as a reference. The patient who responded to flexion met the criteria, but was also a flexion responder and didn’t receive any treatment other than MDT, and he was OK in 1 week. In the other case study, she didn’t respond to the manip over 2 visits (unchanged), so at the 3rd she was changed to MDT and she started to improve until the 6th visit. So there were 6 visits in all: 2 manips + 4 MDT.

    I want to keep the discussion within the manip thread because this is what the post by Neil is about, but when you see words like decreased, abolished, centralisation, they refer to symptomatic response within the session, and it is achieved as a consequence of a deliberately applied loading strategy; it’s not a matter of something that happens over time. Within MDT you look for a change that should be lasting. I’ll be pleased to talk about MDT if a new entry is posted, but this is not the right place.

    From the beginning I have been arguing for the need for a better diagnostic approach for NSLBP (hip problems can give only LBP), and we should apply our treatments based on our diagnosis. I wish we could identify a problem and apply 10 different treatments to that same problem. But the first thing is to distinguish the responder from the non-responder.

    NSLBP is such a heterogeneous group that if we keep trying to see it as a valid label, we will be treating a heterogeneous sample as if it were homogeneous, and that’s a very big mistake. I would say it is the main one, and it is also preventing any advance in the LBP field.

    I hope Lachlan you got my point of view right, and I don’t want you to agree or disagree with me, but keep my words right in order to not confuse other people.

    Thanks again for sharing your comments.

  2. Hi. Good post and good analysis. We shouldn’t believe everything we read. Anyway, going straight to the point, I think we are still discussing a “specific” treatment for a “non specific condition”, when we should be asking for a better understanding of which people will benefit from manips, and a more specific definition of “chronic pain”. Not all chronic pain is a chronic pain state. We need a better “diagnosis” before looking for the best treatment.


    Neil O'Connell Reply:

    Thanks Arco,

    The “subgroup of patients who will respond to manips (or exercise or etc etc etc)” idea is popular and we have discussed it here before. Unfortunately the data gives us no guide to what such a subgroup might look like. Non-specific back pain is really the best we can do. Of course many clinicians believe that they can pick those patients who will best respond, but in chronic back pain no-one has yet built a coherent model of who these folk are and then tested it in a convincing way.

    In acute back pain some efforts were made in this direction, but the best known “clinical prediction rule” for manips fell at the scientific hurdle of independent replication in a different population.

    Of course some people seem to respond very well to any given treatment and that is compelling in the clinic. But it could be that one is just witnessing statistical noise rather than a treatment effect. I personally don’t subscribe to the idea that a lack of subgroups explains the lack of effect of common treatments for CLBP for the reasons my colleague and I outline here:

    arco Reply:

    Thanks Neil for your reply.

    When it comes to “chronic pain”, if by that term we mean longer than 3 months, then we should approach it as if it were “acute” (less than 7 days?). We should assess it and see the response. Every single attempt to establish a “one size fits all” treatment has failed so far.
    So it seems we really need to work hard on finding a tool to diagnose.
    If you read this link, you can see 52% of chronic back pain patients responded “normally” to a mechanical assessment.

    I don’t know which method, approach or CPR is better, but we should avoid the NSLBP label as “the best we have”, because so far it has not been very helpful in improving our treatment skills.

    You also have the case of a patient who met the criteria for the manip CPR, who failed to respond after 2 sessions of manip and responded to another approach:

    or another one who met the criteria but responded to flexion exercises as well:

    So acute/chronic doesn’t give us a clue to treat, but give us the opportunity to assess, and treat according to our findings.

    And of course when we don’t find a “responder” we should say he doesn’t respond instead of he’s a chronic pain patient, with the biopsychosocial label.

    The more chronic the problem, the more likely it is to have BPS elements. But that doesn’t mean we have to exclude them from an assessment. In fact people who respond mechanically have a good prognosis regardless of BPS. Or, when a good response to a mechanical approach was taken into account, depression and somatization were less associated with chronic disability in LBP:
    That’s why I think we have a long way ahead of us, and we have to work on our own yellow flags (beliefs, terminology,…), to identify who is going to benefit from any specific treatment.

    Manips work? No doubt. When? We don’t know. So instead of saying manips are good for chronic or acute LBP, I would say that in the absence of any contraindication for manip, just do it, and if it works, great! If it doesn’t, keep looking!



  3. “Much ado about very little” indeed. “Fighting over scraps” as I’ve put it occasionally. Basically the effect sizes are so small that it’s kind of laughable how much attention and energy has been devoted to trying to determine if they are real. Does it matter if they are?

    Writing about that recently updated Cochrane review for, I got into a fairly lengthy discussion with some folks somewhere in the wilds of Facebook. A couple of chiropractors in particular predictably expressed their conviction that exactly the right kind of SMT for the right kind of patient is going to make all the difference, and that these marvelous and clinically obvious-to-them benefits must either be shown by other, better experiments that weren’t reviewed, or they were reviewed but statistically swamped by crappy experiments that didn’t properly match treatment to specific low back pain diagnosis.

    If there are any such experiments showing a large effect … well, heck, let’s see your cards, gentlemen! So they cited two examples of experiments that supposedly stand out in the crowd, experiments that, if they had been reviewed, would “pull the average up” and provide us with something more than scraps to fight over. So, what did they cite? One study had an inconclusive conclusion about effects on headaches (not back pain), and the other was not an experiment at all, but just another systematic review … a review with exactly the same “meh” conclusion as the Cochrane review. There was only silence after I pointed this out.

    And so I was struck yet again by how little there is to even argue about here. I’ve been having these discussions for many years now, and they always just fizzle.

  4. Great post! For a clinical PT like me your analysis of dodgy research methods was very helpful. I have doubts about anyone touting “maintenance anything”, and especially spinal manipulation, since that is akin to the chiropractic business model here in the USA.

    I now think of SMT as a “control-alt-delete” for the patient – it resets their brain and allows them to believe that movement is not harmful. Paraphrasing Tim Flynn, PT, “If the PT can torque my spine like that, surely I can get out of bed and go to work!”.

    Do physios in Australia manipulate regularly? Published evidence here in USA has the rate below 5%. What is the Australian rate of physios applying manipulation to the spine?

    Thank you,

    Tim Richardson, PT

    Neil O'Connell Reply:

    Thanks Tim,

    The Cochrane data includes thrust manips, more gentle mobilisations etc under the umbrella of manipulative therapy. The effect size is apparently teeny-tiny which suggests that whether you think you are having a profound effect on the tissues or an effect on the nervous system the effect is rather small. The intense discussions of what mechanisms underpin manipulative therapy’s clinical effects might be considered to be much ado about very little.

    No idea how many folk manipulate (in Oz or the UK) but a fair proportion will regularly use some sort of hands on therapy I am pretty sure.

  5. An entertaining start and finish to this post, Neil, and informative all ‘round. I know that I wouldn’t have spotted some of the concerns about this study that you did, so I thank you for that.

    Aren’t these results somewhat at odds with a long history of negative results for SMT in the same context? It seems to me that even many SMT apologists and boosters have often conceded that the evidence supports SMT for acute low back pain, but not chronic. Even if these results are taken at face value, they seem inconsistent with decades of underwhelming results from other experiments.

    Neil O'Connell Reply:

    Cheers Paul, and thanks for the link on yer website!

    There is not much specific literature on maintenance manips from which to choose, but in terms of the basic question of manips for chronic low back pain the effect size is best estimated as tiny (check out the Cochrane review linked to in the post). Add to that the problems with blinding and controlling such studies, which introduce a positive bias, and it doesn’t look great.

    Pete Medway Reply:

    As Paul Ingraham says, entertaining and informative. He might have added well-written — wherein lies an embarrassment for the likes of me. In our writing on education in English and the humanities, we cheerfully claim — or at least don’t strenuously counter the impression of suggesting — that it’s through reading history, philosophy and literature, writing the sort of essays set in those fields and engaging in the more ‘civilising’ discourses that students become good writers. How else would they acquire all that rhetorical virtuosity — those balanced sentences, surprise volte-faces, artful repetitions (‘caps-lock’) and witty endings? It stands to reason that it can’t be through studying spines and statistics…

    After all, if pain researchers can write like this without having attended a single lecture on Virginia Woolf, what’s to stop engineers and IT guys reaching similar heights? or BScs in Hospitality and Tourism?

    Bad news for teachers like me, of course, who swear by three or more years of our ‘tried and tested’ literary manipulation. But I’m retired so can afford to speculate. My theory, admittedly still awaiting large-scale control trials, is that there’s something in the air that anyone with a good ‘nose’ can take in and thereby acquire writerly expertise.

    (Sorry, specialist readers — not what this blog is for, I realise.)

    Neil O'Connell Reply:

    Thanks Pete – too kind!