Non-inferiority Trials Are Inherently Biased: Here's Why

Debut VideoCast for the Medical Evidence Blog, explaining non-inferiority trial design and exposing its inherent biases:

In this related blog post, you can find links to the CONSORT statement in the Dec 26, 2012 issue of JAMA and a link to my letter to the editor.

Addendum: I should have included this in the video. See the picture below. In the first example, top left, the entire 95% CI favoring "new" therapy lies in the "zone of indifference", that is, within the pre-specified margin of superiority, a mirror image of the pre-specified margin of noninferiority, in this case delta = +/- 0.15. Next down, the majority of the 95% CI around the point estimate favoring "new" therapy lies within the "margin of superiority" - so even though the lower end of the 95% CI crosses "mirror delta", the best guess is that the effect of therapy falls in the zone of indifference. In the lowest example, labeled "Truly Superior", the entire 95% confidence interval falls to the left of "mirror delta", thus reasonably excluding all point estimates in the "zone of indifference" (i.e. +/- delta) and all point estimates favoring the "old" therapy. This would, in my mind, represent "true superiority" in a logical, rational, and symmetrical way that would be very difficult to mount arguments against.
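For readers who prefer to see the symmetry spelled out, here is a minimal sketch of the classification described above. It assumes the effect is expressed as a difference (new minus old) in which negative values favor the "new" therapy, so that "mirror delta" sits at -delta on the left. The function name, the category labels, and the three example intervals are hypothetical illustrations, not data from any trial.

```python
def classify_ci(ci_low, ci_high, delta=0.15):
    """Classify a 95% CI against symmetric margins at -delta and +delta.

    Assumed sign convention: negative differences favor the "new" therapy.
    """
    if delta <= 0 or ci_low > ci_high:
        raise ValueError("need delta > 0 and ci_low <= ci_high")
    if ci_high < -delta:
        # Entire CI lies to the left of "mirror delta".
        return "truly superior"
    if ci_low > delta:
        # Symmetric counterpart: entire CI lies to the right of +delta.
        return "truly inferior"
    if ci_low >= -delta and ci_high <= delta:
        # Entire CI lies inside the zone of indifference.
        return "zone of indifference"
    # CI crosses -delta or +delta without clearing it.
    return "straddles a margin"


# Hypothetical intervals mirroring the three examples in the picture.
print(classify_ci(-0.12, -0.02))  # zone of indifference
print(classify_ci(-0.18, -0.04))  # straddles a margin
print(classify_ci(-0.30, -0.17))  # truly superior
```

The point of the sketch is that the same rule is applied on both sides of zero: whatever bar an interval must clear to be called inferior, it must clear the mirror image of that bar to be called superior.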

Added 9/20/16:  For those who question my assertion that the designation of "New" versus "Old" or "comparator" therapy is arbitrary, here is the proof:  In this trial, the "New" therapy is DMARDs and the comparator is anti-tumour necrosis factor agents for the treatment of rheumatoid arthritis.  The rationale for this trial is that the chronologically newer anti-TNF agents are very costly, and the authors wanted to see if similar improvements in quality of life could be obtained with chronologically older DMARDs.  So what is "new" is certainly in the eye of the beholder.  Imagine colistin 50 years ago, being tested against, say, a newer spectrum penicillin.  The penicillin would have been found to be non-inferior, but with a superior side effect profile.  Fast forward 50 years and now colistin could be the "new" resurrected agent and be tested against what 10 years ago was the standard penicillin but is now "old" because of development of resistance.  Clearly, "new" and "old" are arbitrary and flexible designations.

Once Bitten, Twice Try: Failed Trials of Extubation

“When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.” – Clarke’s First Law

It is only fair to follow up my provocative post about a “trial of extubation” by chronicling a case or two that didn’t go as I had hoped. Reader comments from the prior post described very low re-intubation rates. As I alluded to in that post, decisions regarding extubation represent the classic trade-off between sensitivity and specificity. If your test for “can breathe spontaneously” has high specificity, you will almost never re-intubate a patient. But unless the criteria used have correspondingly high sensitivity, patients who can breathe spontaneously will be left on the vent for an extra day or two. Which you (and your patients) favor, high sensitivity or high specificity (assuming you can’t have both), depends upon the values you ascribe to the various outcomes. Though these are many, it really comes down to this: what do you think is worse (or more fearsome), prolonged mechanical ventilation or reintubation?
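To make the trade-off concrete, here is a small sketch using entirely made-up numbers. It assumes a hypothetical "readiness score" where a score at or above a threshold counts as a positive test for "can breathe spontaneously" and leads to extubation. The scores, outcomes, and function name are illustrative assumptions, not clinical data or a validated criterion.

```python
def extubation_tradeoff(scores, can_breathe, threshold):
    """Count outcomes when patients with score >= threshold are extubated."""
    tp = fp = tn = fn = 0
    for score, able in zip(scores, can_breathe):
        extubate = score >= threshold
        if extubate and able:
            tp += 1   # extubated successfully
        elif extubate and not able:
            fp += 1   # extubated, then reintubated
        elif not extubate and able:
            fn += 1   # left on the vent despite being able to breathe
        else:
            tn += 1   # appropriately left on the vent
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "reintubated": fp,
        "left_on_vent_unnecessarily": fn,
    }


# Hypothetical cohort of 10 patients (invented for illustration only).
scores      = [2, 3, 4, 5, 5, 6, 7, 8, 9, 9]
can_breathe = [False, False, False, True, False, True, True, True, True, True]

lenient = extubation_tradeoff(scores, can_breathe, threshold=5)
strict = extubation_tradeoff(scores, can_breathe, threshold=8)
print(lenient)  # sensitivity 1.0, specificity 0.75, 1 reintubated, 0 left on vent
print(strict)   # sensitivity 0.5, specificity 1.0, 0 reintubated, 3 left on vent
```

In this toy cohort the strict threshold never leads to a reintubation, but it leaves three patients on the ventilator who could have come off. The lenient threshold frees all of them at the cost of one reintubation. Which is preferable depends on how you weigh those two outcomes.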

What we fear today may not seem so fearsome in the future. Surgeons classically struggled with the sensitivity and specificity trade-off in the decision to operate for suspected appendicitis. “If you never have a negative laparotomy, you’re not operating enough” was the heuristic. But this was based on the notion that failure to operate on a true appendicitis would lead to serious untoward outcomes. More recent data suggest that this may not be so, and many of those inflamed appendices could have been treated with antibiotics in lieu of surgery. This is what I’m suggesting with reintubation. I don’t think the Epstein odds ratio (~4) of mortality for reintubation from 1996 applies today, at least not in my practice.