Monday, September 21, 2009
The unreliable asymmetric design of the RE-LY trial of Dabigatran: Heads I win, tails you lose
I'm growing weary of this. I hope it stops. We can adapt the diagram of non-inferiority shenanigans from the Gefitinib trial (see http://medicalevidence.blogspot.com/2009/09/theres-no-such-thing-as-free-lunch.html ) to last week's trial of dabigatran, which came on the scene of the NEJM with another ridiculously designed non-inferiority trial (see http://content.nejm.org/cgi/content/short/361/12/1139 ). Here we go again.
These jokers, lulled by the corporate siren song of Boehringer Ingelheim, had the utter unmitigated gall to declare a delta of 1.46 (relative risk) as the margin of non-inferiority! Unbelievable! To say that a 46% difference in the rate of stroke or arterial clot is clinically non-significant! Seriously!?
They justified this felonious choice on the basis of trials comparing warfarin to PLACEBO as analyzed in a 10-year-old meta-analysis. It is obvious (or should be to the sentient) that an ex-post difference between a therapy and placebo in superiority trials does not apply to non-inferiority trials of two active agents. Any ex-post finding could simply be fortuitously large and may have nothing to do with the MCID (minimal clinically important difference) that is SUPPOSED to guide the choice of delta in a non-inferiority trial (NIT). That warfarin SMOKED placebo in terms of stroke prevention does NOT mean that something that does not SMOKE warfarin is non-inferior to warfarin. This kind of duplicitous justification is surely not what the CONSORT authors had in mind when they recommended a referenced justification for delta.
That aside, on to the study and the figure. First, we're testing two doses, so there are multiple comparisons, but we'll let that slide for our purposes. Look at the point estimate and 95% CI for the 110 mg dose in the figure (let's bracket the fact that they used one-sided 97.5% CIs - it's immaterial to this discussion). There is a non-statistically significant difference between dabigatran and warfarin for this dose, with a P-value of 0.34. But note that in Table 2 of the article, they declare that the P-value for "non-inferiority" is <0.001 [I've never even seen this done before, and I will have to look to see if we can find a precedent for reporting a P-value for "non-inferiority"]. Well, apparently this just means that the RR point estimate for 110 mg versus warfarin is statistically significantly different from a RR of 1.46. It does NOT mean, though it misleadingly suggests, that the difference between the two drugs on stroke and arterial clot is highly clinically significant. This "P-value for non-inferiority" is just an artificial comparison: had we set the margin of non-inferiority at an even more ridiculous value, the P-value would have been smaller still. We can make the "P-value for non-inferiority" as small as we like by just inflating the margin of non-inferiority! So this is a useless number, unless your goal is to create an artificial and exaggerated impression of the difference between these two agents.
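To make the point concrete, here is a rough sketch of how such a "P-value for non-inferiority" behaves as the margin grows. It assumes a simple normal approximation on the log relative risk, which may not be the method the trial authors used. The point estimate and standard error below are hypothetical values picked for illustration, not numbers quoted from the article, and the function name `noninferiority_p` is my own.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def noninferiority_p(rr, se_log_rr, margin):
    """One-sided p-value for the null hypothesis that the true RR is >= margin.

    Uses a normal approximation on the log scale (an assumption of this sketch).
    """
    z = (math.log(rr) - math.log(margin)) / se_log_rr
    return normal_cdf(z)

# Hypothetical inputs, for illustration only.
RR = 0.91   # assumed point estimate of the relative risk (new drug / warfarin)
SE = 0.10   # assumed standard error of log(RR)

margins = [1.10, 1.25, 1.46, 2.00]
p_values = [noninferiority_p(RR, SE, m) for m in margins]

for m, p in zip(margins, p_values):
    print(f"margin {m:.2f}: one-sided p for non-inferiority = {p:.2g}")
```

The data never change in this loop; only the margin does. Each wider margin yields a smaller p-value, which is why the number says more about the choice of delta than about the drugs.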
Now let's look at the 150 mg dose. Indeed, it is statistically significantly different from warfarin (I shall resist using the term "superior" here), and thus the authors claim superiority. But here again, the 95% CI is narrower than the margin of non-inferiority, and had the results gone the other direction (in favor of warfarin), as in Scenarios 3 and 4, we would still have claimed non-inferiority, even though warfarin would have been statistically significantly "better than" dabigatran! So it is unfair to claim superiority on the basis of a statistically significant result favoring dabigatran, but that's what they do. This is the problem that is likely to crop up when you make your margin of non-inferiority excessively wide, which you are wont to do if you wish to stack the deck in favor of your therapy.
But here's the real rub. Imagine if the world were the mirror image of what it is now and dabigatran were the existing agent for prevention of stroke in A-fib, and warfarin were the new kid on the block. If the makers of warfarin had designed this trial AND GOTTEN THE EXACT SAME DATA, they would have said (look at the left of the figure and the dashed red line there) that warfarin is non-inferior to the 110 mg dose of dabigatran, but that it is not non-inferior to the 150 mg dose of dabigatran. They would NOT have claimed that dabigatran was superior to warfarin, nor that warfarin was inferior to dabigatran, because the 95% CI of the difference between warfarin and dabigatran 150 mg crosses the pre-specified margin of non-inferiority. And to claim superiority of dabigatran, the 95% CI of the difference would have to fall all the way to the left of the dashed red line on the left. (See Piaggio, JAMA, 2006.)
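The mirror-image argument can be sketched in a few lines. The rule below is my paraphrase of the asymmetric convention described here (the sponsor's drug is called superior if the CI lies wholly below 1, non-inferior if it lies wholly below the margin), not code from any trial protocol. The interval endpoints are illustrative assumptions chosen to reproduce the pattern described in this post, not values quoted in it, and `sponsor_claim` and `mirror` are hypothetical helper names.

```python
def sponsor_claim(ci_low, ci_high, margin=1.46):
    """Claim made for the NEW drug, given the CI of RR (new / standard).

    Lower RR favors the new drug. Only the upper bound matters here,
    which is exactly the asymmetry under discussion.
    """
    if ci_high < 1.0:
        return "superior"
    if ci_high < margin:
        return "non-inferior"
    return "not non-inferior"

def mirror(ci_low, ci_high):
    """Same data, with the roles of new and standard drug swapped."""
    return 1.0 / ci_high, 1.0 / ci_low

# Illustrative 95% CIs for RR of dabigatran vs warfarin (assumed values).
intervals = {"110 mg": (0.74, 1.11), "150 mg": (0.53, 0.82)}

for dose, (lo, hi) in intervals.items():
    as_run = sponsor_claim(lo, hi)              # dabigatran is the new drug
    swapped = sponsor_claim(*mirror(lo, hi))    # warfarin is the new drug
    print(f"{dose}: dabigatran as new drug -> {as_run}; "
          f"warfarin as new drug -> {swapped}")

# An interval wholly above 1 (standard drug significantly better)
# still earns the new drug a "non-inferior" label under this rule.
print(sponsor_claim(1.05, 1.40))
```

Under this rule, identical data support a claim of "superior" when one sponsor runs the trial and only "not non-inferior" when the other does. Nobody in the mirror world is obliged to call dabigatran the better drug.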
The claims that result from a given dataset should not depend on who designs the trial, and which way the asymmetry of interpretation goes. But as long as we allow asymmetry in the interpretation of data, they shall. Heads they win, tails we lose.