Friday, August 20, 2010
Heads I Win, Tails it's a Draw: Rituximab, Cyclophosphamide, and Revising CONSORT
The recent article by Stone et al in the NEJM (see: http://www.nejm.org/doi/full/10.1056/NEJMoa0909905 ), which appears to [mostly] conform to the CONSORT recommendations for the conduct and reporting of NIFTs (non-inferiority trials, often abbreviated NIFs, but I think NIFTs ["Nifties"] sounds cooler), made me realize that I fundamentally disagree with the CONSORT statement on NIFTs (see JAMA, http://jama.ama-assn.org/cgi/content/abstract/295/10/1152 ) and indeed with the entire concept of NIFTs. I have previously discussed in this blog my disapproval of the asymmetry with which NIFTs are designed, such that they favor the new (and often proprietary) agent, but I will use this current article to illustrate why I think NIFTs should be done away with altogether and supplanted by equivalence trials.
This study rouses my usual and tired gripes about NIFTs: too large a delta, no justification for delta, use of intention-to-treat rather than per-protocol analysis, etc. It also describes a suspicious statistical maneuver which I suspect is intended to infuse the results (in favor of Rituximab/Rituxan) with extra legitimacy in the minds of the uninitiated: instead of simply stating (or showing with a plot) that the 95% CI excludes delta, thus making Rituxan non-inferior, the authors tested the hypothesis that the lower 95.1% CI boundary is different from delta, a test which yields a very small P-value (<0.001). This procedure adds nothing to the confidence interval in terms of interpretation of the results, but seems to imbue them with an unassailable legitimacy - the non-inferiority hypothesis is trotted around as if made iron-clad by this minuscule P-value, which is really just superfluous and gratuitous.
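To make the redundancy concrete, here is a minimal sketch, assuming a simple normal approximation for the difference in response rates (new agent minus standard). The function name, the standard error, and all numbers are hypothetical illustrations, not the trial's data or the authors' actual method. It shows that the one-sided P-value against the margin and the lower CI boundary are computed from the same two inputs, so the P-value falls below 0.025 exactly when the 95% CI excludes delta.

```python
from statistics import NormalDist

def ci_and_margin_p(diff, se, delta, level=0.95):
    """Return (lower, upper, one-sided P against the margin).

    diff  : estimated difference, new minus standard (hypothetical input)
    se    : standard error of that difference (hypothetical input)
    delta : non-inferiority margin, given as a positive number
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    lower = diff - z_crit * se
    upper = diff + z_crit * se
    # Test H0: true difference <= -delta against H1: true difference > -delta
    z = (diff + delta) / se
    p_one_sided = 1 - std_normal.cdf(z)
    return lower, upper, p_one_sided

# Made-up estimates: the two criteria always give the same verdict.
for diff in (-0.15, -0.05, 0.0, 0.11):
    lower, upper, p = ci_and_margin_p(diff, se=0.07, delta=0.20)
    print(diff, lower > -0.20, p < 0.025)
```

The further the lower boundary sits from the margin, the smaller the P-value becomes, but it never says anything the interval did not already say.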
But I digress - time to focus on the figure. Under the current standards for conducting a NIFT, in order to be non-inferior, you simply need a 95% CI for the preferred [and usually proprietary] agent with an upper boundary which does not include delta in favor of the comparator (scenario A in the figure). For your preferred agent to be declared inferior, the LOWER 95% CI for the difference between the two agents must exclude the delta in favor of the comparator (scenario B in the figure). For that to ever happen, the preferred/proprietary agent is going to have to be WAY worse than standard treatment. It is no wonder that such results are very, very rare, especially since deltas are generally much larger than is reasonable. I am not aware of any recent trial in a major medical journal where inferiority was declared. The figure shows you why this is the case.
Inferiority is very difficult to declare (the deck is stacked this way on purpose), but superiority is relatively easy to declare, because for superiority your 95% CI doesn't have to exclude an obese delta, but rather must just exclude zero with a point estimate in favor of the preferred therapy. That is, you don't need a mirror image of the 95% CI that you need for inferiority (scenario C in the figure), you simply need a point estimate in favor of the preferred agent with a 95% CI that does not include zero (scenario D in the figure). Looking at the actual results (bottom left in the figure), we see that they are very close to scenario D and that they would only have had to go a little bit more in favor of Rituxan for superiority to be declared. Under my proposal for symmetry (and fairness, justice, and logic), the results would have had to be similar to scenario C, and Rituxan came nowhere near to meeting criteria for superiority.
The reason it makes absolutely no sense to allow this asymmetry can be demonstrated by imagining a counterfactual (or two). Suppose that the results had been exactly the same, but that they had favored Cytoxan (cyclophosphamide) rather than Rituxan, that is, Cytoxan was associated with an 11% improvement in the primary endpoint. This is represented by scenario E in the figure; and since the 95% CI includes delta, the result is "inconclusive" according to CONSORT. So how can it be that the classification of the result changes depending on what we arbitrarily (a priori, before knowing the results) declare to be the preferred agent? That makes no sense, unless you're more interested in declaring victory for a preferred agent than you are in discovering the truth, and of course, you can guess my inferences about the motives of the investigators and sponsors in many/most of these studies. In another counterfactual example, scenario F in the figure represents the mirror image of scenario D, which represented the minimum result that would have allowed Stone et al to declare that Rituxan was superior. But if the results had favored Cytoxan by that much, we would have had another "inconclusive" result, according to CONSORT. Allowing this is just mind-boggling, maddening, and unjustifiable!
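The asymmetry can be sketched in a few lines of code. This is an illustration under stated assumptions, not the CONSORT text itself: the difference is taken as preferred agent minus comparator (positive favors the preferred agent), delta is a positive margin, and the function names and the example interval are hypothetical numbers chosen to resemble scenarios D and F, not the trial's results.

```python
def classify_current(lower, upper, delta):
    """Labels as this post describes current NIFT practice."""
    if lower > 0:
        return "superior"        # CI merely has to exclude zero
    if lower > -delta:
        return "non-inferior"    # CI excludes the margin
    if upper < -delta:
        return "inferior"        # whole CI must lie beyond the margin
    return "inconclusive"

def classify_symmetric(lower, upper, delta):
    """Proposed rule: superiority and inferiority face the same bar."""
    if lower > delta:
        return "superior"
    if upper < -delta:
        return "inferior"
    if lower > -delta and upper < delta:
        return "equivalent"
    return "inconclusive"

def mirror(lower, upper):
    """Same result, but favoring the other agent."""
    return -upper, -lower

delta = 0.20
scenario_d = (0.01, 0.21)          # hypothetical: CI barely excludes zero
scenario_f = mirror(*scenario_d)   # identical data, opposite direction

for name, ci in (("D", scenario_d), ("F", scenario_f)):
    print(name, classify_current(*ci, delta), classify_symmetric(*ci, delta))
```

Under the current rule the same magnitude of effect is called "superior" in one direction and "inconclusive" in the other; under the symmetric rule both directions get the same label.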
Given this "heads I win, tails it's a draw", it's no wonder that NIFTs are proliferating. It's time we stop accepting them, and require that non-inferiority hypotheses be symmetrical - in essence, making equivalence trials the standard operating procedure, and requiring the same standards for superiority as we require for inferiority.