Thursday, May 24, 2018

You Have No Idea of the Predictive Value of Weaning Parameters for Extubation Success, and You Probably Never Will

As Dr. O'Brien eloquently described in this post, many people misunderstand the Yang-Tobin (f/Vt) index as being a "weaning parameter" that is predictive of extubation success.  Far from it: its sensitivity, specificity, and resultant ROC curve relate to the ability of f/Vt after one minute of spontaneous ventilation to predict the success of a prolonged (~ one hour) spontaneous breathing trial.  But why would I want to predict the result of a test (the SBT), and introduce error, when I can just do the test and get the result an hour later?  It makes absolutely no sense.  What we want is a parameter that predicts extubation success.  But we don't have that, and we probably will never have that.

In order to determine the sensitivity and specificity of a test for extubation success, we will need to ascertain the outcome in all patients regardless of their performance on the test of interest.  That means we would have to extubate patients that failed the weaning parameter test.  In the original Yang & Tobin article, their cohort consisted of 100 patients.  60(%) of the 100 were said to have passed the weaning test and were extubated, and 40(%) failed and were not extubated.  (There is some over-simplification here based on how Yang & Tobin classified and reported events - it's not at all transparent in their article - the data to resolve the issues are not reported and the differences are likely to be small.  Suffice it to say that about 60% of their patients were successfully weaned and the remainder were not.)  Let's try to construct a 2x2 table to determine the sensitivity and specificity of a weaning parameter using a population like theirs.  Assume an 85% extubation success rate - that is, of the 60 patients with a positive or "passing" SBT score (based on whatever parameter), all were extubated and the positive predictive value of the test is 85% (the actual rate of reintubation in patients with a passing weaning test is not reported, so this is a guess).  The top row of the 2x2 table would then read: 60 passed the test, of whom 51 were successfully extubated and 9 were reintubated.

Note that without the data relating to how many who fail the SBT could have been successfully extubated, we cannot determine either sensitivity or specificity - only the positive predictive value.  Now, if we had a population in whom all the patients passed the SBT or the weaning parameter, we could determine the sensitivity (100%), the specificity (0%) and the positive predictive value.  Note that in this scenario, the positive likelihood ratio (sensitivity/(1-specificity)) is 1, meaning that the test has no predictive value whatsoever; the negative likelihood ratio ((1-sensitivity)/specificity) works out to 0/0, which is undefined and meaningless because it is based on no data.
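For readers who want the arithmetic spelled out, here is a short worked version of the all-pass scenario. The cell labels TP, FP, FN and TN are my own shorthand, not notation from the Yang & Tobin article: TP is passed and successfully extubated, FP is passed and reintubated, FN is failed the test but would have succeeded, and TN is failed the test and would have failed. It assumes at least one patient who passed was reintubated (FP > 0).

```latex
\[
\text{sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{specificity} = \frac{TN}{TN + FP}
\]
\[
LR^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}, \qquad
LR^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
\]
% If every patient passes the test, the bottom row is empty: FN = TN = 0.
\[
\text{sensitivity} = \frac{TP}{TP + 0} = 1, \qquad
\text{specificity} = \frac{0}{0 + FP} = 0
\]
\[
LR^{+} = \frac{1}{1 - 0} = 1, \qquad
LR^{-} = \frac{1 - 1}{0} = \frac{0}{0}
\]
```

The numerator and the denominator of the negative likelihood ratio are both zero, which is just another way of saying that no patient contributed any information about a failed test.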

Similarly, if the proportion of patients who pass a weaning test and are extubated is high, and most of the patients with a positive test are successfully extubated, the sensitivity will approach 100% but the specificity will still be low.  This yields a positive likelihood ratio (LR+) of 1.25 which is nearly uninformative, and a negative likelihood ratio (LR-) of 0.22 which is slightly informative, taking a prior from 90% to 67%:
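The step from a prior of 90% to a post-test probability of about 67% is the usual odds conversion: turn the prior into odds, multiply by the likelihood ratio, and convert back. A minimal sketch follows; the function name is mine, and the 90% prior and the two likelihood ratios are simply the figures quoted above.

```python
def posttest_probability(prior, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # A failed test (LR- = 0.22) applied to a 90% prior.
    print(round(posttest_probability(0.90, 0.22), 2))
    # A passed test (LR+ = 1.25) applied to the same prior.
    print(round(posttest_probability(0.90, 1.25), 2))
```

By the same arithmetic, a passed test with an LR+ of 1.25 moves the 90% prior only to about 92%, which is why I call it nearly uninformative.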

Suppose we take the 40 patients from the Yang and Tobin trial and we assume that one third of them could have been successfully extubated.  Now the 2x2 table looks like this, and the LR+ is 3.2 (mildly informative) and the LR- is 0.26 (mildly informative):

Patients who self-extubate, and by definition (or default or neglect or status quo) were not deemed fit for planned extubation, have a re-intubation rate of about 50%.  If half of the patients who failed the weaning test/parameter could have been successfully extubated analogously to self-extubators, the table looks like this with LR+ 2.3 (mildly informative) and LR- 0.42 (mostly uninformative):

Finally, suppose that the patients who fail the weaning test/parameter have a 70% extubation success rate. It may seem farcical, but we have no way of knowing, because nobody, to my knowledge, has studied patients who have failed weaning tests!  (Except me when I do what has fondly been called The Catholic Method:  "Pull and Pray")  Here are those numbers, representing LR+ 1.5 (uninformative) and LR- 0.61 (uninformative):
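The three counterfactual scenarios can be recomputed with a few lines of code. This is a sketch under the assumptions stated in the text, not data from the article: 60 patients pass and 40 fail, 85% of those who pass are successfully extubated (my guess from earlier), and the success rate among those who fail is whatever we choose to imagine. The function and variable names are my own.

```python
def likelihood_ratios(n_pass, n_fail, pass_success_rate, fail_success_rate):
    """Fill in a 2x2 table from assumed success rates and return
    (sensitivity, specificity, LR+, LR-) for predicting extubation success."""
    tp = n_pass * pass_success_rate   # passed, successfully extubated
    fp = n_pass - tp                  # passed, reintubated
    fn = n_fail * fail_success_rate   # failed, but would have succeeded
    tn = n_fail - fn                  # failed, and would have failed
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return sensitivity, specificity, lr_pos, lr_neg


if __name__ == "__main__":
    # Hypothetical success rates among the 40 patients who failed the test.
    for rate in (1 / 3, 0.5, 0.7):
        sens, spec, lr_pos, lr_neg = likelihood_ratios(60, 40, 0.85, rate)
        print(f"fail-group success {rate:.0%}: "
              f"sens {sens:.2f}, spec {spec:.2f}, "
              f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

The results land close to the figures quoted above; small differences in the second decimal place come from rounding fractional patients to whole numbers in the tables. The more of the "failures" who could in fact have been extubated, the closer both likelihood ratios drift toward 1.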

In the last several scenarios, the weaning parameter, which had such a good ROC curve in the 1991 article (albeit without surrounding 95% confidence intervals which were likely wide), has mild to almost no predictive value whatever for what we want to use it for: extubation success.  And it hinges entirely on the counterfactual:  if those failing the test had been extubated, where would they have fit into the 2x2 table?

We are not ever going to have these data, unless we design (and get IRB approval for) a study in which we select patients for a breathing test of some kind, then do the test and extubate all of them, regardless of the test result.  This is feasible, albeit unlikely to occur any time soon, because of the hurdles that must be overcome to conduct it.

As Dr. O'Brien mused in his post in 2007, perhaps we should just determine the prior probability of extubation success and, if it exceeds a certain threshold, go with it, predictive tests be damned.  I posit that this is true.  My experience, documented previously on this blog (here and here), is that we are blithely unaware of how many of the patients we think can't breathe on their own, based on testing or otherwise, actually can.  We may not get formal clinical trial evidence, but we can still amass experiential evidence.  Based on experiential prior probabilities alone, many patients deserve a "trial of extubation" rather than a series of weaning trials using "predictors" with unknown (and probably small) predictive value.

[Afterthought:  Yang and Tobin used both "objective" and "subjective" criteria for weaning failure and presumably (the reporting is vague) reintubation:
Twenty-eight patients met objective criteria for weaning failure because they had one or more of the following: a partial pressure of carbon dioxide ≥50 torr (7 kPa) (17 patients), an increase in partial pressure of carbon dioxide of ≥8 torr (1 kPa) (18 patients), a pH of arterial blood ≤7.33 (16 patients), a decrease in pH of ≥0.07 (19 patients), or a partial pressure of oxygen ≤60 torr (8 kPa) with a fraction of inspired oxygen ≥0.5 (5 patients). The remaining 12 patients met subjective criteria for weaning failure because diaphoresis, evidence of increasing effort, tachycardia, arrhythmias, or hypotension required the reinstitution of mechanical ventilation.
Perhaps some patients were reintubated but actually could have continued to breathe spontaneously in spite of meeting these criteria - this is yet another potential misclassification and ascertainment error that undermines the ability of a trial to set a clear path forward for extubation decisions.]
