Wednesday, April 14, 2021

Bias in Assessing Cognitive Bias in Forensic Pathology: The Dror Nevada Death Certificate "Study"

Following the longest hiatus in the history of the Medical Evidence Blog, I return to issues of forensic medicine, by happenstance alone. In today's issue of the NYT is this article about bias in forensic medicine, spurred by interest in the trial over the murder of George Floyd. Among other things, the article discusses a recently published paper in the Journal of Forensic Sciences that some forensic pathologists have called to be retracted. According to the NYT article, the paper showed that forensic pathologists have racial bias, a claim predicated upon an analysis of death certificates in Nevada and a survey of forensic pathologists using a methodology similar to one I have used in studying physician decisions and bias (viz., randomizing recipients to receive one of two forms of a case vignette that differ in the independent variable of interest). The remainder of this post will focus on that study, which is sorely in need of some post-publication peer review.

The study was led by Itiel Dror, PhD, a Harvard-trained psychologist now at University College London who studies bias, with a frequent focus on forensic medicine, if my cursory search is any guide. The other authors are a forensic pathologist (FP) at the University of Alabama at Birmingham (UAB); an FP and coroner in San Luis Obispo, California; a lawyer with the Clark County public defender's office in Las Vegas, Nevada; a PhD psychologist from Towson University in Towson, Maryland; an FP who is proprietor of a forensics company and a part-time medical examiner for West Virginia; and an FP who is proprietor of a forensics and legal consulting company in San Francisco, California. The purpose of identifying the authors was to try to understand why the analysis of death certificates was restricted to the state of Nevada. Other than one author's residence there, I cannot understand why Nevada was chosen, and the selection is not justified in the paltry methods section of the paper.

The introduction takes great pains to convince us that forensic pathology is rife with cognitive biases, but an examination of the supporting references shows that the gruel is quite thin, consisting partly of opinion pieces by the authors of the current study and generally lacking in empirical evidence, insofar as that differs from speculative exposition. One supporting reference is a book by Daniel Kahneman and Cass Sunstein that is not published yet (I pre-ordered it on Amazon); I'm not sure I have ever seen this before, referencing an unpublished book - how do the authors even know what the book says? Perhaps the best reference, as regards empirical evidence rather than speculation, is a systematic review of cognitive bias in the forensic sciences, which shows confirmation bias and problems with latent fingerprint analysis. (For the record, I have no doubts that the so-called forensic sciences are rife with bias; I am only interested in determining whether the supposed evidence of such bias is solidly intact.)

In the first part of the study, the authors describe their analysis of 1024 death certificates for children under 6 years of age in Nevada between 2009 and 2019. None of these selections are justified and, given a very marginal p-value for the results of the analysis, I think this is important. Would the results have been different if a different time period were used? A different age range? A different state? What was the methodology? How were the death certificates obtained and sorted by age? All of these questions relate to researcher degrees of freedom that ought to be justified, or at least explained, in the methods. (The statistical analysis, lean as it is, also ought to be specified, and it was not; the use of odds ratios rather than risk ratios [RR] betrays a lack of statistical sophistication and serves to overstate the RR in the survey part of the "study"; furthermore, I cannot reproduce the reported odds ratios, leading me to suspect that they may be in error.)

The findings of this part of the study are that, among the 23% of deaths not considered "natural" (which in this age group includes "accidental" and "homicide"), black children's deaths were classified as homicides 8.5% of the time, whereas for white children the figure was 5.6% (OR 1.81, 95% CI 1.01 to 3.25). Note that this result is barely statistically significant: if just a couple of outcomes were reclassified, it might lose statistical significance. This makes the choice of Nevada, the time period of study, the total number of certificates analyzed, and the age range of included children critical. Were these chosen after the fact to get the significant result? We are left to wonder.
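The arithmetic behind my reproducibility complaint can be sketched in a few lines of Python. This is a back-of-the-envelope check only: it takes the published percentages at face value, since the paper does not report the underlying counts, and the authors may have computed their OR on a different denominator or with some unreported adjustment. Even so, the crude OR from 8.5% versus 5.6% comes out well below the reported 1.81:

```python
# Recompute the crude odds ratio (OR) and risk ratio (RR) from the
# percentages reported in the paper: 8.5% of black children's non-natural
# deaths classified as homicide vs 5.6% of white children's.
# Sketch only -- the underlying counts are not given in the paper.

def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

p_black, p_white = 0.085, 0.056

rr = p_black / p_white                 # risk ratio
or_ = odds(p_black) / odds(p_white)    # crude odds ratio

print(f"RR = {rr:.2f}")   # ~1.52
print(f"OR = {or_:.2f}")  # ~1.57, vs the 1.81 reported in the paper
```

Rounding in the published percentages does not appear to be able to bridge the gap between roughly 1.57 and 1.81, which is part of why I suspect an error or an unreported calculation.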

The authors attribute this difference (which they say is due to overdetermination of homicide in black children compared to white children) to bias in the determination of manner of death (homicide versus accident) by whoever completed the death certificate, then they acknowledge uncertainty:

We must be careful in drawing conclusions about bias from these archival data, especially given that the ground truth of how these children actually died is unknown. For example, it is possible that Black children die from homicide more often than White children.

It is not just possible, it is a fact that black children die from homicide more often than white children. In 2006 in the journal Injury Prevention, Bennett et al. reported that black children are 4 times more likely to die of homicide than white children, based on the National Violent Death Reporting System. (Four of the authors hailed from the Etiology and Surveillance Branch, National Center for Injury Prevention and Control, Division of Violence Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA.) If there is any bias in the Nevada death certificate results, it likely lies in underdetermining homicide in black children. It should be obvious that, if so, perpetrators of child homicide are going undetected because not enough homicide determinations are being made on Nevada death certificates. There may be bias, but it most likely runs in the opposite direction of what Dr. Dror and his colleagues claim, for reasons the reader may infer. Yet another interpretation is that homicides are being overdetermined in white children, rendering the alleged racial bias topsy-turvy.

We turn now to the part of the study that is so preposterous that I don't think it's unreasonable to worry that duplicitousness was involved, or at least that the design was irresponsible, for reasons that will become apparent. (I should point out that it is not clear from the manuscript whether IRB approval was sought for the survey part of the study, how the privacy of respondents was protected, etc.) The authors recruited 133 respondents from the National Association of Medical Examiners (NAME) by emailing 713 members of NAME. Each participant received one of two versions of a case vignette. (Neither the randomization procedures, nor the numbers receiving each version, nor demographics other than age are reported.) All case vignettes described a child with injuries to the skull and intracranial bleeding who was found unconscious by a caregiver on the floor, and who later died. There was bruising on the head, neck, and extremities. In all these respects the vignettes were identical. The two forms of the vignette differed in the following respects: in one form, the child was black and the caregiver was the mother's boyfriend (they called this the "Black Condition"); in the other form, the child was white and the caregiver was the child's grandmother (the "White Condition"). If you're inclined to read that twice to make sure there is no mistake, you will find yourself in good company. There are actually 4 (four!) independent variables, or differences between the vignettes:

  1. the race of the child
  2. the gender of the caregiver
  3. the age of the caregiver (which can be inferred)
  4. whether there is a blood relation between the caregiver and the child
It is here (but not only here) that the design is irresponsible. When you do an experiment like this, you have the opportunity to isolate the independent variable so that you know, or can strongly infer, what caused any differences in the dependent variable (here, the determination of manner of death, accident or homicide) that you may find. In this study, you cannot make strong inferences as to the cause of the differences found in the manner of death, because they could be due to any of the 4 enumerated differences. Confounding was built into the design, and, since the authors should have known better, it is reasonable to question their motives for doing this. It is - or should be - obvious that young men commit most homicides, and most reasonable people would infer that a 3.5-year-old child's mother's boyfriend will be a young man; that grandmothers rarely kill their grandchildren; that few murderers are older women; and that killing step-children (non-blood relatives) is more common than killing children who are blood relatives. I'm not even going to bother to look up references for those alleged facts, because the average person would make the same inferences even in the absence of supporting data. So of course the FPs would determine that homicide was more likely in the "Black Condition" (which was really the "black child cared for by a young man who is not a blood relative" condition) than in the "White Condition" (which was really the "white child cared for by his blood-relative grandmother, an older woman" condition). Is it any surprise that the determination was homicide in 35% of the "Black Condition" vignettes versus 13% of the "White Condition" vignettes? (The RR of a determination of homicide for the 2 conditions is 2.7; I cannot reproduce the OR of 12.0 that they claim in the paper. It does not help that they didn't report the number of respondents for each version of the vignettes; suffice it to say that the abecedarian methods of this paper are insufficient to reproduce the results, and I'm concerned about incorrect analyses.)
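The survey percentages permit the same rough check as the death certificate figures. Treating the reported 35% and 13% as simple proportions - a sketch only, since the per-arm counts were never reported and a different denominator or adjustment could change the number - the crude OR is nowhere near 12:

```python
# Recompute the crude OR and RR for the vignette survey from the reported
# percentages: homicide determinations in 35% of "Black Condition"
# vignettes vs 13% of "White Condition" vignettes.
# Sketch only -- the number of respondents per arm was not reported.

def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

p_black_cond, p_white_cond = 0.35, 0.13

rr = p_black_cond / p_white_cond                  # risk ratio
or_ = odds(p_black_cond) / odds(p_white_cond)     # crude odds ratio

print(f"RR = {rr:.2f}")   # ~2.69, matching the 2.7 cited above
print(f"OR = {or_:.2f}")  # ~3.60 -- far from the 12.0 claimed in the paper
```

Whatever calculation produced an OR of 12.0, it cannot be recovered from the published percentages alone, which is exactly the problem with methods this spare.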

Whether or not it was a surprise to the authors, they were quick to explain it as an example of bias consistent with, and corroborating, the alleged bias they found in the Nevada death certificates. One does at this point wonder whether the confounding caused by the four independent variables was intentionally duplicitous, done to lead to the desired result. They perform some legerdemain in the discussion of the vignette results, claiming that this "medically irrelevant contextual information" about the caregiver should not have influenced the homicide determinations. This beggars belief. If the information is not relevant, why was it presented? If respondents were told that the child had been under video monitoring in the hospital for a week leading up to the death, should they have discounted that information, too, in their determination of homicide versus accident? Because we are not told what instructions respondents were given (whether they should base their determination solely on the medical examination or also consider context), we don't know whether respondents thought they should incorporate this information. Since in practice it would be folly not to, as the video example shows, it is odd to imply that the use of contextual information constitutes a bias. In any case, the methodology is so lackadaisical that the results cannot be used to confidently support a claim of bias, racial or otherwise.

In the NYT article, Dr. Brian Peterson, the Milwaukee County Medical Examiner and one of the FPs who solicited the Journal of Forensic Sciences to retract the article, said:
“By basically accusing every member of ‘unconscious’ racism, a charge impossible to either prove or refute, members will henceforth need to confront this bogus issue whenever testifying in court.”

Since NAME has only 713 members, the fact that 18.7% of them responded to the case vignettes does raise issues about sullying the reputation of the entire group. Since that is a risk of the study, it makes it all the more imperative that it be conducted with rigor, the likes of which we are not seeing at any level. I wonder what NAME has to say about all of this, whether the authors got permission from NAME to use the mailing list, and whether NAME would be inclined to allow another such study of its members. This latter point - the possibility of additional study - raises serious concerns about whether it was irresponsible to conduct this slipshod type of "study". Because of its methodological failings, few reasonable conclusions can be drawn from it (except that the authors are more guilty of the kinds of cognitive biases they accuse their subjects of, including the "bias blind spot"), and it may be impossible to conduct it again because of the Hawthorne effect, magnified by the publicity and controversy surrounding the current study. How now can NAME reclaim its honor? Sometimes you only get one chance, and it should not be squandered by sloppy methodology.

The authors end the article with a long, rambling, polemical set of patronizing policy proposals that are not unlike previous writings on the subject. They contain many good ideas - good enough to stand on their own, and not in need of buttressing by data procured with slapdash methodology, tendentious apologetics, and reckless disregard for consequences and the truth.
