INS NYC 2024 Program

Poster

Poster Session 11 Program Schedule

02/17/2024
10:45 am - 12:00 pm
Room: Shubert Complex (Posters 1-60)

Poster Session 11: Cultural Neuropsychology | Education/Training | Professional Practice Issues


Final Abstract #35

Overlooked Effects of Biases in Assessments: Implications of Pervasive Racial Bias on the CERAD List Learning Test

William Goette, University of Texas Southwestern Medical Center, Dallas, United States
Heidi Rossetti, University of Texas Southwestern Medical Center, Dallas, United States
Linda Hynan, University of Texas Southwestern Medical Center, Dallas, United States
Munro Cullum, University of Texas Southwestern Medical Center, Dallas, United States
Laura Lacritz, University of Texas Southwestern Medical Center, Dallas, United States

Category: Inclusion and Diversity/Multiculturalism

Keyword 1: demographic effects on test performance
Keyword 2: minority issues
Keyword 3: psychometrics

Objective:

Concerns about bias in cognitive assessments have existed since their widespread adoption in the early 1900s. Although methods such as differential item functioning (DIF) and multigroup confirmatory factor analysis (MGCFA) were developed to investigate these concerns, the assumptions underlying these statistical methods can make their results misleading. Both assume that any quantified bias should be minimal and attributable to specific items, implying that those items are anomalous even though most items on a test are very similar to one another. The pervasive bias approach, an extension of item response theory (IRT), loosens these assumptions by treating the shared qualities of items, rather than the individual items themselves, as the source of bias. To evaluate how statistical assumptions shape conclusions about bias, the detection and estimated magnitude of racial bias on a word list-learning test were compared between the pervasive bias approach and four common alternative methods.

Participants and Methods:

Data are from participants (n=2074, MeanAge=75.4, MeanEduc=13.2) of the Harmonized Cognitive Assessment Protocol who completed the CERAD list-learning test in English, were determined to be cognitively normal, and self-identified as either racially Black (17%) or White (83%). The CERAD consists of 10 words presented over three learning trials and a delayed free recall trial, resulting in 40 test items. Seventeen characteristics of the words’ orthography/phonology, semantic associations, and general familiarity were obtained from the South Carolina Psycholinguistic Metabase. Evidence for assessment bias from five different methods was compared: DIF, MGCFA, and t-tests on raw scores, unadjusted IRT scores, and pervasive bias IRT scores.

Results:

DIF detection identified 6-7 of the 40 items as biased. In contrast, MGCFA found no parameter differences between the groups. A t-test on raw scores revealed that White participants performed 1.19 points higher on average (d=0.20, 95%CI: 0.08-0.31). Without modeling bias, Black participants had lower average IRT-based scores (d=0.25, 95%CI: 0.20-0.30). With pervasive bias modeling, the pattern reversed: Black participants’ average IRT-based scores were estimated to be higher than White participants’ (d=0.24, 95%CI: 0.15-0.34). Despite this higher IRT-estimated ability, the small 1.19-point average difference in words recalled was attributable to the model’s finding that test items were more difficult for Black than for White participants (d=0.47, 95%CI: 0.21-0.76).
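The effect sizes above are Cohen's d values with 95% confidence intervals. For reference, d for a two-group mean difference is the difference in means over the pooled standard deviation, with an approximate large-sample CI from its standard error. A sketch on illustrative data; the group sizes and score distributions below are invented, not the study data:

```python
import numpy as np

def cohens_d_ci(x, y, z=1.96):
    """Cohen's d (pooled SD) with an approximate large-sample 95% CI."""
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1))
                        / (nx + ny - 2))
    d = (np.mean(x) - np.mean(y)) / pooled_sd
    # Approximate standard error of d for two independent groups
    se = np.sqrt((nx + ny) / (nx * ny) + d**2 / (2 * (nx + ny)))
    return d, d - z * se, d + z * se

# Illustrative groups with a small mean difference (not the study data)
rng = np.random.default_rng(1)
group_a = rng.normal(loc=20.0, scale=6.0, size=1720)
group_b = rng.normal(loc=18.8, scale=6.0, size=354)
d, lo, hi = cohens_d_ci(group_a, group_b)
print(round(d, 2), round(lo, 2), round(hi, 2))
```

With a true mean difference of 1.2 points and SD of 6, d lands near 0.2, comparable in magnitude to the raw-score effect reported above.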

Conclusions:

These results demonstrate the importance of understanding the implicit assumptions of statistical methods and how those assumptions shape analytic results. Each examined method supported a different conclusion about the presence or clinical relevance of assessment bias. Unlike DIF and MGCFA, the pervasive bias model makes less restrictive assumptions about how bias can affect the measurement characteristics of a test, rather than assuming that such differences are statistical anomalies or measurement error. Pervasive bias modeling has several advantages over the other methods: it explicitly permits bias to (a) exist at a level other than individual items, (b) interact with characteristics of test takers (e.g., educational quality, acculturation), (c) be explained as a function of these person and item characteristics, and (d) adjust the measurement of individual persons’ abilities based on these factors.