Introducing the NIHTB V3 Cognition Battery: Results from a Large-Scale Norming Study Across the Life Span

Emily Ho, Northwestern University, Chicago, United States
Aaron Kaat, Northwestern University, Chicago, United States
Erica LaForte, Northwestern University, Chicago, United States
Amy Giella, Northwestern University, Chicago, United States
Julie Hook, Northwestern University, Chicago, United States
Richard Gershon, Northwestern University, Chicago, United States


Objective:

To describe the results of a large-scale norming study of the NIH Toolbox Cognition Battery (V3), including measures of convergent validity, divergent validity, and other relevant psychometric indices.

Participants and Methods:

A large-scale, nationally representative US sample of N = 3848 participants (52.3% female), aged 3-94, completed the newly revised NIH Toolbox Cognition Battery. The battery included two measures of crystallized intelligence and five measures of fluid intelligence (e.g., executive functioning, memory, and nonverbal reasoning). Two newly developed tests, Speeded Matching and Visual Reasoning, were also administered. A subset of 200 participants completed a retest 7 to 21 days later. Convergent validity measures, such as the Wechsler tests and the California Verbal Learning Test, were also administered.

We performed a raking procedure to assign a population weight to each participant. For each individual test, candidate psychometric models were tested, one was selected, and person ability estimates were calculated. When possible, an IRT model was chosen (e.g., a Rasch or 2-parameter model). Change Sensitive Scores (CSS), which are criterion-referenced estimates of examinee ability, were then established by linearly transforming the logit metric and adding a meaningful constant. Age-Adjusted and Age-and-Education-Adjusted scores were subsequently developed using continuous norming procedures. Factor analyses of the core cognition tests were guided by a two-factor theory of cognition.
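The raking step mentioned above can be illustrated with a minimal sketch of iterative proportional fitting. This is not the study's actual procedure: the raking variables (sex and a two-level age group), the toy sample, and the target proportions below are all hypothetical and chosen only to show how weights are adjusted until the weighted sample margins match known population margins.

```python
from collections import defaultdict


def rake(records, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking).

    records: list of dicts mapping variable name -> category
    targets: dict mapping variable name -> {category: population proportion}
    Returns one weight per record; weights sum to len(records).
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for var, margins in targets.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for rec, w in zip(records, weights):
                totals[rec[var]] += w
            total_weight = sum(weights)
            # Scale each weight so this variable's margin matches its target.
            for i, rec in enumerate(records):
                cat = rec[var]
                factor = margins[cat] * total_weight / totals[cat]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        # Stop once a full pass barely changes any weight.
        if max_change < tol:
            break
    return weights


def weighted_share(records, weights, var, cat):
    """Weighted proportion of the sample falling in one category."""
    hit = sum(w for rec, w in zip(records, weights) if rec[var] == cat)
    return hit / sum(weights)


# Hypothetical toy sample and population margins (not study data).
records = [
    {"sex": "F", "age": "child"},
    {"sex": "F", "age": "adult"},
    {"sex": "F", "age": "adult"},
    {"sex": "M", "age": "child"},
    {"sex": "M", "age": "adult"},
]
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "age": {"child": 0.25, "adult": 0.75},
}

weights = rake(records, targets)
```

After raking, `weighted_share(records, weights, "sex", "F")` and the other weighted margins should agree with the target proportions to within the tolerance, while the weights still sum to the sample size.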


Results:

The growth curve for each measure follows its hypothesized trajectory across the life span. Confirmatory factor analyses showed that a two-factor model separating fluid and crystallized intelligence fit well, with moderate correlations between the two factors (0.46 for children and 0.41 for adults). The convergent validity correlation between the total Toolbox cognition composite and the Wechsler tests was 0.64, demonstrating good convergence with established gold standards.


Conclusions:

NIH Toolbox is a multidimensional set of assessments meant to be a “common currency” across diverse study designs and research settings. The updated NIH Toolbox V3 incorporates new scientific developments in neuropsychology and psychometrics and includes two validated measures of processing speed and nonverbal reasoning, respectively. It shows good convergence with established gold standards and a robust factor structure that aligns with a two-factor model of cognition.

Category: Assessment/Psychometrics/Methods (Adult)

Keyword 1: cognitive functioning
Keyword 2: aging (normal)
Keyword 3: neuropsychological assessment