INS NYC 2024 Program

Poster

Poster Session 06 Program Schedule

02/15/2024
04:00 pm - 05:15 pm
Room: Majestic Complex (Posters 61-120)

Poster Session 06: Aging | MCI | Neurodegenerative Disease - PART 2


Final Abstract #77

A Pipeline to Automatically Capture Speech and Language Features in Alzheimer’s Disease

Brittany Morin, University of California, San Francisco, United States
David Rosado Rolon, University of California, San Francisco, United States
Buddhika Ratnasiri, University of California, San Francisco, United States
David Paul Baquirin, University of California, San Francisco, United States
Zoe Ezzes, University of California, San Francisco, United States
Lisa Wauters, University of Texas, Austin, United States
Maya Henry, University of Texas, Austin, United States
Zachary Miller, University of California, San Francisco, United States
Maria Luisa Mandelli, University of California, San Francisco, United States
Bruce Miller, University of California, San Francisco, United States
Maria Luisa Gorno Tempini, University of California, San Francisco, United States
Jet Vonk, University of California, San Francisco, United States

Category: Neurodegenerative Disorders

Keyword 1: speech
Keyword 2: language
Keyword 3: dementia - Alzheimer's disease

Objective:

Early decline in spontaneous language production can be an important indicator of Alzheimer’s disease. It is therefore important to identify which speech and language features differ between healthy older adults and those with dementia. To this end, we built an automated pipeline that extracts lexical and acoustic features from older adults’ spontaneous speech samples, and we examined which features differed between healthy aging and Alzheimer’s disease.

Participants and Methods:

We built our pipeline on freely available tools and software. First, speech samples from a picture description task (Picnic Scene) were automatically transcribed from audio recordings to text with OpenAI’s Whisper. Next, we used a combination of tools, predominantly in Python, to extract features across the full range of linguistic domains (phonetics, morphology, lexical, semantics, syntax, pragmatics). To test the pipeline’s application, we analyzed samples from 25 individuals with Alzheimer’s disease (mean age = 67 ± 8 years) and 29 cognitively healthy older adults (mean age = 76 ± 7 years). We conducted a MANCOVA adjusting for age, sex/gender, and years of education to identify individual features that differed between groups. We then built stepwise logistic regression models to identify which combination of features best predicted diagnostic group and tested the models’ sensitivity and specificity with ROC curve analysis.
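As a rough illustration of the final evaluation step, the sketch below computes sensitivity, specificity, and the area under the ROC curve from predicted probabilities. It is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are all assumptions made up for this example, and a real analysis would typically rely on a statistics package rather than hand-written functions.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Treat score >= threshold as a positive prediction (1 = patient group)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data (not study data): 1 = patient, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # hypothetical model probabilities

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sweeping the threshold from 0 to 1 and plotting sensitivity against 1 minus specificity traces the ROC curve; the pairwise formula gives the area under that curve without plotting it.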

Results:

The pipeline extracted 173 speech and language metrics. The MANCOVA identified 20 speech metrics that showed group differences. After adjusting for multicollinearity, 16 features were entered into logistic regression models using forward and backward stepwise regression for optimal feature selection. Both the forward and backward selection models identified two significant features: mean lexical concreteness and conjunction-to-total-word ratio. Mean lexical concreteness (the extent to which a word denotes a concept that can be experienced by the senses) was measured across participants’ nouns, verbs, adjectives, and adverbs. The conjunction ratio was computed as the number of conjunctions divided by the total number of words uttered. Confusion matrices for diagnostic status showed an overall predictive accuracy of 85.2%. ROC curve analysis showed that the backward stepwise logistic regression model achieved a prediction accuracy of 94.2% and the forward stepwise model 91.2% for predicting diagnosis. Both ROC models were significant (p < .001).
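The two selected features can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline itself: the conjunction list and the concreteness ratings below are hypothetical placeholders (a real implementation would use a part-of-speech tagger and a published set of concreteness norms), and the sketch does not restrict the concreteness average to nouns, verbs, adjectives, and adverbs by tag; it simply averages over whichever words appear in the lookup table.

```python
import re

# Hypothetical placeholders for illustration only.
CONJUNCTIONS = {"and", "but", "or", "so", "because", "while"}
# Made-up ratings on an assumed 1-5 scale; not taken from any published norms.
CONCRETENESS = {"boy": 4.9, "kite": 5.0, "flying": 3.5, "happy": 2.5}


def tokenize(text):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def conjunction_ratio(tokens, conjunctions=CONJUNCTIONS):
    """Number of conjunctions divided by the total number of words."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in conjunctions) / len(tokens)


def mean_concreteness(tokens, norms=CONCRETENESS):
    """Average rating over the tokens that have a rating; None if none do."""
    rated = [norms[t] for t in tokens if t in norms]
    return sum(rated) / len(rated) if rated else None


# Made-up example sentence, not a study transcript.
sample = "The boy is flying a kite and the dog is happy but tired"
tokens = tokenize(sample)
print(conjunction_ratio(tokens), mean_concreteness(tokens))
```

Returning `None` when no token has a rating keeps a missing value distinct from a genuinely low concreteness score, which matters when the features feed a regression model.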

Conclusions:

Our pipeline and analyses identified two language features that can serve as indicators of abnormal spontaneous speech in Alzheimer’s disease: compared with healthy older adults, individuals with Alzheimer’s disease produced words with higher mean lexical concreteness and used more conjunctions while describing a picture scene. Using language analysis to identify cognitive signs may contribute to early diagnosis and promote early intervention. Our automated pipeline could offer a quick, non-invasive, remote, scalable, and low-cost cognitive tool for tailored assessment and longitudinal monitoring, and its use could readily be extended to other types of dementia and other neurological syndromes.