Language use and social interactions are closely related to cognitive measures. A better understanding of language use and of behavioral indicators drawn from social context is important for studying the early prediction of cognitive decline in healthy older adults.
This study aimed to predict an important cognitive ability, working memory, in 98 healthy older adults who took part in a 4-day naturalistic observation study. We used linguistic measures, part-of-speech (POS) tags, and social context information extracted from 7450 real-life audio recordings of their everyday conversations.
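To make "linguistic measures" and "POS tags" concrete, the following minimal sketch computes two simple transcript features: a type-token ratio (a common proxy for lexical diversity) and the relative frequency of each POS tag. The function names, the tag labels, and the hand-tagged utterance are illustrative assumptions; the specific measures and the tagger used in the study are not given in this abstract.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Share of distinct word forms among all tokens (0.0 for empty input)."""
    words = [t.lower() for t in tokens]
    return len(set(words)) / len(words) if words else 0.0


def pos_proportions(tagged_tokens):
    """Relative frequency of each POS tag in a list of (token, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


# Hypothetical, hand-tagged utterance (assumption: a real pipeline would
# obtain the tags from an NLP tagger applied to the transcripts).
tagged = [("I", "PRON"), ("saw", "VERB"), ("the", "DET"), ("dog", "NOUN"),
          ("and", "CCONJ"), ("the", "DET"), ("cat", "NOUN")]
tokens = [tok for tok, _ in tagged]

ttr = type_token_ratio(tokens)    # 6 distinct forms out of 7 tokens
props = pos_proportions(tagged)   # e.g. props["NOUN"] is 2 out of 7
```

Features of this kind would be computed per participant and then passed to the prediction models together with the social context variables.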
The methods comprised (1) generating linguistic measures representing idea density, vocabulary richness, and grammatical complexity, together with POS tags, by applying natural language processing (NLP) to the transcripts of real-life conversations and (2) training machine learning models to predict working memory from the linguistic measures, POS tags, and social context information. We measured working memory using (1) the Keep Track test, (2) the Consonant Updating test, and (3) a composite score based on the Keep Track and Consonant Updating tests. We trained machine learning models using the random forest, extreme gradient boosting, and light gradient boosting machine algorithms. To avoid overfitting, we used repeated cross-validation with different numbers of folds and repeats, together with recursive feature elimination.
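The repeated cross-validation scheme described above can be sketched without any external library. The example below reshuffles the participants on each repeat, splits them into folds, and scores a mean-predictor baseline on each validation fold. The helper names, the baseline definition, the error metric (RMSE), and the synthetic scores are all assumptions made for illustration; the study's actual models (random forest, extreme gradient boosting, light gradient boosting machine) and its recursive feature elimination step are not reproduced here.

```python
import random
from statistics import mean


def repeated_kfold(n_samples, n_folds, n_repeats, seed=0):
    """Yield (train_idx, val_idx) for each fold of each shuffled repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Striding over the shuffled indices gives near-equal fold sizes.
        folds = [idx[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            val = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, val


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred)) ** 0.5


def mean_baseline(train_y, n_val):
    """Predict the training-fold mean for every validation sample."""
    m = mean(train_y)
    return [m] * n_val


# Hypothetical scores for 20 participants (not data from the study).
rng = random.Random(42)
scores = [rng.gauss(0.0, 1.0) for _ in range(20)]

fold_errors = []
for train, val in repeated_kfold(len(scores), n_folds=5, n_repeats=3):
    preds = mean_baseline([scores[i] for i in train], len(val))
    fold_errors.append(rmse([scores[i] for i in val], preds))

# Average validation error of the baseline over 5 folds x 3 repeats.
baseline_rmse = mean(fold_errors)
```

A candidate model would be evaluated on the same splits by replacing `mean_baseline` with a function that fits on the training fold and predicts the validation fold, so that its average validation error can be compared directly with `baseline_rmse`.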
For all three prediction routines, models comprising linguistic measures, POS tags, and social context information improved on baseline performance on the validation folds. The best model for the Keep Track prediction routine comprised linguistic measures, POS tags, and social context variables. The best models for predicting the Consonant Updating score and the composite working memory score comprised POS tags only.
The results suggest that machine learning and NLP may support the prediction of working memory using, in particular, linguistic measures and social context information extracted from the everyday conversations of healthy older adults. Our findings may support the design of an early warning system for longitudinal studies that collects cognitive ability scores and unobtrusively records real-life conversations. Such a system may support the timely detection of early cognitive decline. In particular, a privacy-sensitive passive monitoring technology would make it possible to design an intervention program whose strategies and treatments reduce or prevent early cognitive decline.