Enhancing research on speech and deep learning through holistic acoustic analysis

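Basic information

  • Award number:
    2219843
  • Principal investigator:
  • Amount:
    $1 million
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Duration:
    2022-08-15 to 2026-07-31
  • Project status:
    Ongoing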

Project abstract

You can guess a lot about a person from the way they pronounce words. Remarkably, human listeners can tell if it is likely that talkers learned English as a first language or a second language, or if the talkers might have a brain injury that makes it difficult for them to speak. Such intuitions rely on human listeners’ holistic pattern recognition abilities; these allow us to perceive the important, meaningful, yet subtle differences between pronunciations. However, the methods scientists currently use to measure speech objectively – based on a small number of properties of speech sounds – fail to capture these differences, hampering our ability to use speech to learn about the mind and brain. This project brings together speech scientists, computer scientists, and neuroscientists to test a radically different approach to this problem. Machine learning will be used to discover a new method for quantifying differences between spoken utterances based on holistic pattern recognition. This will be tested against new and existing data from bilingual speakers. If successful, this will yield a fully general method that can be applied to speech from any language or any domain of language usage, allowing scientists to capitalize on the wealth of information in speech to develop powerful new insights into the mind and brain. Improved detection of subtle problems with pronunciation, such as those that occur with Alzheimer’s disease, will advance our understanding of the brain mechanisms that humans use to produce speech. The results of this testing will also allow computer scientists to advance our understanding of how machine learning algorithms process sounds, driving improvements in the algorithms and supporting applications in any area of speech and language technology that relies on spoken language processing.
Speech variability across talkers provides a treasure trove of information for cognitive neuroscientists, leading to important insights into the cognitive mechanisms underlying language processing and potentially providing early signs of brain dysfunction. Current studies of speech are hamstrung by analyses that require preselecting specific temporal scales and acoustic dimensions. We propose a radically different approach: using unsupervised deep learning to discover a representational space for analysis of acoustic variation. To test this highly general approach, this method will be compared to current state-of-the-art methods for analyzing individual variation in bilingual speech. This includes using the acoustic variation in second language speech to predict intelligibility and to detect difficulties in code-switching, particularly the challenges faced by individuals with Alzheimer’s disease. The results will inform development of deep learning and cognitive neuroscience. The machine learning algorithm is fully general; it can be applied to speech from any language or any domain of language usage, expanding the range of populations and contexts that can be served by speech technology or studied by cognitive neuroscientists. The project’s integrative approach will allow computer scientists to advance our understanding of the extent to which modern deep learning architectures do or do not approximate human speech processing and allow cognitive neuroscientists to further our understanding of how meaningful acoustic distinctions are represented in speech perception and production.
This project is funded by the Integrative Strategies for Understanding Neural and Cognitive Systems (NCS) program, which is jointly supported by the Directorates for Computer and Information Science and Engineering (CISE), Education and Human Resources (EHR), Engineering (ENG), and Social, Behavioral, and Economic Sciences (SBE). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Inhibitory control of the dominant language: Reversed language dominance is the tip of the iceberg
  • DOI:
    10.1016/j.jml.2023.104410
  • Publication date:
    2023-01-23
  • Journal:
  • Impact factor:
    4.3
  • Authors:
    Goldrick, Matthew;Gollan, Tamar H.
  • Corresponding author:
    Gollan, Tamar H.
Advancement of phonetics in the 21st century: Exemplar models of speech production
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Goldrick, Matthew;Cole, Jennifer
  • Corresponding author:
    Cole, Jennifer

Other publications by Matthew Goldrick

Language and the Brain: Developments in Neurology/Neuroscience, Linguistics, and Psycholinguistics
  • DOI:
  • Publication date:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lise Menn;Matthew Goldrick
  • Corresponding author:
    Matthew Goldrick
The perception of code-switched speech in noise.
  • DOI:
    10.1121/10.0025375
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    1
  • Authors:
    M. Gavino;Matthew Goldrick
  • Corresponding author:
    Matthew Goldrick

Other grants held by Matthew Goldrick

Doctoral Dissertation Research: The effects of experience and attitudes on heritage bilinguals' language processing
  • Award number:
    2141430
  • Fiscal year:
    2022
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Doctoral Dissertation Research: Role of Prior Knowledge in Consolidation of Novel Phonotactic Patterns for Speech Production
  • Award number:
    2116802
  • Fiscal year:
    2021
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Doctoral Dissertation Research: Why adapt? Phonotactic learning as non-native language adaptation
  • Award number:
    1728173
  • Fiscal year:
    2017
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Doctoral Dissertation Research on the Role of Domain-General Executive Functions in Language Production: Resolving conflict in lexical selection
  • Award number:
    1420820
  • Fiscal year:
    2014
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Doctoral Dissertation Research: Learning of Novel Phonetic Categories After Training in Perception and Production
  • Award number:
    0951943
  • Fiscal year:
    2010
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
CAREER: Integrating Grammatical and Psycholinguistic Approaches to Phonological Processes in Speech Production
  • Award number:
    0846147
  • Fiscal year:
    2009
  • Award amount:
    $1 million
  • Award type:
    Standard Grant

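Similar NSFC grants

Research on highly integrated microwave/millimeter-wave antennas supporting two-dimensional millimeter-wave beam scanning
  • Award number:
    62371263
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Award type:
    General Program
Research on tandem Heck/denitrogenative rearrangement reactions of hydrazones
  • Award number:
    22301211
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Research on synergistic performance regulation and dendrite suppression mechanisms in aqueous zinc-ion batteries
  • Award number:
    52364038
  • Approval year:
    2023
  • Award amount:
    CNY 330,000
  • Award type:
    Regional Science Fund
Pathogenic role and mechanism of TSPYL1 mutations in sudden infant death syndrome, studied with a human serotonergic neuron reporter system
  • Award number:
    82371176
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Award type:
    General Program
Mechanism of FOXO3 m6A methylation-induced trophoblast senescence in kidney-tonifying treatment of spontaneous abortion
  • Award number:
    82305286
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund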

Similar overseas grants

Enhancing efficacy of speech modification strategies for pediatric dysarthria
  • Award number:
    10438101
  • Fiscal year:
    2022
  • Award amount:
    $1 million
  • Award type:
Enhancing efficacy of speech modification strategies for pediatric dysarthria
  • Award number:
    10610448
  • Fiscal year:
    2022
  • Award amount:
    $1 million
  • Award type:
Collaborative Research: Enhancing Speech Science Training through Collaboration: Investigating Perception of a Variable Speech Signal
  • Award number:
    2126888
  • Fiscal year:
    2021
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Collaborative Research: Enhancing Speech Science Training through Collaboration: Investigating Perception of a Variable Speech Signal
  • Award number:
    2126897
  • Fiscal year:
    2021
  • Award amount:
    $1 million
  • Award type:
    Standard Grant
Enhancing the quality of CBT in community mental health through AI-generated fidelity feedback
  • Award number:
    10324974
  • Fiscal year:
    2021
  • Award amount:
    $1 million
  • Award type: