Computational Methods for Speech Analysis
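Basic information
- Award number: 2120087
- Principal investigator:
- Amount: $249,300
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-08-01 to 2024-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract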
This research project will develop tools for testing hypotheses about human communication. Researchers generally study human communication from textual transcripts which omit vocal tone. The project will directly address the disconnect between the data-generating process - in which speakers and listeners use the auditory channel to convey both textual and non-textual signals - and the widespread practice of discarding speech audio. The investigators will extend their prior speech model, The Model of Audio and Speech Structure, to address some limitations of the model. In particular, the statistical extensions will accommodate multiple speakers and allow for the joint modeling of text and tone. To demonstrate the value of the statistical extensions, the model will be applied to two original video corpora - police body-worn camera footage and campaign speeches for federal office. New software will be developed that makes it easy for researchers to quickly annotate a large amount of speech audio. The browser-based tools will enable automatic and manual segmentation, along with labeling. Multiple graduate students will gain experience in computationally intensive research and software development. The tools to be developed will be incorporated into ongoing public-private collaborations to improve oversight of police officers in the field.
This research project will extend the Model of Audio and Speech Structure (MASS), which analyzes conversation as a nested stochastic process in which (i) the flow of conversation unfolds as a sequence of utterances transitioning between speakers and their vocal tones, based on contextual covariates; and (ii) the auditory signal within each utterance unfolds as a hidden Markov model that transitions between phonemes which generate sound. The model enables social scientists to test hypotheses about how conversations are structured by fixed covariates (e.g., speaker gender, conversation role) and time-varying covariates (e.g., exogenous external stimuli, endogenous conversation trajectory such as the previous speaker's tone). In its current implementation, however, MASS has two key limitations: First, it uses resource-intensive human annotations of tone for each speaker, which limits application to contexts with many unique speakers, such as police body-worn camera footage. This project will develop extensions allowing the model to borrow strength by partial pooling across speakers with similar speech profiles. Second, MASS incorporates text as externally given metadata. The project will develop a new approach for joint modeling of text and audio which will incorporate a dynamic topic model into the flow-of-conversation layer of MASS. The investigators will conduct two applications to demonstrate the value of the multi-speaker and joint text-audio modeling extensions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
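The nested process described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the MASS implementation: the two mode names, the logistic form of the utterance-level transition, the single covariate, the two-state phoneme layer, and all parameter values are hypothetical and chosen for readability.

```python
import math
import random

# Hypothetical conversation-level modes (e.g., a speaker's vocal tone).
MODES = ["calm", "tense"]

# Hypothetical lower layer: one small hidden Markov model per mode.
# Each has two hidden "phoneme" states with Gaussian emission means.
HMMS = {
    "calm": {"init": [0.5, 0.5], "trans": [[0.9, 0.1], [0.1, 0.9]], "means": [0.0, 1.0]},
    "tense": {"init": [0.5, 0.5], "trans": [[0.6, 0.4], [0.4, 0.6]], "means": [2.0, 3.0]},
}


def mode_transition_probs(prev_mode, covariate, beta=1.5):
    """Upper layer: the next mode depends on the previous mode and a covariate.

    The logistic link and the coefficients are assumptions for illustration.
    """
    logit = -1.0 + beta * covariate + (1.0 if prev_mode == "tense" else 0.0)
    p_tense = 1.0 / (1.0 + math.exp(-logit))
    return {"calm": 1.0 - p_tense, "tense": p_tense}


def sample_utterance(mode, n_frames, rng):
    """Lower layer: emit one audio feature per frame from the mode's HMM."""
    hmm = HMMS[mode]
    state = rng.choices([0, 1], weights=hmm["init"])[0]
    frames = []
    for _ in range(n_frames):
        frames.append(rng.gauss(hmm["means"][state], 0.5))
        state = rng.choices([0, 1], weights=hmm["trans"][state])[0]
    return frames


def sample_conversation(covariates, n_frames=5, seed=0):
    """Draw one utterance per covariate value, carrying the mode forward."""
    rng = random.Random(seed)
    mode = "calm"  # assumed starting mode
    conversation = []
    for x in covariates:
        probs = mode_transition_probs(mode, x)
        mode = rng.choices(MODES, weights=[probs[m] for m in MODES])[0]
        conversation.append((mode, sample_utterance(mode, n_frames, rng)))
    return conversation
```

Calling `sample_conversation([0.0, 1.0, -1.0])` returns a list of `(mode, frames)` pairs, one per utterance. Fitting such a model to real audio would require inference over the hidden states, which this sketch does not attempt.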
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Christopher Lucas
Form-function mismatches in (formally) definite English noun phrases: Towards a diachronic account
- DOI: 10.1075/la.171.12luc
- Year published: 2011
- Journal:
- Impact factor: 0
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
Toward quantitative forecasts of volcanic ash dispersal: Using satellite retrievals for optimal estimation of source terms
- DOI:
- Year published: 2017
- Journal:
- Impact factor: 0
- Authors: M. Zidikheri; Christopher Lucas; Rodney J. Potts
- Corresponding author: Rodney J. Potts
Contact-induced grammatical change: Towards an explicit account
- DOI: 10.1075/dia.29.3.01luc
- Year published: 2012
- Journal:
- Impact factor: 0.7
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
On Wilmsen on the development of postverbal negation in dialectal Arabic
- DOI: 10.13173/zeitarabling.67.0044
- Year published: 2018
- Journal:
- Impact factor: 1.1
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: Verna Dankers; Christopher Lucas; Ivan Titov
- Corresponding author: Ivan Titov
Other grants by Christopher Lucas
XMaS: The National Material Science Beamline Research Facility at the ESRF
- Award number: EP/Y031164/1
- Fiscal year: 2024
- Funding amount: $249,300
- Project type: Research Grant
Dissecting macrophage regulation of lung epithelial regeneration
- Award number: MR/X019314/1
- Fiscal year: 2023
- Funding amount: $249,300
- Project type: Fellowship
XMaS Capital Equipment Upgrade
- Award number: EP/X035131/1
- Fiscal year: 2023
- Funding amount: $249,300
- Project type: Research Grant
Inflammation in Covid-19: Exploration of Critical Aspects of Pathogenesis (ICECAP)
- Award number: MR/V028790/1
- Fiscal year: 2020
- Funding amount: $249,300
- Project type: Research Grant
XMaS: The UK Materials Science Facility at the ESRF
- Award number: EP/S020802/1
- Fiscal year: 2018
- Funding amount: $249,300
- Project type: Research Grant
Arabic and contact-induced language change
- Award number: AH/P014089/1
- Fiscal year: 2017
- Funding amount: $249,300
- Project type: Fellowship
Developing Electrochemical Structure-Function Relationships in Non-aqueous Electrolytes
- Award number: EP/K002236/1
- Fiscal year: 2012
- Funding amount: $249,300
- Project type: Research Grant
Combined Atomic Imaging and Diffraction Studies of the Electrooxidation of Supported Metal Multilayers
- Award number: EP/G068372/1
- Fiscal year: 2009
- Funding amount: $249,300
- Project type: Research Grant
Atomic-scale Structural Studies of the Electrochemical Interface
- Award number: EP/F036418/1
- Fiscal year: 2008
- Funding amount: $249,300
- Project type: Research Grant
Exploiting XMaS Studies of Highly Correlated Electron Systems, Real Surfaces and Biomaterials
- Award number: EP/F000766/1
- Fiscal year: 2007
- Funding amount: $249,300
- Project type: Research Grant
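Similar grants from the National Natural Science Foundation of China
Time-series InSAR monitoring methods for confined aquifer systems in groundwater over-exploitation areas
- Award number: 42374013
- Year approved: 2023
- Funding amount: CNY 520,000
- Project type: General Program
Deep-learning-based intelligent extended-range forecasting of air-sea coupling in the South China Sea
- Award number: 42375143
- Year approved: 2023
- Funding amount: CNY 500,000
- Project type: General Program
Methods for accurately deconvolving the circulating tumor DNA fraction in peripheral blood sequencing data from liver cancer
- Award number: 62303271
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Methods for identifying and comparing topologically associating domains based on high-order reads
- Award number: 62372156
- Year approved: 2023
- Funding amount: CNY 500,000
- Project type: General Program
Matrix-method-based game analysis and control strategies for electricity pricing
- Award number: 62303170
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund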
Similar overseas grants
Dynamic neural coding of spectro-temporal sound features during free movement
- Award number: 10656110
- Fiscal year: 2023
- Funding amount: $249,300
- Project type:
Beyond dopamine: dual neuromodulator regulation of motor variability and learning
- Award number: 10605853
- Fiscal year: 2023
- Funding amount: $249,300
- Project type:
Machine learning-based methods for the analysis of microbial glycomes and proteomes in inflammatory bowel disease.
- Award number: 10591842
- Fiscal year: 2023
- Funding amount: $249,300
- Project type:
Computational psycholinguistic analysis of speech samples in PPA and AD and FTD
- Award number: 10373191
- Fiscal year: 2022
- Funding amount: $249,300
- Project type:
Computational psycholinguistic analysis of speech samples in PPA and AD and FTD
- Award number: 10563169
- Fiscal year: 2022
- Funding amount: $249,300
- Project type: