RI: Small: Integrative, Semantic-Aware, Speech-Driven Models for Believable Conversational Agents with Meaningful Behaviors
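Basic information
- Grant number: 1718944
- Principal investigator:
- Amount: $494,100
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-09-01 to 2022-08-31
- Project status: Completed
- Source:
- Keywords: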
Project abstract
This project will analyze, model, and synthesize human behaviors to create a believable Conversational Agent (CA). A CA is a virtual agent that interacts with a user, displaying human-like behaviors not only through speech but also through facial expressions and head movements. Replicating or representing human behavior includes generating gestures that are synchronized with speech, convey appropriate meaning in the message, and respond to the behaviors displayed by the user. An appealing approach to synthesizing human-like behaviors is the use of data-driven methods, which have the potential to capture naturalistic variations of the behaviors. Modeling the dependencies between speech and gestures brings insights into verbal and nonverbal communication, underlying the production and coordination mechanisms used during natural human interactions. CAs can be used in a variety of health care applications, such as helping hearing-impaired individuals and teaching social skills to autistic children. Tutoring systems that display human-like behaviors to communicate and acknowledge active listening will engage better with students, helping them learn. The project promises a fertile ground for interdisciplinary training of graduate and undergraduate students. The models will be evaluated with an assistive agent (CA or embodied robot) interacting with UT Dallas students, serving as a platform to reach out to students from all majors, especially women and underrepresented minorities.
The project will take an integrative, cross-disciplinary approach to generate believable and meaningful behaviors by exploring the intrinsic relation between speech, head motion, and facial expressions, constrained by important aspects of spoken language. The planned research leverages some of the latest developments in the field of deep learning in an integrative fashion, pulling together acoustic features and semantic language structure, to build models that are able to account for the correlation between various facial and head movements. The speech-driven approach will capture the variability of human behavior in a manner that is not easily possible with rule-based approaches. Dialog acts and emotions will be inferred and used to constrain the speech-driven models, capturing the relation between high-level conversational functions and facial gestures. The project will offer novel, principled methods to generate behaviors driven by synthesized speech, opening new application domains when only text is available. The approach will capture the acoustic variability in synthesized speech while maintaining the temporal dependency between gestures and speech. The project will also explore schemes to modify the behaviors of the user by displaying carefully designed gestures generated with our data-driven framework. By tracking the behaviors of the user, the system will provide appropriate responses, closing the loop in the interaction.
Project outcomes
Journal articles (19)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Meaningful head movements driven by emotional synthetic speech
- DOI: 10.1016/j.specom.2017.07.004
- Published: 2017-12
- Journal:
- Impact factor: 3.2
- Authors: Sadoughi, Najmeh; Liu, Yang; Busso, Carlos
- Corresponding author: Busso, Carlos
Speech-Driven Expressive Talking Lips with Conditional Sequential Generative Adversarial Networks
- DOI: 10.1109/taffc.2019.2916031
- Published: 2018-06-01
- Journal:
- Impact factor: 11.2
- Authors: Najmeh Sadoughi; C. Busso
- Corresponding author: C. Busso
Expressive Speech-Driven Lip Movements with Multitask Learning
- DOI: 10.1109/fg.2018.00066
- Published: 2018-05-15
- Journal:
- Impact factor: 0
- Authors: Najmeh Sadoughi; C. Busso
- Corresponding author: C. Busso
End-to-End Audiovisual Speech Recognition System With Multitask Learning
- DOI: 10.1109/tmm.2020.2975922
- Published: 2024-09-13
- Journal:
- Impact factor: 7.3
- Authors: Fei Tao; C. Busso
- Corresponding author: C. Busso
Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory
- DOI: 10.1007/978-3-319-67401-8_49
- Published: 2017-08-27
- Journal:
- Impact factor: 0
- Authors: Najmeh Sadoughi; C. Busso
- Corresponding author: C. Busso
Other publications by Carlos Busso
Understanding Bias in Multispectral Autofluorescence Lifetime Imaging: Are Models Sensitive to Oral Location?
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Kayla Caughlin; Rodrigo Cuenca Martinez; Gabriel P. Tortorelli; Dds Kathleen E. Higgins; Dds Ronald Faram; Javier A. Jo; Carlos Busso
- Corresponding author: Carlos Busso
Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition
- DOI: 10.1109/icassp48485.2024.10447060
- Published: 2024-01-19
- Journal:
- Impact factor: 0
- Authors: Ismail Rasim Ulgen; Zongyang Du; Carlos Busso; Berrak Sisman
- Corresponding author: Berrak Sisman
Driver Head Pose Estimation with Multimodal Temporal Fusion of Color and Depth Modeling Networks
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Susmitha Gogineni; Carlos Busso
- Corresponding author: Carlos Busso
Speech Emotion Recognition in Real Static and Dynamic Human-Robot Interaction Scenarios
- DOI: 10.1016/j.csl.2024.101666
- Published: 2024-05-01
- Journal:
- Impact factor: 0
- Authors: Nicolás Grágeda; Carlos Busso; Eduardo Alvarado; Ricardo García; R. Mahú; F. Huenupán; N. B. Yoma
- Corresponding author: N. B. Yoma
MSP-DISK: Naturalistic and Diverse In-Vehicle Database for Joint Pose and Seat Belt Detection
- DOI: 10.1109/itsc57777.2023.10422221
- Published: 2023-09-24
- Journal:
- Impact factor: 0
- Authors: Isaac Brooks; Susmitha Gogineni; Sumit Kumar Jha; Soumitry J. Ray; Rajesh Narasimha; N. Al; Carlos Busso
- Corresponding author: Carlos Busso
Other grants held by Carlos Busso
CCRI: Medium: MSP-Podcast: Creating The Largest Speech Emotional Database By Leveraging Existing Naturalistic Recordings
- Grant number: 2016719
- Fiscal year: 2020
- Amount: $494,100
- Project type: Standard Grant
CRI: CI-P: Creating the Largest Speech Emotional Database by Leveraging Existing Naturalistic Recordings
- Grant number: 1823166
- Fiscal year: 2018
- Amount: $494,100
- Project type: Standard Grant
FG 2015 Doctoral Consortium: Travel Support for Graduate Students
- Grant number: 1540944
- Fiscal year: 2015
- Amount: $494,100
- Project type: Standard Grant
CAREER: Advanced Knowledge Extraction of Affective Behaviors During Natural Human Interaction
- Grant number: 1453781
- Fiscal year: 2015
- Amount: $494,100
- Project type: Continuing Grant
WORKSHOP: Doctoral Consortium for the International Conference on Multimodal Interaction (ICMI 2013)
- Grant number: 1346655
- Fiscal year: 2013
- Amount: $494,100
- Project type: Standard Grant
EAGER: Exploring the Use of Synthetic Speech as Reference Model to Detect Salient Emotional Segments in Speech
- Grant number: 1329659
- Fiscal year: 2013
- Amount: $494,100
- Project type: Standard Grant
RI: Small: Collaborative Research: Exploring Audiovisual Emotion Perception using Data-Driven Computational Modeling
- Grant number: 1217104
- Fiscal year: 2012
- Amount: $494,100
- Project type: Continuing Grant
Workshop: Doctoral Consortium at the 14th International Conference on Multimodal Interaction
- Grant number: 1249319
- Fiscal year: 2012
- Amount: $494,100
- Project type: Standard Grant
Similar National Natural Science Foundation of China (NSFC) grants
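Regulatory role and mechanism of ALKBH5-mediated SOCS3 m6A demethylation in the inflammatory activation of microglia after traumatic brain injury
- Grant number: 82301557
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund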
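Functional study of miPEP, a small peptide encoded by miRNA precursors, in the resistance of grape to low-temperature stress
- Grant number:
- Approval year: 2023
- Amount: CNY 500,000
- Project type: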
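Mechanism by which PKM2 SUMOylation regulates the drug-resistance niche mediated by non-small cell lung cancer initiating cells
- Grant number: 82372852
- Approval year: 2023
- Amount: CNY 490,000
- Project type: General Program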
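A translatomics-based study of the mechanism by which the LncRNA H19-encoded peptide PELRM promotes microglial activation and mediates the relief of postoperative knee pain by contralateral (Juci) electroacupuncture
- Grant number: 82305399
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund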
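Role and mechanism of the CLDN6-high tumor cell subpopulation in the development of resistance to ICB therapy in non-small cell lung cancer
- Grant number: 82373364
- Approval year: 2023
- Amount: CNY 490,000
- Project type: General Program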
Similar overseas grants
Multivariate Statistics and Machine Learning for Quality Control of Dried Ocimum Products
- Grant number: 10676412
- Fiscal year: 2023
- Amount: $494,100
- Project type:
Implementing a coupled system of integrative ML modeling and data validation for elucidating microglial therapeutic targets in neurodegenerative disease
- Grant number: 10699794
- Fiscal year: 2023
- Amount: $494,100
- Project type:
Integrative Approaches for Characterising Small-Molecule Binding to Disordered Proteins
- Grant number: BB/X009955/1
- Fiscal year: 2023
- Amount: $494,100
- Project type: Fellowship
Digital clinical hypnosis for chronic pain management
- Grant number: 10696872
- Fiscal year: 2023
- Amount: $494,100
- Project type:
Integrative Multiomics to Uncover Novel Genes and Networks in Pulmonary Arterial Hypertension
- Grant number: 10723950
- Fiscal year: 2023
- Amount: $494,100
- Project type: