Exemplar-based Expressive Speech Synthesis
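Basic information
- Grant number: EP/V046772/1
- Principal investigator:
- Amount: $278,100
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2021
- Funding country: United Kingdom
- Duration: 2021 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract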
Synthetic voices are becoming ubiquitous: 'smart' speakers at home, announcement systems on public transport, and voice-enabled assistants on call lines. There is strong public demand for 'smarter' assistants capable of laughing at our jokes; interacting with our children as encouraging and emphatic tutors; calling to check up on our parents; providing a reassuring 'ear' for an isolated person; and offering calming and supportive virtual therapy. To support current and future applications, voice synthesis technology needs to satisfy a number of requirements. First, it needs to be customisable for rapid research and development; second, it needs to be able to produce any spoken content, including expressive voice characteristics. However, no current synthesis technology satisfies all of these requirements simultaneously. For instance, current non-machine-learning approaches allow pre-recorded phrases to be combined efficiently into complete sentences, but any phrase that is missing must be recorded first, which limits their flexibility and efficiency. Current machine learning models, on the other hand, can seamlessly synthesise any spoken content, but creating such models is costly, time-consuming and computationally demanding. Furthermore, these models offer very limited control over voice characteristics and lack interpretability, even though controllability and interpretability are highly desirable in both research and commercial settings.
The objective of this project is to develop a computationally efficient, customisable, expressive and interpretable approach to speech synthesis by drawing on the concept of 'exemplars' from cognitive science.
In cognitive science, the notions of 'exemplars' and 'prototypes' form part of a prominent view of how humans categorise concepts. In particular, exemplar theory argues that singular examples, rather than prototypes (an average of examples), form the basic building blocks of how we understand and interact with the world. The key argument in favour of exemplar theory is our ability as humans to solve complex tasks based on just a few examples, which makes the theory appealing for applications that involve complex phenomena or that require high computational efficiency. Furthermore, expressive speech synthesis combines expressivity and speech production, two complex phenomena that remain poorly understood. Unlike prototype theory, exemplar theory makes it possible, at least in theory, to produce expressive speech provided that at least one recording of the desired spoken content and one recording featuring the desired expressivity are available. Lastly, adopting exemplar theory promotes transparency in the decision-making process through the use of real examples that can be inspected, modified, replaced, added, etc. within the task.
The objective will be achieved through three innovative means: i) formulating a methodological framework for exemplar-based speech synthesis, ii) building an exemplar-based representation for speech expressivity from pre-recorded examples, and iii) presenting a novel methodology for integrating this expressivity representation into the framework of i).
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Anton Ragni
Adapting Pretrained Models for Adult to Child Voice Conversion
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 0
- Authors: Protima Nomo Sudro; Anton Ragni; Thomas Hain
- Corresponding author: Thomas Hain
Training Data Augmentation for Dysarthric Automatic Speech Recognition by Text-to-Dysarthric-Speech Synthesis
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Wing; Mattias Cross; Anton Ragni; Stefan Goetze
- Corresponding author: Stefan Goetze
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models
- DOI: 10.48550/arxiv.2401.13611
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Rhiannon Mogridge; George Close; Robert Sutherland; Thomas Hain; Jon Barker; Stefan Goetze; Anton Ragni
- Corresponding author: Anton Ragni
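Similar grants from the National Natural Science Foundation of China
Pathogenic role and mechanism of TSPYL1 mutations in sudden infant death syndrome, studied using a human serotonergic neuron reporter system
- Grant number: 82371176
- Approval year: 2023
- Funding amount: 490,000 yuan
- Project category: General Program
Smart-city-oriented "human-environment" digital interconnection mechanisms based on visual representations of street views
- Grant number: 52308015
- Approval year: 2023
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund
Construction of photonic crystals based on upconversion luminescent microspheres and the multiple regulation mechanisms of their angle-dependent luminescence properties
- Grant number: 22308200
- Approval year: 2023
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund
Metathesis synthesis, assembly and properties of confined azo homopolymers based on cooperative non-covalent interactions
- Grant number: 22361006
- Approval year: 2023
- Funding amount: 320,000 yuan
- Project category: Regional Science Fund
Control of segregation behaviour in tin bronze during back-pressure thixotropic backward extrusion, based on grain refinement by severe plastic deformation
- Grant number: 52365047
- Approval year: 2023
- Funding amount: 320,000 yuan
- Project category: Regional Science Fund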
Similar overseas grants
Writing to Heal: Developing an Internet-Based Cognitive-Behavioral Writing Intervention for Alzheimer’s Disease Spousal Caregivers
- Grant number: 10449918
- Fiscal year: 2022
- Funding amount: $278,100
- Project category:
Writing to Heal: Developing an Internet-Based Cognitive-Behavioral Writing Intervention for Alzheimer’s Disease Spousal Caregivers
- Grant number: 10640904
- Fiscal year: 2022
- Funding amount: $278,100
- Project category:
Next-Generation Expressive Personalized Voices for Speech-Generating Devices
- Grant number: 10547241
- Fiscal year: 2022
- Funding amount: $278,100
- Project category:
Efficacy of a Trauma Intervention for Affect Regulation, Adherence, and Substance Use to Optimize PrEP for Women Who Inject Drugs
- Grant number: 10425426
- Fiscal year: 2021
- Funding amount: $278,100
- Project category:
Efficacy of a Trauma Intervention for Affect Regulation, Adherence, and Substance Use to Optimize PrEP for Women Who Inject Drugs
- Grant number: 10614581
- Fiscal year: 2021
- Funding amount: $278,100
- Project category: