Data-driven articulatory modelling: foundations for a new generation of speech synthesis

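Basic information

Grant number:
EP/E027741/1
Principal investigator:
Steve Renals
Amount:
$365,500
Host institution:
University of Edinburgh
Host institution country:
United Kingdom
Project type:
Research Grant
Fiscal year:
2006
Funding country:
United Kingdom
Start and end dates:
2006 to (no data)
Project status:
Completed

Source:
https://gtr.ukri.org/projects?ref=EP%2FE027741%2F1
Keywords:
Data driven articulatory modelling foundations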

Project abstract

Technology to automatically generate artificial speech (speech synthesis) has come to sound natural enough within the past five years that its use has widened dramatically. Leaders in industry have integrated text-to-speech (TTS) systems into useful real-world applications, such as automated call-centres and call routing, telephone-based information systems (e.g. telephone banking or news services), readers for the visually impaired, and hands-free interfaces, such as car navigation systems.

However, in spite of this success, state-of-the-art TTS systems are still severely limited in terms of control. In short, we can readily control what synthesisers say, but not how they say it. Therefore, although such systems are suitable for giving factual information in speech form, they are completely inadequate where a high level of expressiveness is required. By expressiveness we mean the ability to indicate questions or emphasis on selected words, or to convey emotion. Furthermore, the process of generating new synthetic voices is costly and labour-intensive. It is the aim of this project to develop an alternative to current speech synthesis technology with a comparable level of intelligibility and naturalness, but which affords far greater flexibility and control.

Unit selection uses large collections of pre-recorded speech to perform synthesis by merely gluing together appropriate fragments in sequence. There is in effect little or no modelling of speech involved. In contrast, this project aims to develop a new model which is trained on pre-recorded speech and interprets it in a novel way: on the basis of its underlying articulation. The aim of this model is to produce synthetic speech which not only retains the qualities of the original speech used for training, but which also is much more versatile and therefore has the potential to be used in new and exciting ways.

Project outputs

Journal articles (8)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Predicting tongue shapes from a few landmark locations

DOI:
Published:
2008
Journal:
Impact factor:
0
Authors:
C Qin
Corresponding author:
C Qin

Preliminary inversion mapping results with a new EMA corpus

DOI:
10.21437/interspeech.2009-724
Published:
2009
Journal:
Impact factor:
0
Authors:
Korin Richmond
Corresponding author:
Korin Richmond

Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory Corpus

DOI:
10.21437/interspeech.2011-316
Published:
2011-08
Journal:
Impact factor:
0
Authors:
Korin Richmond; P. Hoole; Simon King
Corresponding author:
Korin Richmond; P. Hoole; Simon King

Glottal spectral separation for parametric speech synthesis

DOI:
10.21437/interspeech.2008-176
Published:
2008-09
Journal:
Speech Commun.
Impact factor:
0
Authors:
João P. Cabral; S. Renals; Korin Richmond; J. Yamagishi
Corresponding author:
João P. Cabral; S. Renals; Korin Richmond; J. Yamagishi

HMM-based speech synthesiser using the LF-model of the glottal source

DOI:
10.1109/icassp.2011.5947405
Published:
2011-05
Journal:
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Impact factor:
0
Authors:
João P. Cabral; S. Renals; J. Yamagishi; Korin Richmond
Corresponding author:
João P. Cabral; S. Renals; J. Yamagishi; Korin Richmond

Other publications by Steve Renals

Are extractive text summarisation techniques portable to broadcast news?

DOI:
10.1109/asru.2003.1318489
Published:
2003
Journal:
2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721)
Impact factor:
0
Authors:
Heidi Christensen; Y. Gotoh; B. Kolluru; Steve Renals
Corresponding author:
Steve Renals

HMM音声合成における変分ベイズ法に基づく線形回帰

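Linear regression based on the variational Bayes method in HMM speech synthesis

DOI:
Published:
2012
Journal:
Impact factor:
0
Authors:
橋本佳; 山岸順一; Peter Bell; Simon King; Steve Renals; 徳田恵一
Corresponding author:
徳田恵一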

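音声の障害患者のための音声合成技術 : Voice banking and reconstruction

Speech synthesis technology for patients with speech disorders: voice banking and reconstruction

DOI:
10.20697/jasj.67.12_587
Published:
2011
Journal:
Impact factor:
0
Authors:
山岸順一; Christophe Veaux; S. King; Steve Renals
Corresponding author:
Steve Renals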

A Statistical Approach to Automatic Speech Summarization (EURASIP Journal on Applied Signal Processing 2003:2, 128–139, © 2003 Hindawi Publishing Corporation)

DOI:
Published:
2002
Journal:
Impact factor:
0
Authors:
B. Kolluru; Heidi Christensen; Y. Gotoh; Steve Renals
Corresponding author:
Steve Renals

A robust speaker-adaptive HMM-based text-to-speech synthesis

DOI:
Published:
2009
Journal:
IEEE Trans. on Audio, Speech, and Language Processing, vol. 17, no. 6
Impact factor:
0
Authors:
Junichi Yamagishi; Takashi Nose; Heiga Zen; Zhenhua Ling; Tomoki Toda; Keiichi Tokuda; Simon King; Steve Renals
Corresponding author:
Steve Renals