Natural Speech Technology
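Basic information
- Grant number: EP/I031022/1
- Principal investigator:
- Amount: $7.946 million
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2011
- Funding country: United Kingdom
- Duration: 2011 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract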
Humans are highly adaptable, and speech is our natural medium for informal communication. When communicating, we continuously adjust to other people, to the situation, and to the environment, using previously acquired knowledge to make this adaptation seem almost instantaneous. Humans generalise, enabling efficient communication in unfamiliar situations and rapid adaptation to new speakers or listeners. Current speech technology works well for certain controlled tasks and domains, but is far from natural, a consequence of its limited ability to acquire knowledge about people or situations, to adapt, and to generalise. This accounts for the uneasy public reaction to speech-driven systems. For example, text-to-speech synthesis can be as intelligible as human speech, but lacks expression and is not perceived as natural. Similarly, the accuracy of speech recognition systems can collapse if the acoustic environment or task domain changes, conditions which a human listener would handle easily. Research approaches to these problems have hitherto been piecemeal and, as a result, progress has been patchy. In contrast, NST will focus on the integrated theoretical development of new joint models for speech recognition and synthesis. These models will allow us to incorporate knowledge about the speakers, the environment, the communication context and awareness of the task, and will learn and adapt from real-world data in an online, unsupervised manner. This theoretical unification is already underway within the NST labs and, combined with our record of turning theory into practical state-of-the-art applications, will enable us to bring a naturalness to speech technology that is not currently attainable.
The NST programme will yield technology which (1) approaches human adaptability to new communication situations, (2) is capable of personalised communication, and (3) takes account of speaker intention and expressiveness in speech recognition and synthesis.
This is an ambitious vision. Its success will be measured in terms of how the theoretical development reshapes the field over the next decade, the take-up of the software systems that we shall develop, and the impact of our exemplar interactive applications.
We shall establish a strong User Group to maximise the impact of the project, with members concerned with clinical applications as well as more general speech technology. Members of the User Group include Toshiba, EADS Innovation Works, Cisco, Barnsley Hospital NHS Foundation Trust, and the Euan MacDonald Centre for MND Research. An important interaction with the User Group will be validating our systems on their data and tasks, discussed at an annual user workshop.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-reference WER for evaluating ASR for languages with no orthographic rule
- DOI:
- Published: 2015
- Journal:
- Impact factor: 0
- Authors: Ali A
- Corresponding author: Ali A
Reactive accent interpolation through an interactive map application
- DOI:
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: Astrinaki M.
- Corresponding author: Astrinaki M.
Transcription of multi-genre media archives using out-of-domain data
- DOI: 10.1109/slt.2012.6424244
- Published: 2012-01-01
- Journal:
- Impact factor: 0
- Authors: Bell, P. J.; Gales, M. J. F.; Woodland, P. C.
- Corresponding author: Woodland, P. C.
A system for automatic alignment of broadcast media captions using weighted finite-state transducers
- DOI: 10.1109/asru.2015.7404861
- Published: 2015-12
- Journal:
- Impact factor: 0
- Authors: P. Bell; S. Renals
- Corresponding author: P. Bell; S. Renals
Multi-level adaptive networks in tandem and hybrid ASR systems
- DOI: 10.1109/icassp.2013.6639014
- Published: 2013-05
- Journal:
- Impact factor: 0
- Authors: P. Bell; P. Swietojanski; S. Renals
- Corresponding author: P. Bell; P. Swietojanski; S. Renals
Other publications by Steve Renals
Are extractive text summarisation techniques portable to broadcast news?
- DOI: 10.1109/asru.2003.1318489
- Published: 2003
- Journal:
- Impact factor: 0
- Authors: Heidi Christensen; Y. Gotoh; B. Kolluru; Steve Renals
- Corresponding author: Steve Renals
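HMM音声合成における変分ベイズ法に基づく線形回帰
Linear regression based on the variational Bayesian method in HMM speech synthesis
- DOI:
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: 橋本佳; 山岸順一; Peter Bell; Simon King; Steve Renals; 徳田恵一
- Corresponding author: 徳田恵一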
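音声の障害患者のための音声合成技術 : Voice banking and reconstruction
Speech synthesis technology for patients with speech disorders: voice banking and reconstruction
- DOI: 10.20697/jasj.67.12_587
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: 山岸 順一; Christophe Veaux; S. King; Steve Renals
- Corresponding author: Steve Renals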
A Statistical Approach to Automatic Speech Summarization
- DOI:
- Published: 2002
- Journal: EURASIP Journal on Applied Signal Processing 2003:2, 128–139
- Impact factor: 0
- Authors: B. Kolluru; Heidi Christensen; Y. Gotoh; Steve Renals
- Corresponding author: Steve Renals
A robust speaker-adaptive HMM-based text-to-speech synthesis
- DOI:
- Published: 2009
- Journal:
- Impact factor: 0
- Authors: Junichi Yamagishi; Takashi Nose; Heiga Zen; Zhenhua Ling; Tomoki Toda; Keiichi Tokuda; Simon King; Steve Renals
- Corresponding author: Steve Renals
Other grants by Steve Renals
Ultrax2020: Ultrasound Technology for Optimising the Treatment of Speech Disorders.
- Grant number: EP/P02338X/1
- Fiscal year: 2017
- Funding amount: $7.946 million
- Project type: Research Grant
Ultrax: Real-time tongue tracking for speech therapy using ultrasound
- Grant number: EP/I027696/1
- Fiscal year: 2011
- Funding amount: $7.946 million
- Project type: Research Grant
MultiMemoHome: Multimodal Reminders Within the Home
- Grant number: EP/G060614/1
- Fiscal year: 2009
- Funding amount: $7.946 million
- Project type: Research Grant
Data-driven articulatory modelling: foundations for a new generation of speech synthesis
- Grant number: EP/E027741/1
- Fiscal year: 2006
- Funding amount: $7.946 million
- Project type: Research Grant
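Similar grants from the National Natural Science Foundation of China
Research on an assessment model for English public-speaking ability that integrates multimodal learning analytics, and its applications
- Grant number: 62107005
- Approval year: 2021
- Funding amount: CNY 240,000
- Project type: Young Scientists Fund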
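Research on an assessment model for English public-speaking ability that integrates multimodal learning analytics, and its applications
- Grant number:
- Approval year: 2021
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund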
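Study of the association between auditory behaviour and the course of speech development in children after cochlear implantation
- Grant number: 81170916
- Approval year: 2011
- Funding amount: CNY 650,000
- Project type: General Program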
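Study of the characteristics of open-set auditory and speech development in children after cochlear implantation
- Grant number: 30872859
- Approval year: 2008
- Funding amount: CNY 300,000
- Project type: General Program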
Similar overseas grants
Connected Language and Speech Along the Spectrum of Alzheimer’s Disease and Related Dementias: Digital Assessment and Monitoring.
- Grant number: 10662754
- Fiscal year: 2023
- Funding amount: $7.946 million
- Project type:
Neural control of speech generation in human motor cortex
- Grant number: 10722067
- Fiscal year: 2023
- Funding amount: $7.946 million
- Project type:
Developing a Personalized and Culturally Responsive Virtual Coach to Engage Persons with Alzheimer's Disease and Related Dementias in Cognitive and Physical Activities
- Grant number: 10699847
- Fiscal year: 2023
- Funding amount: $7.946 million
- Project type:
Identification of Prodromal Neurodegeneration in Serotonergic-Induced REM sleep Behavior Disorder
- Grant number: 10734350
- Fiscal year: 2023
- Funding amount: $7.946 million
- Project type:
Phase III Development of a Valid, Reliable, Clinically Feasible Measure of Transactional Success in Aphasic Conversation: Modernizing Methods of Acquisition and Analysis of Discourse Data
- Grant number: 10617305
- Fiscal year: 2022
- Funding amount: $7.946 million
- Project type: