Synthesis of speech in any speaking styles based on corpus-based generation of prosodic features using the generation process model
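Basic information
- Grant number: 17300055
- Principal investigator:
- Amount: $107,900
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (B)
- Fiscal year: 2005
- Funding country: Japan
- Period: 2005 to 2007
- Project status: Completed
- Source:
- Keywords:
Project abstract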
Research was conducted to establish a corpus-based speech synthesis method that is based on the generation process model of fundamental frequency contours and can generate high-quality speech in any speaking style. The original research plan was fulfilled, with the following results:
1. A method was developed to predict the command parameters of the generation process model using binary decision trees, whose inputs include linguistic information obtained by parsing the text, and thus to synthesize fundamental frequency contours. Integrated prosodic control was realized by combining this method with other binary-decision-tree methods that predict pause positions and lengths and phoneme durations. The validity of the method was shown through speech synthesis experiments on various styles, including emotional speech. A method was also developed to automatically extract the command parameters from observed fundamental frequency contours using binary decision trees. Extraction accuracy was shown to increase when linguistic information about the text was included in the inputs of the trees.
2. Binary decision trees were constructed to predict how the phrase and accent commands of utterances with a specific focus deviate from those of utterances without one. Their inputs are the accent types and in-sentence positions of the focused words, and the command values of the corresponding parts of the utterances without a specific focus. Appropriate focus control was realized by modifying the phrase and accent commands predicted by the method in item 1 according to the predicted deviations.
3. A two-step method was developed for generating fundamental frequency contours of Standard Chinese. It first generates phrase components in a corpus-based way, and then generates tone components in a corpus-based way. The method is highly flexible in synthesizing fundamental frequency contours. As an example of flexible control, it was shown that proper focus control could be realized with a simple set of rules.
4. Speech synthesis systems were constructed for Japanese and Chinese by integrating the methods developed in items 1 and 2 with HMM speech synthesis. It was shown that these systems produce synthetic speech with higher naturalness than a "full" HMM synthesizer, in which prosodic control is also done within the HMM framework. It was also shown that the systems can produce synthetic speech in various styles.
5. Spoken dialogue systems for road guidance and TV program guidance were constructed using the above speech synthesis systems. The validity of the developed speech synthesis method was demonstrated through experiments on controlling the speaking style of reply speech according to the user's characteristics and situation.
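As a rough illustration of what the command parameters mentioned in result 1 of the abstract control: the generation process model of fundamental frequency contours (commonly known as the Fujisaki model) represents ln F0 as a baseline value plus phrase components (responses to impulse-like phrase commands) and accent components (responses to stepwise accent commands). The Python sketch below synthesizes a contour from a hand-written set of commands. It is a minimal sketch under stated assumptions, not the project's implementation: the function names, the constants alpha = 3.0, beta = 20.0 and gamma = 0.9 (values often quoted for the model), and every command value are chosen here for illustration only. In the project, the commands would be predicted by binary decision trees rather than written by hand.

```python
import math


def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase control mechanism (0 before the command)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrase_cmds, accent_cmds):
    """ln F0 at time t (seconds).

    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    value = math.log(fb)  # baseline frequency
    for t0, ap in phrase_cmds:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accent_cmds:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def f0_contour(fb, phrase_cmds, accent_cmds, duration, step=0.01):
    """Sample F0 (Hz) every `step` seconds from 0 to `duration` inclusive."""
    n = int(round(duration / step)) + 1
    return [math.exp(log_f0(i * step, fb, phrase_cmds, accent_cmds))
            for i in range(n)]


# Hypothetical command values, chosen only for illustration.
contour = f0_contour(
    fb=100.0,
    phrase_cmds=[(0.0, 0.5)],
    accent_cmds=[(0.3, 0.7, 0.4)],
    duration=1.5,
)
```

Here `contour` holds F0 samples in Hz at 10 ms intervals. Changing a phrase magnitude or an accent amplitude changes the contour locally, which is the kind of adjustment the focus control in result 2 applies to the predicted commands.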
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Estimation of intonation variation with constrained tone transformations
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: Keikichi Hirose; Yusuke Furuyama; Nobuaki Minematsu; Keikichi Hirose; Keikichi Hirose; 広瀬啓吉; Keikichi Hirose; Keikichi Hirose; Quinghua Sun; Jinfu Ni
- Corresponding author: Jinfu Ni
日本語テキスト音声合成用アクセント結合規則の改良
Improvement of accent concatenation rules for Japanese text-to-speech synthesis
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: Keikichi Hirose; Yusuke Furuyama; Nobuaki Minematsu; Keikichi Hirose; Keikichi Hirose; 広瀬啓吉; Keikichi Hirose; Keikichi Hirose; Quinghua Sun; Jinfu Ni; Keikichi Hirose; 黒岩 龍
- Corresponding author: 黒岩 龍
Constrained tone transformation technique for separation and combination of Mandarin tone and intonation
- DOI:
- Publication year: 2006
- Journal:
- Impact factor: 0
- Authors: Corinne Touati; Atsushi Inoie; Hisao Kameda; H.Kameda; Jinfu Ni
- Corresponding author: Jinfu Ni
Corpus-based extraction of F_0 contour generation process model parameters
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: Keikichi Hirose; Yasufumi Asano; Nobuaki Minematsu; Jinfu Ni; Wentao Gu; Keikichi Hirose; Qinghua Sun; Keikichi Hirose; 越智景子; Keikichi Hirose; Jinfu Ni; Quinghua Sun; 広瀬 啓吉; 浅野 泰史; 河村 美由紀; 孫慶華; Keikichi Hirose; Keikichi Hirose
- Corresponding author: Keikichi Hirose
Prosody in spoken language technologies (Special Lecture)
- DOI:
- Publication year: 2007
- Journal:
- Impact factor: 0
- Authors: Keikichi Hirose; Qinghua Sun; Nobuaki Minematsu; 八木 裕司; Keikichi Hirose; Keikichi Hirose; Keikichi Hirose
- Corresponding author: Keikichi Hirose
Other grants by HIROSE Keikichi
Pronunciation education system based on the systematization of non-mother tongue speech prosody using generation process model and speech synthesis
- Grant number: 24652115
- Fiscal year: 2012
- Funding amount: $107,900
- Project category: Grant-in-Aid for Challenging Exploratory Research
Advanced method of prosody control in statistical-based speech synthesis using generation process model of fundamental frequency contours
- Grant number: 24300068
- Fiscal year: 2012
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Expressive Multi-language Speech Synthesis Based on the Generation Process Model and Its Use for Automatic Speech Translation
- Grant number: 21300061
- Fiscal year: 2009
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
High-quality Speech Synthesis based on Accurate Analysis Method and Statistical Method
- Grant number: 12480079
- Fiscal year: 2000
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Naturally Sounding Speech Synthesis and Recognition Based on the Formulation of Prosody
- Grant number: 09480061
- Fiscal year: 1997
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Development of Spoken Dialogue System for Japanese and Chinese
- Grant number: 08558028
- Fiscal year: 1996
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (A)
Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition
- Grant number: 06452397
- Fiscal year: 1994
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Rule-Synthesis of Spoken Sentences for the Speech Dialogue Systems
- Grant number: 03452288
- Fiscal year: 1991
- Funding amount: $107,900
- Project category: Grant-in-Aid for General Scientific Research (B)
Development of Output System of Announcing Speech with Input of Kanji-Kana Sentences
- Grant number: 01850073
- Fiscal year: 1989
- Funding amount: $107,900
- Project category: Grant-in-Aid for Developmental Scientific Research (B)
Similar overseas grants
High-quality Speech Synthesis based on Accurate Analysis Method and Statistical Method
- Grant number: 12480079
- Fiscal year: 2000
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Automatic Estimation of Fundamental Frequency Contour Parameters and Automatic Acquisition of Generative rules
- Grant number: 11480090
- Fiscal year: 1999
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
A System for Rule Synthesis of Prosodic Features of Speech of Multiple Language Based on a Generative Model of Fundamental Frequency Contours
- Grant number: 08458090
- Fiscal year: 1996
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Studies on Feature Extraction and Discrimination of Spoken Languages.
- Grant number: 07458065
- Fiscal year: 1995
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)
Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition
- Grant number: 06452397
- Fiscal year: 1994
- Funding amount: $107,900
- Project category: Grant-in-Aid for Scientific Research (B)