Speech recognition accepting utterances including out-of-vocabularies

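Basic information

  • Grant number:
    14380168
  • Principal investigator:
  • Amount:
    $89,600
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (B)
  • Fiscal year:
    2002
  • Funding country:
    Japan
  • Project period:
    2002 to 2005
  • Project status:
    Completed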

Project abstract

We studied a speech recognition scheme that accepts utterances containing out-of-vocabulary (OOV) words. We proposed a new hierarchical statistical language model to cope with OOVs and carried out speech recognition experiments to confirm its effectiveness. To handle unregistered expressions, the model describes their word-neighboring characteristics and their constituent phonotactic constraints as statistically independent. The upper layer of the hierarchy consists of inter-word statistics expressed by multi-dimensional composite word N-grams, and the lower layer expresses intra-word statistical phonotactics using multi-dimensional composite sub-word units. A series of speech recognition experiments showed that this language model makes effective use of the independent statistics and achieves high recognition performance for utterances that include OOVs. By extending the lower-layer model from single words, such as personal names and city names, to much longer named entities, such as book titles and movie titles, we showed that the model is also valid for unregistered expressions consisting of multiple words. This result suggests that the proposed language model handles OOVs effectively regardless of the task, and that a task-free statistical language model may be possible by integrating different statistical constraints independently.
In the speech recognition experiments, long unregistered expressions for movie titles were expressed by multi-dimensional composite word N-grams as a lower-layer model. The recognition accuracy of the proposed model came close to the theoretical upper limit obtained by registering all OOVs in the recognition lexicon. Furthermore, multiple Markov models were obtained automatically by splitting OOV characteristics into multiple lower-layer models.
Word-class intrinsic models and automatically derived unsupervised models proved useful for general, unspecified OOVs, which gives a guideline for building statistical language models according to the size and quality of the available language data.
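
The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: it assumes plain bigrams with crude add-one smoothing instead of the multi-dimensional composite N-grams, single characters as stand-ins for sub-word units, and a hypothetical `<OOV>` class token. All names (`BigramModel`, `hierarchical_logprob`, `LEXICON`, `UPPER`, `LOWER`) and the toy training data are invented for illustration.

```python
import math
from collections import Counter

OOV = "<OOV>"  # hypothetical class token standing in for any unregistered word


class BigramModel:
    """Bigram model with crude add-one smoothing (illustrative only)."""

    def __init__(self, sequences):
        self.pair_counts = Counter()
        self.context_counts = Counter()
        self.vocab = set()
        for seq in sequences:
            tokens = ["<s>", *seq, "</s>"]
            for prev, cur in zip(tokens, tokens[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1
                self.vocab.add(cur)

    def seq_logprob(self, seq):
        """Log-probability of a token sequence, including the end marker."""
        tokens = ["<s>", *seq, "</s>"]
        size = len(self.vocab)
        return sum(
            math.log((self.pair_counts[(p, c)] + 1) / (self.context_counts[p] + size))
            for p, c in zip(tokens, tokens[1:])
        )


def hierarchical_logprob(words, lexicon, upper, lower, to_subwords=list):
    """Score a sentence with two independently trained layers.

    Upper layer: inter-word statistics, with each unregistered word
    replaced by the OOV class token.
    Lower layer: intra-word statistics over the sub-word sequence of
    each unregistered word (characters here, as an assumption).
    """
    mapped = [w if w in lexicon else OOV for w in words]
    score = upper.seq_logprob(mapped)
    for w in words:
        if w not in lexicon:
            score += lower.seq_logprob(to_subwords(w))
    return score


# Toy data, invented for this sketch.
LEXICON = {"call", "please", "tom"}
UPPER = BigramModel([["call", OOV, "please"], ["call", "tom", "please"]])
LOWER = BigramModel(["sato", "saito", "kato"])  # strings are split into characters
```

Because both layers are trained separately, the lower-layer model can be swapped (for example, one trained on names for one trained on titles) without retraining the upper layer. With this toy data, `hierarchical_logprob(["call", "sako", "please"], LEXICON, UPPER, LOWER)` should score higher than the same sentence with an arbitrary string such as `"xqzv"`, since both map to the same upper-layer token and differ only in the lower-layer score.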

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Mis-recognized utterance detection using hierarchical language model
Speech recognition of a named entity
  • DOI:
  • Year published:
    2005
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tatsuhiko Tomita; Yoshiyuki Okimoto; Hirofumi Yamamoto; Yoshinori Sagisaka
  • Corresponding author:
    Yoshinori Sagisaka
タスク外語彙を含む音声の認識 (Recognition of speech containing out-of-task vocabulary)
S. Onishi, H. Yamamoto, G. Kikui, Y. Sagisaka: "A statistical word model using word-class specific constraints for handling out-of-vocabulary words in speech recognition," Proceedings of SNLP-Oriental COCOSDA 2002. 37-42 (2002)
  • DOI:
  • Year published:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
Speech recognition of OOV expressions and OOV words

Other grants of SAGISAKA Yoshinori

Quantitative analysis and the modeling of L2 timing control through the comparison with L1 timing characteristics
  • Grant number:
    23320091
  • Fiscal year:
    2011
  • Amount:
    $89,600
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Speech synthesis with communicative prosody driven by the impressions of output lexicons
  • Grant number:
    18300063
  • Fiscal year:
    2006
  • Amount:
    $89,600
  • Project category:
    Grant-in-Aid for Scientific Research (B)