Universal-Phonetic-Segment-Based Speech Coding and Its Applications to Speech Processing

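Basic Information

Grant number:
15300026
Principal investigator:
TANAKA Kazuyo
Amount:
$105,600
Host institution:
University of Tsukuba
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (B)
Fiscal year:
2003
Funding country:
Japan
Period:
2003 to 2005
Project status:
Completed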

Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-15300026/
Keywords:
speech recognition; spoken document retrieval; phonetic code; IPA; dynamic programming; phone model; multilingual; open vocabulary; universal phonetic code (汎用音声符号); speech acoustic model (音声音響モデル); speech summarization (音声要約)

Project Summary

In this project, we present a novel speech processing framework in which all acoustic speech samples are first encoded into universal phonetic segment (UPS) sequences, and spoken document processing (SDP) systems, such as recognition, retrieval, and indexing, are constructed on this UPS domain. Under this framework, the SDP systems are decoupled from the original acoustic correlates and environments. This gives the flexibility that recognition-type processing can be handled simply by calculating distances between UPS sequences, and that the systems can be built on distributed processing schemes.

Through this project, we developed the following component techniques for this framework: 1) an original fine sub-phonetic segment (SPS) set used as the UPS set, which yielded high-performance recognition and easy processing of multilingual speech; 2) effective DP (dynamic programming)-based sequence matching algorithms, called Shift CDP and Relay CDP. The effectiveness of the processing framework, the SPS set, and the DP-based algorithms was evaluated by constructing speech recognition and open-vocabulary spoken document retrieval (SDR) systems. Experimental results showed that the proposed SDP systems outperform those based on conventional methods. Finally, we constructed a real-time open-vocabulary SDR system for demonstration, which retrieves broadcast video from the user's spoken queries.
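As a rough illustration of what "calculating distances between UPS sequences" could look like, the sketch below uses plain Levenshtein distance over symbol sequences, plus a start-free, end-free DP variant that spots a query inside a longer sequence. This is a generic textbook technique, not the project's Shift CDP or Relay CDP algorithms, whose details are not given here. The function names, the unit costs, and the symbol labels are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences (unit costs, an assumption)."""
    prev = list(range(len(b) + 1))  # cost of aligning an empty prefix of a with b[:j]
    for i, x in enumerate(a, start=1):
        curr = [i]  # cost of aligning a[:i] with an empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def spot_query(query: Sequence[str], document: Sequence[str]) -> Tuple[int, int]:
    """Find the region of `document` closest to `query`.

    Returns (best_distance, end_index), where end_index is the exclusive
    end position of the best-matching region. The match may start and end
    anywhere in the document (start-free, end-free DP).
    """
    prev = [0] * (len(document) + 1)  # an empty query matches anywhere at cost 0
    for i, q in enumerate(query, start=1):
        curr = [i]
        for j, d in enumerate(document, start=1):
            curr.append(min(
                prev[j] + 1,
                curr[j - 1] + 1,
                prev[j - 1] + (q != d),
            ))
        prev = curr
    best_end = min(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end


if __name__ == "__main__":
    # Hypothetical segment labels; a real UPS/SPS inventory would differ.
    doc = ["s", "i", "l", "k", "a", "t", "o", "s", "i", "l"]
    print(edit_distance(["k", "a", "t", "o"], ["k", "a", "d", "o"]))
    print(spot_query(["k", "a", "t", "o"], doc))
```

Because both functions operate only on symbol sequences, they need no access to the original audio, which is the kind of separation from the acoustic domain that the summary describes.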

Project Outputs

Journal articles (87)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

HMM-based noise-robust feature compensation

DOI:
10.1016/j.specom.2006.03.002
Publication date:
2006-09
Journal:
Speech Commun.
Impact factor:
0
Authors:
A. Sasou; F. Asano; Satoshi Nakamura; Kazuyo Tanaka
Corresponding author:
A. Sasou; F. Asano; Satoshi Nakamura; Kazuyo Tanaka

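Similar-part extraction for analyzing the structure of streaming data

DOI:
Publication date:
2004
Journal:
Proc. of 5th European Conference on Machine Learning (ECML2004) 1
Impact factor:
0
Authors:
Itoh, Y.; Tanaka, K.; Lee, S.W.
Corresponding author:
Lee, S.W.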

HMM-Based Feature Compensation Method: An Evaluation Using the AURORA2

DOI:
Publication date:
2004
Journal:
Proc. of International Conference on Spoken Language Processing (ICSLP2004) 1
Impact factor:
0
Authors:
Sasou, A.; Asano, F.; Tanaka, K.; Nakamura, S.
Corresponding author:
Nakamura, S.

音声工学 (Speech Engineering)

DOI:
Publication date:
2005
Journal:
Impact factor:
0
Authors:
Akira Sasou; Futoshi Asano; Satoshi Nakamura; Kazuyo Tanaka; Kazuyo Narita; 坂本雄児; 板橋秀一
Corresponding author:
板橋秀一

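音素片のカーネル主成分分析を用いたトピックセグメンテーション (Topic segmentation using kernel principal component analysis of sub-phonetic segments)

DOI:
Publication date:
2005
Journal:
人工知能学会 (Japanese Society for Artificial Intelligence) 1E2-03
Impact factor:
0
Authors:
佐土原健; 児島宏明; 李時旭
Corresponding author:
李時旭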

Other grants by TANAKA Kazuyo

Development of Continuous Voice Morphing Using Separated Vocal Tract Area Functions, Glottal Source Waves, and Prosodic Features

Grant number:
22500145
Fiscal year:
2010
Funding amount:
$105,600
Project category:
Grant-in-Aid for Scientific Research (C)

Real-time, multiple-degree-of-freedom electric hand system based on electromyogram signals

Grant number:
19500377
Fiscal year:
2007
Funding amount:
$105,600
Project category:
Grant-in-Aid for Scientific Research (C)

Similar overseas grants

Studies of speech, image and natural language processing for multimodal spoken document retrieval

Grant number:
23K11216
Fiscal year:
2023
Funding amount:
$105,600
Project category:
Grant-in-Aid for Scientific Research (C)

Improvement of Spoken Term Detection Technique and its Application to Speech Recognition and Spoken Document Retrieval

Grant number:
23700111
Fiscal year:
2011
Funding amount:
$105,600
Project category:
Grant-in-Aid for Young Scientists (B)
