Augmented speech communication using multi-modal signals with real-time, low-latency voice conversion
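Basic information
- Grant number: 22KJ1519
- Principal investigator:
- Amount: $14,100
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for JSPS Fellows
- Fiscal year: 2023
- Funding country: Japan
- Project period: 2023-03-08 to 2024-03-31
- Project status: Completed
- Source:
- Keywords:
Project abstract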
The purpose of this research is to apply voice conversion (VC) to realize an interactive speech production paradigm for real-world applications, with the help of multimodal signals and real-time processing techniques. In the second year, the applicant focused on three aspects. (1) Continued improvement of fundamental VC techniques, specifically self-supervised speech representation (S3R)-based VC, an emerging trend that reduces training data requirements. The applicant continued to update S3PRL-VC, an open-source toolkit for researchers to evaluate S3R models for VC, and published the latest experimental results in the IEEE Journal of Selected Topics in Signal Processing. (2) Foreign accent conversion, a task that reduces foreign accents for more efficient communication. A paper that provides a unified evaluation of current approaches and identifies unsolved problems has been submitted to an international conference and is currently under review. (3) Singing voice conversion, a fundamental technique that has the potential to augment human communication ability. The applicant is running a scientific event named the Singing Voice Conversion Challenge 2023, which aims to provide a unified experimental setting, including task and dataset, in order to attract researchers worldwide to look into this problem and explore the limitations of state-of-the-art techniques.
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion
- DOI: 10.21437/interspeech.2021-208
- Publication date: 2021-06
- Journal:
- Impact factor: 0
- Authors: Wen-Chin Huang; Kazuhiro Kobayashi; Yu-Huai Peng; Ching-Feng Liu; Yu Tsao; Hsin-Min Wang; T. Toda
- Corresponding authors: Wen-Chin Huang; Kazuhiro Kobayashi; Yu-Huai Peng; Ching-Feng Liu; Yu Tsao; Hsin-Min Wang; T. Toda
CRANK: an Open-Source Software for Nonparallel Voice Conversion based on Vector-Quantized Variational Autoencoder
- DOI:
- Publication date: 2021
- Journal:
- Impact factor: 0
- Authors: Kazuhiro Kobayashi; Wen-Chin Huang; Yi-Chiao Wu; Patrick Tobing; Tomoki Hayashi; Tomoki Toda
- Corresponding author: Tomoki Toda
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations
- DOI: 10.1109/icassp43922.2022.9746430
- Publication date: 2021-10
- Journal:
- Impact factor: 0
- Authors: Wen-Chin Huang; Shu-Wen Yang; Tomoki Hayashi; Hung-yi Lee; Shinji Watanabe; T. Toda
- Corresponding authors: Wen-Chin Huang; Shu-Wen Yang; Tomoki Hayashi; Hung-yi Lee; Shinji Watanabe; T. Toda
On Prosody Modeling for ASR+TTS Based Voice Conversion
- DOI: 10.1109/asru51503.2021.9688010
- Publication date: 2021-07
- Journal:
- Impact factor: 0
- Authors: Wen-Chin Huang; Tomoki Hayashi; Xinjian Li; Shinji Watanabe; T. Toda
- Corresponding authors: Wen-Chin Huang; Tomoki Hayashi; Xinjian Li; Shinji Watanabe; T. Toda
Similar overseas grants
A data-saving and self-supervised deep learning system for continuous ischemic stroke assessment
- Grant number: 24K15011
- Fiscal year: 2024
- Funding amount: $14,100
- Project category: Grant-in-Aid for Scientific Research (C)
Self-supervised feature learning for rapid processing of marine imagery
- Grant number: LP220200949
- Fiscal year: 2023
- Funding amount: $14,100
- Project category: Linkage Projects
Cognitively engaging walking exercise and neuromodulation to enhance brain function in older adults
- Grant number: 10635832
- Fiscal year: 2023
- Funding amount: $14,100
- Project category:
SCH: Dementia Early Detection for Under-represented Populations via Fair Multimodal Self-Supervised Learning
- Grant number: 10816864
- Fiscal year: 2023
- Funding amount: $14,100
- Project category:
Learning 3D information and ego-motion from acoustic video in extreme underwater environment
- Grant number: 23K19993
- Fiscal year: 2023
- Funding amount: $14,100
- Project category: Grant-in-Aid for Research Activity Start-up