RI: Small: Low-Latency and High-Quality Simultaneous Translation

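Basic Information

  • Grant number:
    2009071
  • Principal investigator:
  • Amount:
    $450,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-08-15 to 2024-07-31
  • Project status:
    Completed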

Project Abstract

Simultaneous language translation (interpretation) is widely used in many situations, including multilateral organizations such as the United Nations, international summits and conferences, and legal proceedings. However, the concurrent perception and production in two languages makes this task extremely challenging and exhausting for humans. The number of professional simultaneous interpreters is extremely limited worldwide, and they have to work in groups of two or three, where each interpreter can sustain the work for only about 15-30 minutes. Therefore, there is a critical need to develop simultaneous translation techniques to reduce the burden on human interpreters and make this service more accessible and affordable. However, simultaneous translation is also notoriously difficult for machines, and accomplishing it consistently and reliably is considered one of the holy grails of Artificial Intelligence. Various methods have been proposed to solve this problem, but they have three major limitations: (a) their translation model is still a full-sentence translation model; (b) they cannot achieve short latencies such as the "3-second delay" common in human interpretation; and (c) their systems are complicated and difficult to train. Therefore, this project aims to develop new algorithms, techniques, and datasets for high-quality simultaneous machine translation with minimum delay (low latency). The technologies developed by this project will make simultaneous translation more affordable and accessible, which will improve the efficiency of human communication across linguistic barriers. This project also supports STEM education of underrepresented minorities (who do not speak English natively) by recruiting them into machine translation studies.

Based on the principal investigator's successful prior work, the key idea in this project is to discard the conventional full-sentence translation paradigm and the classical sequence-to-sequence framework, which process the full input sentence before starting to translate and are thus ill-suited to simultaneous translation. Instead, this project adopts a "prefix-to-prefix" framework, which starts translation after processing only a few input words, mimicking human interpreters. Though extremely simple, this framework achieves low latency and high translation quality. Using this framework, this project aims to (1) develop an algorithm to detect and fix anticipation mistakes on the fly, and explore new evaluation metrics that can work for translations with revisions; (2) develop dynamic and flexible translation strategies to balance quality and latency; (3) construct better training data for simultaneous translation by revising the reference translations in a parallel text to remove unnecessary reorderings; and (4) apply the prefix-to-prefix framework to incremental text-to-speech synthesis (TTS), thus completing the end-to-end simultaneous speech-to-speech pipeline, improve its quality and latency, and compare it with human simultaneous interpreters.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
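The prefix-to-prefix idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify a read/write schedule, so the fixed-lag rule used here (read `k` source words, then emit one target word per additional source word) is an assumption chosen for illustration. The names `prefix_to_prefix_schedule`, `simultaneous_decode`, and `toy_predict_next` are hypothetical, and the toy predictor merely stands in for a trained translation model.

```python
from typing import Callable, List, Optional, Sequence


def prefix_to_prefix_schedule(source_len: int, target_len: int, k: int) -> List[int]:
    """Return, for each target position t (1-based), how many source words
    have been read when the t-th target word is written.

    Assumed fixed-lag rule: the t-th target word sees the first
    min(k + t - 1, source_len) source words.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]


def simultaneous_decode(
    source: Sequence[str],
    predict_next: Callable[[Sequence[str], Sequence[str]], Optional[str]],
    k: int,
    max_len: int,
) -> List[str]:
    """Emit target words one at a time, each conditioned only on a source
    prefix and the target prefix produced so far."""
    target: List[str] = []
    for t in range(1, max_len + 1):
        # Only the source prefix allowed by the schedule is visible.
        visible = source[: min(k + t - 1, len(source))]
        word = predict_next(visible, target)
        if word is None:  # the predictor signals end of sentence
            break
        target.append(word)
    return target


def toy_predict_next(visible: Sequence[str], target: Sequence[str]) -> Optional[str]:
    """Hypothetical stand-in for a model: 'translates' by upper-casing the
    source word aligned with the next target position."""
    i = len(target)
    return visible[i].upper() if i < len(visible) else None
```

With `k = 2` and a four-word source, `prefix_to_prefix_schedule(4, 4, 2)` yields `[2, 3, 4, 4]`: the first target word is produced after only two source words have been read, in contrast to a full-sentence model, which would wait for all four. A smaller `k` lowers latency but gives the predictor less context, which is the quality-latency trade-off the abstract refers to.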

Project Outcomes

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
LazySampling and LinearSampling: fast stochastic sampling of RNA secondary structure with applications to SARS-CoV-2
  • DOI:
    10.1093/nar/gkac1029
  • Publication date:
    2022-11
  • Journal:
  • Impact factor:
    14.9
  • Authors:
    Zhang, He; Li, Sizhen; Zhang, Liang; Mathews, David H.; Huang, Liang
  • Corresponding author:
    Huang, Liang
Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR
  • DOI:
    10.18653/v1/2021.findings-acl.406
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Chen, Junkun; Ma, Mingbo; Zheng, Renjie; Huang, Liang
  • Corresponding author:
    Huang, Liang
Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings
  • DOI:
    10.18653/v1/2021.emnlp-main.473
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Chen, Junkun; Zheng, Renjie; Kita, Atsuhito; Ma, Mingbo; Huang, Liang
  • Corresponding author:
    Huang, Liang
RNA design via structure-aware multifrontier ensemble optimization
  • DOI:
    10.1093/bioinformatics/btad252
  • Publication date:
    2023-06-30
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Zhou, Tianshuo; Dai, Ning; Li, Sizhen; Ward, Ma; Mathews, David H.; Huang, Liang
  • Corresponding author:
    Huang, Liang
LinearTurboFold: Linear-time global prediction of conserved structures for RNA homologs with applications to SARS-CoV-2
  • DOI:
    10.1073/pnas.2116269118
  • Publication date:
    2021-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Li, Sizhen; Zhang, He; Zhang, Liang; Liu, Kaibo; Liu, Boxiang; Mathews, David H.; Huang, Liang
  • Corresponding author:
    Huang, Liang

Other publications by Liang Huang

Clinical Interventions in Aging Dovepress the Effectiveness of a Combined Exercise Intervention on Physical Fitness Factors Related to Falls in Community-dwelling Older Adults
  • DOI:
    10.2196/13562
  • Publication date:
    2019-01-30
  • Journal:
  • Impact factor:
    3.2
  • Authors:
    Zhuang Jie; Liang Huang; Yanqiang Wu; Yanxin Zhang
  • Corresponding author:
    Yanxin Zhang
Stereoselective synthesis of a MCHr1 antagonist.
  • DOI:
    10.1021/jo701894v
  • Publication date:
    2007-11-13
  • Journal:
  • Impact factor:
    0
  • Authors:
    Denise Andersen; T. Storz; Pingli Liu; Xin Wang; Leping Li; P. Fan; Xiaoqi Chen; A. Allgeier; A. Burgos; J. Tedrow; Jean Baum; Ying Chen; Richard D. Crockett; Liang Huang; R. Syed; R. Larsen; M. Martinelli
  • Corresponding author:
    M. Martinelli
Antiepileptic drugs for Tourette's syndrome
  • DOI:
    10.1002/14651858.cd012043
  • Publication date:
    2016-01-16
  • Journal:
  • Impact factor:
    8.4
  • Authors:
    Chun; Lingli Zhang; Z. Hao; Liang Huang; W. Song
  • Corresponding author:
    W. Song
Role of total mesorectal excision in curative resection of rectal cancer
Transmission and scarring in graphene quantum dots
  • DOI:
    10.1088/0953-8984/21/34/344203
  • Publication date:
    2009-08-26
  • Journal:
  • Impact factor:
    0
  • Authors:
    Liang Huang; Y. Lai; D. Ferry; R. Akis; S. Goodnick
  • Corresponding author:
    S. Goodnick

Other grants by Liang Huang

MFB: Better Homologous Folding using Computational Linguistics and Deep Learning
  • Grant number:
    2330737
  • Fiscal year:
    2024
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
RI: Small: Fast and Accurate Natural Language Parsing and Generation by Marrying Deep Learning with Dynamic Programming
  • Grant number:
    1817231
  • Fiscal year:
    2018
  • Funding amount:
    $450,000
  • Project type:
    Continuing Grant
EAGER: Collaborative Research: Scaling Up Discriminative Learning for Natural Language Understanding and Translation
  • Grant number:
    1656051
  • Fiscal year:
    2015
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
EAGER: Collaborative Research: Scaling Up Discriminative Learning for Natural Language Understanding and Translation
  • Grant number:
    1449278
  • Fiscal year:
    2014
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
SBIR Phase II: Amphiphilic Copolymers as Thickening Agents for Personal Care Products
  • Grant number:
    1430647
  • Fiscal year:
    2014
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
SBIR Phase I: Amphiphilic Copolymers as Thickening Agents for Personal Care Products
  • Grant number:
    1248253
  • Fiscal year:
    2013
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant

Similar grants from the National Natural Science Foundation of China

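Role and mechanism of Plin2 regulation of the microglial inflammatory response via lipid metabolism pathways in white matter injury caused by chronic cerebral hypoperfusion
  • Grant number:
    82301507
  • Year approved:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Low-intensity rTMS regulation of the microglial Kv1.3 potassium channel to inhibit neuronal pyroptosis in early cerebral ischemia
  • Grant number:
    82302865
  • Year approved:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Mechanism by which a neutrophil subset with low IL-1R2 expression promotes non-small cell lung cancer progression via the p38-ferroptosis pathway
  • Grant number:
    82372855
  • Year approved:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Mechanism by which IMRCs regulate the CPNE1-NLK-STAT3 pathway to mediate microglial polarization and improve vascular cognitive impairment caused by chronic hypoperfusion
  • Grant number:
    82301450
  • Year approved:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Theoretical exploration of reaction mechanisms for the activation and conversion of low-carbon small molecules by atomically precise metal nanoclusters
  • Grant number:
  • Year approved:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund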

Similar overseas grants

RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Grant number:
    2218773
  • Fiscal year:
    2022
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Grant number:
    2100158
  • Fiscal year:
    2021
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
RI: Small: Creating Text-to-Speech Synthesis for Low Resource Languages
  • Grant number:
    1717680
  • Fiscal year:
    2017
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
RI: Small: Collaborative Research: Structured Inference for Low-Level Vision
  • Grant number:
    1820693
  • Fiscal year:
    2017
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant
RI: Small: Low Cost Technologies to Improve the Quality of 3D Scanning
  • Grant number:
    1717355
  • Fiscal year:
    2017
  • Funding amount:
    $450,000
  • Project type:
    Standard Grant