Multilingual corpus construction and domain adaptation for low-resource machine translation

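Basic information

  • Grant number:
    22KJ1724
  • Principal investigator:
  • Amount:
    $14,100
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for JSPS Fellows
  • Fiscal year:
    2023
  • Funding country:
    Japan
  • Project period:
    2023-03-08 to 2024-03-31
  • Project status:
    Completed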

Project abstract

During this year, I published 5 papers, and one journal paper is under review. Three of these works are first-author papers: 1) the first, published at the international conference AACL-IJCNLP2022, exploits BERT-based unsupervised subword segmentation for neural machine translation (NMT) and is effective in scenarios ranging from low-resource to high-resource; 2) the second, published at the domestic conference NLP2023, uses machine translation of prompts to adapt GPT-3 to Japanese tasks; 3) the third, submitted to the NLP journal, leverages information from multiple subword segmenters through a proposed subword-relation-aware attention mechanism and an alignment loss objective. Other works include video information for multimodal NMT, published in the JIP journal; an exploration of contrastive word alignments for multilingual NMT, published at the top international conference NAACL2022; and contrastive pre-training for relation extraction, published at the top international conference EMNLP2022. Two co-authored papers are under review for the international conference ACL2023 and one for EAMT2023. I also participated in on-campus symposiums and workshops in Japan, and communicated with many researchers there. Moreover, I took an internship at NICT, in a national lab focusing on machine translation, and we have applied for one patent on the BERT-based unsupervised subword segmentation.

Project outcomes

Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?
  • DOI:
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zhuoyuan Mao; Chenhui Chu; Raj Dabre; Haiyue Song; Zhen Wan; Sadao Kurohashi
  • Corresponding author:
    Zhen Wan and Sadao Kurohashi
BERTSeg: BERT Based Subword Segmentation
  • DOI:
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
BERTSeg: BERT Based Unsupervised Subword Segmentation for Neural Machine Translation
  • DOI:
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Haiyue Song; Raj Dabre; Zhuoyuan Mao; Chenhui Chu; S. Kurohashi
  • Corresponding author:
    Haiyue Song; Raj Dabre; Zhuoyuan Mao; Chenhui Chu; S. Kurohashi
Representative Data Selection for Sequence-to-Sequence Pre-training
  • DOI:
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Haiyue Song; Raj Dabre; Zhuoyuan Mao; Chenhui Chu; Sadao Kurohashi
  • Corresponding author:
    Sadao Kurohashi
Video-guided Machine Translation with Spatial Hierarchical Attention Network
  • DOI:
    10.18653/v1/2021.acl-srw.9
  • Published:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Weiqi Gu; Haiyue Song; Chenhui Chu; S. Kurohashi
  • Corresponding author:
    Weiqi Gu; Haiyue Song; Chenhui Chu; S. Kurohashi

Similar overseas grants

Learning about ChatGPT for educational purposes: Examining the role of online teacher communities for supporting teachers in Japan
  • Grant number:
    24K16767
  • Fiscal year:
    2024
  • Grant amount:
    $14,100
  • Project category:
    Grant-in-Aid for Early-Career Scientists
PestGPT: Integrating Visual Intelligence and ChatGPT into a Mobile Solution for Sustainable Pest Management
  • Grant number:
    10076558
  • Fiscal year:
    2023
  • Grant amount:
    $14,100
  • Project category:
    Collaborative R&D
SBIR Phase I: Using ChatGPT and Machine Learning to Power Positive Change among Justice Involved Youth
  • Grant number:
    2333168
  • Fiscal year:
    2023
  • Grant amount:
    $14,100
  • Project category:
    Standard Grant
Detection and Analysis of Automatically Generated Text according to the Applications
  • Grant number:
    23K11767
  • Fiscal year:
    2023
  • Grant amount:
    $14,100
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Building a RT-ChatGPT on Radiotherapy for Cancer Treatment using a Medically Trained OpenAI ChatGPT
  • Grant number:
    487811
  • Fiscal year:
    2023
  • Grant amount:
    $14,100
  • Project category:
    Miscellaneous Programs