EAGER: Collaborative Research: Scaling Up Discriminative Learning for Natural Language Understanding and Translation
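Basic information
- Award number: 1446996
- Principal investigator:
- Amount: $129,100
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2014
- Funding country: United States
- Project period: 2014-08-15 to 2016-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract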
This EArly Grant for Exploratory Research aims to improve automatic understanding of natural language by machines, and automatic translation between languages such as Chinese and English. In the realm of understanding, the project develops methods for syntactically and semantically analyzing, or parsing, sentences. Improved parsing can help in accessing the enormous amount of information available in unstructured text on the web and in databases of newspapers and scanned books. Improved translation between languages increases opportunities for trade as well as for the dissemination of information between nations and cultures. Machine translation is widely used today despite its generally poor quality, and any improvement in quality will improve access to information for millions of people.
This project aims to exploit the power of machine learning algorithms that are designed to discriminate between correct and incorrect outputs by numerically optimizing mathematical functions defined in terms of the data available for training. Discriminative structured prediction algorithms have seen great success in natural language processing (NLP) over the past decade, generally surpassing their generative counterparts. However, two major problems prevent discriminative methods from scaling to very large datasets. First, they typically assume exact search over a prohibitively large search space, which is rarely possible in practice for problems such as parsing and translation. Second, they normally assume that the data is completely annotated, whereas many naturally occurring datasets are only partially annotated: for example, a parallel text in machine translation includes the source and target sentence pairs but not the derivation between them. As a result of these two problems, current methods do not take full advantage of the enormous and ever-increasing amount of text data available.
This EArly Grant for Exploratory Research (EAGER) aims to:
- Develop a linear-time structured learning framework specifically tailored for inexact search, which is intended to retain the theoretical properties (e.g. convergence) that structured learning has under exact search.
- Extend this framework to handle latent variables, such as derivations in machine translation, syntactic structures in semantic parsing, and semantic representations in question answering.
If the exploratory extension to latent variable frameworks is successful, it will enable longer-term research to:
- Apply these efficient learning algorithms to discriminative training of machine translation systems over the entire training dataset rather than only on a small development set.
- Apply these efficient learning algorithms to discriminative training for syntactic and semantic parsing, with the goal of scaling up semantic parsing to enable web-scale knowledge extraction.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants of Daniel Gildea
RI: Small: Cache transition systems for sentence understanding and generation
- Award number: 1813823
- Fiscal year: 2018
- Funding amount: $129,100
- Project category: Standard Grant
RI: Large: Collaborative Research: Richer Representations for Machine Translation
- Award number: 0910611
- Fiscal year: 2009
- Funding amount: $129,100
- Project category: Continuing Grant
CAREER: Semantics for Statistical Machine Translation
- Award number: 0546554
- Fiscal year: 2006
- Funding amount: $129,100
- Project category: Continuing Grant
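Similar grants from the National Natural Science Foundation of China
Relationships among team human capital hierarchy types, team collaboration processes, and team effectiveness outcomes in the context of digital intelligence
- Award number: 72372084
- Approval year: 2023
- Funding amount: CNY 400,000
- Project category: General Program
Intelligent planning and interactive collaboration for craniomaxillofacial surgical robot-assisted distraction osteogenesis in hemifacial microsomia
- Award number:
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category:
Key technologies for multi-agent manufacturing systems oriented toward autonomous cognition and crowd-intelligence collaboration
- Award number: 52305539
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Integrated methods for multi-collaborative green information sensing and intelligent response decision-making in the large-scale Internet of Things
- Award number: 62371149
- Approval year: 2023
- Funding amount: CNY 490,000
- Project category: General Program
Concurrent charging models and service mechanisms for large-scale sensor networks with multi-UAV collaboration
- Award number: 62362017
- Approval year: 2023
- Funding amount: CNY 320,000
- Project category: Regional Science Fund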
Similar overseas grants
Collaborative Research: EAGER: IMPRESS-U: Groundwater Resilience Assessment through iNtegrated Data Exploration for Ukraine (GRANDE-U)
- Award number: 2409395
- Fiscal year: 2024
- Funding amount: $129,100
- Project category: Standard Grant
EAGER/Collaborative Research: An LLM-Powered Framework for G-Code Comprehension and Retrieval
- Award number: 2347624
- Fiscal year: 2024
- Funding amount: $129,100
- Project category: Standard Grant
EAGER/Collaborative Research: Revealing the Physical Mechanisms Underlying the Extraordinary Stability of Flying Insects
- Award number: 2344215
- Fiscal year: 2024
- Funding amount: $129,100
- Project category: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345581
- Fiscal year: 2024
- Funding amount: $129,100
- Project category: Standard Grant
Collaborative Research: EAGER: Designing Nanomaterials to Reveal the Mechanism of Single Nanoparticle Photoemission Intermittency
- Award number: 2345582
- Fiscal year: 2024
- Funding amount: $129,100
- Project category: Standard Grant