CCRI: Planning: Planning for the Development of a Platform to Support Multilingual and Multi-Domain Coreference Annotation for Natural Language Processing Research
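Basic information
- Award number: 1925548
- Principal investigator:
- Amount: $100,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-09-01 to 2022-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract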
In natural language processing, coreference resolution involves clustering together all words and phrases within a text that refer to the same entity. For example, in the sentence "Monsieur Poirot assured Hastings that he ought to have faith in him," the strings "Monsieur Poirot" and "him" refer to the same person, while "Hastings" and "he" refer to a different character. Resolving these references is challenging because it requires applying syntactic, semantic, and world knowledge. It is also important: coreference is essential to intelligently understanding the meaning of text for question answering, translation, corpus insights, and many other applications. Unfortunately, current coreference models are held back by the lack of human-annotated training data from varied domains and world languages, mainly because collecting such data at scale is expensive and time-consuming.
This CCRI planning grant will take the first step toward breaking the coreference data bottleneck by creating two new resources for the community: (1) a software platform that facilitates cheap and accurate crowdsourced collection for tasks that require labeling text spans within documents, and (2) a multi-domain crowdsourced coreference dataset collected using this platform. Unlike prior datasets, which focus primarily on newswire text, the dataset will contain data from a variety of domains (such as books and web forums). This will allow researchers who work on non-standard domains to integrate coreference systems into their modeling pipelines.
This planning grant will also support discussions and conference workshops about the platform and data resources. The resulting community feedback will be incorporated into a CCRI full proposal that aims to use the platform to create a much larger, multilingual coreference dataset and to explore non-coreference data labeling tasks such as question answering.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
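As an illustration of what a span-level coreference annotation could look like, the following minimal Python sketch encodes the Poirot example from the abstract as character spans grouped into clusters. The data layout, the cluster ids, and the helper names `locate` and `build_clusters` are assumptions made for this sketch; they are not the format used by the funded platform or its dataset.

```python
import re
from collections import defaultdict

TEXT = "Monsieur Poirot assured Hastings that he ought to have faith in him."

# (phrase, cluster id) pairs following the reading given in the abstract.
# The ids 0 and 1 are arbitrary labels chosen for this sketch.
ANNOTATIONS = [
    ("Monsieur Poirot", 0),
    ("Hastings", 1),
    ("he", 1),
    ("him", 0),
]


def locate(text, phrase):
    """Return the (start, end) character span of the first whole-word match."""
    match = re.search(rf"\b{re.escape(phrase)}\b", text)
    if match is None:
        raise ValueError(f"phrase not found: {phrase!r}")
    return match.span()


def build_clusters(text, annotations):
    """Group mention spans by cluster id (one cluster per entity)."""
    clusters = defaultdict(list)
    for phrase, cluster_id in annotations:
        clusters[cluster_id].append(locate(text, phrase))
    return dict(clusters)


clusters = build_clusters(TEXT, ANNOTATIONS)
for cluster_id, spans in sorted(clusters.items()):
    # Print each cluster as the list of surface strings it covers.
    print(cluster_id, [TEXT[start:end] for start, end in spans])
```

Storing mentions as character offsets rather than strings keeps repeated words distinguishable, which is one reason span labeling is a natural unit for this kind of annotation task.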
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
ezCoref: Towards Unifying Annotation Guidelines for Coreference Resolution
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 0
- Authors: Gupta, Ankita; Karpinska, Marzena; Zhao, Wenlong; Krishna, Kalpesh; Merullo, Jack; Yeh, Luke; Iyyer, Mohit; O'Connor, Brendan
- Corresponding author: O'Connor, Brendan
Other publications by Brendan O'Connor
Bovine brain pyroglutamyl aminopeptidase (type-1): purification and characterisation of a neuropeptide-inactivating peptidase.
- DOI: 10.1016/1357-2725(96)00034-9
- Publication year: 1996
- Journal:
- Impact factor: 0
- Authors: Philip M. Cummins; Brendan O'Connor
- Corresponding author: Brendan O'Connor
Thyrotropin-Releasing Hormone
- DOI: 10.1046/j.1471-4159.1995.65030953.x
- Publication year: 1995
- Journal:
- Impact factor: 4.7
- Authors: R. O'Leary; Brendan O'Connor
- Corresponding author: Brendan O'Connor
Other grants by Brendan O'Connor
Collaborative Research: DMREF: Establishing a molecular interaction framework to design and predict modern polymer semiconductor assembly
- Award number: 2324191
- Fiscal year: 2023
- Funding amount: $100,000
- Project type: Standard Grant
CAREER: Social Aggregate Measurement from Text
- Award number: 1845576
- Fiscal year: 2019
- Funding amount: $100,000
- Project type: Continuing Grant
III: Small: Collaborative Research: Building Subjective Knowledge Bases by Modeling Viewpoints
- Award number: 1814955
- Fiscal year: 2018
- Funding amount: $100,000
- Project type: Standard Grant
INFEWS/T3: Solar-Powered Integrated Greenhouse (SPRING) Systems Using Wavelength Selective Photovoltaics for Complete Solar Utilization
- Award number: 1639429
- Fiscal year: 2017
- Funding amount: $100,000
- Project type: Continuing Grant
CAREER: Mechanical Behavior of Flexible Electronic Films
- Award number: 1554322
- Fiscal year: 2016
- Funding amount: $100,000
- Project type: Standard Grant
Mechanical Behavior of Polymer-Fullerene Blends for Photovoltaic Applications
- Award number: 1200340
- Fiscal year: 2012
- Funding amount: $100,000
- Project type: Standard Grant
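Similar grants from the National Natural Science Foundation of China (NSFC)
Flexible optimization-oriented planning methods for distribution networks adapted to the zero-carbon evolution of industrial park users
- Award number: 52307123
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project type: Young Scientists Fund
Coordinated development mechanisms and planning responses for innovation spaces in cross-boundary areas under regional integration: empirical evidence from the Yangtze River Delta
- Award number: 52378080
- Approval year: 2023
- Funding amount: 500,000 CNY
- Project type: General Program
Low-Carbon, Intelligent, Resilient: Strategic Workshop on Development Planning for the Civil Engineering Discipline
- Award number: 52242802
- Approval year: 2022
- Funding amount: 100,000 CNY
- Project type: Special Fund Project
Second Strategic Workshop on Development Planning for the Architecture Field
- Award number:
- Approval year: 2022
- Funding amount: 100,000 CNY
- Project type:
Research on coordinated optimal planning methods for multiple offshore wind farms under a clustered development model
- Award number:
- Approval year: 2022
- Funding amount: 300,000 CNY
- Project type: Young Scientists Fund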
Similar overseas grants
Collaborative Research: CCRI: Planning-C: A Community for Configurability Open Research and Development (ACCORD)
- Award number: 2234909
- Fiscal year: 2023
- Funding amount: $100,000
- Project type: Standard Grant
Collaborative Research: CCRI: Planning-C: A Community for Configurability Open Research and Development (ACCORD)
- Award number: 2234908
- Fiscal year: 2023
- Funding amount: $100,000
- Project type: Standard Grant
CCRI: Planning-C: A Framework for Development of Robots and IoT for Precision Agriculture
- Award number: 2213839
- Fiscal year: 2022
- Funding amount: $100,000
- Project type: Standard Grant
CCRI: Planning: Collaborative Proposal: Tools and Research Priority Analyses for Development of Open-Source AI-Enabled Control and Testing Framework for 6G Cellular Research
- Award number: 2016724
- Fiscal year: 2020
- Funding amount: $100,000
- Project type: Standard Grant
CCRI: Planning: ScooterLab: Development of a Programmable and Participatory e-Scooter Testbed to Enable CISE-focused Micromobility Research
- Award number: 2016717
- Fiscal year: 2020
- Funding amount: $100,000
- Project type: Standard Grant