III: Small: Interactive Construction of Complex Query Models

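Basic information

  • Award number:
    1617408
  • Principal investigator:
  • Amount:
    $516,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Project period:
    2016-07-15 to 2020-06-30
  • Project status:
    Completed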

Project abstract

This research program will investigate and implement SearchIE, a search-based approach to information "extraction." SearchIE will allow rapid, personalized, situational identification of types of objects or actions in text, where those types are likely to be useful for a complex search task. Modern search engines often provide some mechanism to indicate that a query keyword matches a document only if it occurs in the name of a person or in a location. To make that possible, annotators found and marked a large number of instances of a type (person names, for example) in text, a machine learning algorithm was applied to learn which low-level features are indicative of that type, and the resulting classifier was then run across the collection of documents. It is then possible to write a query that means "paris used as a person's name rather than a location." Unfortunately, the existing approaches do not serve searchers interested in novel, unanticipated types, for example names of whaling ships, officers in Queen Victoria's navy, or local watering holes. Such examples cannot be handled currently because the classifiers need to be trained and run ahead of time, which requires an expensive data labeling process that is too daunting for many search tasks. Since online information gathering almost always starts with search and frequently involves identifying items of interest in the found text, bringing these two together has the potential to change both substantially. The SearchIE approach makes it possible for someone to build personalized extractors contextualized by their topical interests. The result is that the technology can radically improve online searching for lay persons as well as professionals by significantly reducing the time needed to focus queries into relevant information. It does not appear that the information extraction task has ever been approached directly as a search task.
SearchIE is unique in bringing an information retrieval (search) mindset to the extraction problem, providing new capabilities that are either impossible or extremely difficult in the traditional "annotate then detect" model of the problem. This project will investigate the fundamental issues raised by the SearchIE approach. What models can best integrate extraction and search in new settings where they can truly happen simultaneously? How can a searcher describe and edit a model for the types of interest? Can an interactively developed model be a springboard into a machine-learned model, and when is there enough information to do that? Does using topical context to limit the scope of extraction provide the expected accuracy gains using SearchIE's approach? What data structure modifications are needed to fully implement SearchIE so that it is efficient as well as effective? How well does this approach fare on additional standard test collections? These systems and algorithmic issues are fundamental problems, and addressing them has the potential to greatly impact both search and extraction. For further information, see the project's web site at http://ciir.cs.umass.edu/research/searchie.

Project outcomes

Journal papers (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Reinforcement Learning Framework for Relevance Feedback
Sentence Retrieval for Entity List Extraction with a Seed, Context, and Topic
Term Discrimination Value for Cross-Language Information Retrieval
SearchIE: A Retrieval Approach for Information Extraction

Other publications by James Allan

Polymorphism in glutathione S-transferase P1 is associated with susceptibility to chemotherapy-induced leukemia
3-methyladenine DNA glycosylases: structure, function, and biological importance
  • DOI:
    10.1002/(sici)1521-1878(199908)21:8<668::aid-bies6>3.0.co;2-d
  • Publication date:
    1999-08-01
  • Journal:
  • Impact factor:
    4
  • Authors:
    M. D. Wyatt; James Allan; A. Lau; T. Ellenberger; L. Samson
  • Corresponding author:
    L. Samson
Enhancing the thermal conductivity of ethylene-vinyl acetate (EVA) in a photovoltaic thermal collector
  • DOI:
    10.1063/1.4944557
  • Publication date:
    2016-03-15
  • Journal:
  • Impact factor:
    1.6
  • Authors:
    James Allan; H. Pinder; Z. Dehouche
  • Corresponding author:
    Z. Dehouche
A content based approach for discovering missing anchor text for web search
A Multi-Task Architecture on Relevance-based Neural Query Translation

Other grants by James Allan

CondensabLe AeRosol from non Ideal Stove Emissions (CLARISE)
  • Award number:
    NE/X000923/1
  • Fiscal year:
    2023
  • Award amount:
    $516,000
  • Award type:
    Research Grant
III: Medium: Collaborative Research: Athena: Learning-oriented Search with Personalized Learning Flows
  • Award number:
    2106282
  • Fiscal year:
    2021
  • Award amount:
    $516,000
  • Award type:
    Continuing Grant
EAGER: Dynamic Contextual Explanation of Search Results
  • Award number:
    2039449
  • Fiscal year:
    2020
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
Soot Aerodynamic Size Selection for Optical properties (SASSO)
  • Award number:
    NE/S00212X/1
  • Fiscal year:
    2018
  • Award amount:
    $516,000
  • Award type:
    Research Grant
III: Small: Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online
  • Award number:
    1813662
  • Fiscal year:
    2018
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: Sustaining Lemur Project Resources for the Long-Term
  • Award number:
    1822986
  • Fiscal year:
    2018
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
I-Corps: Probabilistically Detecting Controversy
  • Award number:
    1721069
  • Fiscal year:
    2017
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
Sources and Emissions of Air Pollutants in Beijing (Manchester)
  • Award number:
    NE/N007123/1
  • Fiscal year:
    2016
  • Award amount:
    $516,000
  • Award type:
    Research Grant
Megacity Delhi atmospheric emission quantification, assessment and impacts (DelhiFlux) - Manchester
  • Award number:
    NE/P016472/1
  • Fiscal year:
    2016
  • Award amount:
    $516,000
  • Award type:
    Research Grant
Strategic Workshop on Information Retrieval
  • Award number:
    1216764
  • Fiscal year:
    2012
  • Award amount:
    $516,000
  • Award type:
    Standard Grant

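Similar grants from the National Natural Science Foundation of China

Role of cerebellar microglia-neuron interactions in motor function regulation and ataxia
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
CLEC2/PDPN signaling pathway-mediated platelet/microglia interactions after traumatic brain injury and their effect on neuronal injury repair
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 520,000
  • Award type:
    General Program
Mechanism by which the STYK1-EGFR interaction regulates autophagy to mediate EGFR-TKI resistance in non-small cell lung cancer
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 520,000
  • Award type:
    General Program
Neural mechanisms by which neuron-microglia interactions in the somatosensory cortex regulate secondary pain after amputation
  • Award number:
    82171218
  • Approval year:
    2021
  • Award amount:
    CNY 550,000
  • Award type:
    General Program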

Similar overseas grants

III: Small: Deep Interactive Reinforcement Learning for Self-optimizing Feature Selection
  • Award number:
    2152030
  • Fiscal year:
    2022
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
  • Award number:
    2225823
  • Fiscal year:
    2022
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
  • Award number:
    2225824
  • Fiscal year:
    2022
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
III: Small: Fair Decision Making by Consensus: Interactive Bias Mitigation Technology
  • Award number:
    2007932
  • Fiscal year:
    2020
  • Award amount:
    $516,000
  • Award type:
    Standard Grant
III: Small: An end-to-end pipeline for interactive visual analysis of big data
  • Award number:
    1815238
  • Fiscal year:
    2018
  • Award amount:
    $516,000
  • Award type:
    Standard Grant