CAREER: Teaching Machines to Recognize Complex Visual Concepts in Images through Compositionality

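Basic Information

Award number:
2045773
Principal investigator:
Vicente Ordonez
Amount:
$499,800
Institution:
University of Virginia Main Campus
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Period:
2021-09-01 to 2021-11-30
Status:
Completed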

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2045773&HistoricalAwards=false
Keywords:
CAREER Teaching Machines Recognize Complex

Project Abstract

Modern computational systems for image recognition can be taught to detect objects among large sets of categories. However, to teach machines to recognize each new category, human operators need to annotate a large number of images with categorical labels. In practice, many applications require a custom set of categories. For instance, a visual recognition model for detecting different types of furniture for an e-commerce application might require very specific categories such as ‘rocking chair’, ‘swivel chair’, ‘accent chair’, or ‘swivel accent chair’. Even an expert domain user who has a clear idea of which visual characteristics are important to recognize in each type of chair would have to teach the system by annotating images individually. The goal of this project is to enable richer modes of interaction in which ‘machine teachers’ can guide image recognition through direct feedback on the types of visual characteristics that are important for each new category. To this end, we plan to exploit principles of compositionality, where new categories can be defined based on basic concepts that are easier to recognize. The project will integrate research with education and involve undergraduate students from underrepresented groups in the research.

This project will devise new models that learn to recognize visual concepts compositionally by first discovering and then learning to recognize visual primitives that are shared across many classes. This process will also be tailored to maximize utility in an environment where a user can guide the model through natural interactions, including the use of language and direct manipulation through a visual interface. The project will 1) develop methods to compositionally and interactively learn from textual descriptions, 2) propose methods to automatically discover primitives that are composable across categories, and 3) propose models that can support interactions even after deployment. These three research aims will be complemented by a comprehensive evaluation plan, a public platform that exposes our methods in an interactive environment, and broadening participation activities. This research effort will bring novel designs in visual recognition models that offer people more expressive ways of guiding and training them.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Vicente Ordonez

Variation of Gender Biases in Visual Recognition Models Before and After Finetuning

DOI:
10.48550/arxiv.2303.07615
Year published:
2023
Journal:
arXiv
Impact factor:
0
Authors:
Jaspreet Ranjit; Tianlu Wang; Baishakhi Ray; Vicente Ordonez
Corresponding author:
Vicente Ordonez

Enabling AI at the edge with XNOR-networks

DOI:
10.1145/3429945
Year published:
2020
Journal:
Communications of the ACM
Impact factor:
22.7
Authors:
Mohammad Rastegari; Vicente Ordonez; Joseph Redmon; Ali Farhadi
Corresponding author:
Ali Farhadi

Learning to name objects

DOI:
10.1145/2885252
Year published:
2016
Journal:
Communications of the ACM
Impact factor:
22.7
Authors:
Vicente Ordonez; Wei Liu; Jia Deng; Yejin Choi; A. Berg; Tamara L. Berg
Corresponding author:
Tamara L. Berg

The Ariadne Infrastructure for Managing and Storing Metadata

DOI:
10.1109/mic.2009.90
Year published:
2009
Journal:
IEEE Internet Computing
Impact factor:
3.2
Authors:
Stefaan Ternier; K. Verbert; Gonzalo Parra; Bram Vandeputte; J. Klerkx; E. Duval; Vicente Ordonez; X. Ochoa
Corresponding author:
X. Ochoa