Collaborative Research: RI:Medium:Understanding Events from Streaming Video - Joint Deep and Graph Representations, Commonsense Priors, and Predictive Learning
Basic Information
- Award number: 2348689
- Principal investigator:
- Amount: $285,100
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-10-01 to 2024-10-31
- Project status: Completed
- Source:
- Keywords:
Project Summary
While it is easy for humans to process video data and extract meaning from it, it is extremely hard to design algorithms that do so. Once developed, this technology has many applications, such as building assistive robots, constructing smart spaces for independent living, or monitoring wildlife. Video data capture events, which are central to the content of human experience. Events consist of objects/people (who), location (where), time (when), actions (what), activities (how), and intent (why). This project develops a computer-vision-based event understanding algorithm that operates in a self-supervised, streaming fashion. The algorithm will predict and detect old and new events and learn to build hierarchical event representations, all in the context of a prior knowledge base that is updated over time. The intent is to generate interpretations of an event that go beyond what is seen, rather than mere recognition. This research pushes the frontier of computer vision by coupling the self-supervised learning process with prior knowledge, moving the field towards open-world algorithms that need little or no supervision. Furthermore, this project will focus on recruitment and retention of undergraduate women students through their freshman and sophomore years, with attention towards underrepresented minority students at the three sites: University of South Florida, Florida State University, and Oklahoma State University. At the core of the approach is a hybrid representational hierarchy that includes both continuous representations and symbolic graph-based representations. The continuous-valued representation is the standard, vector-valued deep learning stack that ends in an embedding vector of some object or action concept in the knowledge base. The next level of the representation consists of elementary symbolic compositions of these verbs and nouns.
These elementary compositions, when associated with concepts from a knowledge base, make up an event interpretation containing descriptions that go beyond what is observed in the image. These symbolic levels are built using Grenander's canonical representations from pattern theory. These representations, which have flexible graph-structured backbones, are more expressive than other well-known graphical models. The specific technical aims of the project are four-fold. First, it seeks to integrate function-based continuous representations with energy-based canonical symbolic representations from Grenander's pattern theory into one formulation based on equilibrium propagation. Second, it will research and develop ways to use and modify commonsense knowledge bases. This will help to go beyond the closed-world assumption, which is implicit in the current practice of annotated-data-based deep learning approaches. Third, it will develop dynamical models on graph manifolds, which will enable generative modeling of graph structures for prediction and discovery of new concepts. Fourth, inspired by findings from human perception experiments and neuroscience, it will design predictive self-supervised learning over both continuous and symbolic representations. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
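As a rough illustration of the hybrid representation described above (a sketch, not the project's actual implementation), the code below scores a candidate event interpretation as a graph of "generators" joined by bonds, in the spirit of Grenander's pattern theory: each symbolic concept node carries a continuous embedding from a deep network, and a lower total bond energy marks a more plausible interpretation. The `Generator`, `bond_energy`, and `interpretation_energy` names and the toy embeddings are hypothetical.

```python
import math

class Generator:
    """A symbolic concept node (a 'generator' in Grenander's terminology)
    carrying a continuous embedding vector from a deep network."""
    def __init__(self, label, embedding):
        self.label = label
        self.embedding = embedding

def bond_energy(g1, g2):
    """Energy of the bond between two generators; lower means the two
    concepts are more compatible (here, 1 - cosine similarity)."""
    dot = sum(a * b for a, b in zip(g1.embedding, g2.embedding))
    norm = (math.sqrt(sum(a * a for a in g1.embedding))
            * math.sqrt(sum(b * b for b in g2.embedding)))
    return 1.0 - dot / norm

def interpretation_energy(generators, bonds):
    """Total energy of an interpretation graph: the sum of its bond
    energies. A lower-energy graph is a more plausible interpretation."""
    return sum(bond_energy(generators[i], generators[j]) for i, j in bonds)

# Toy event "person pours water" as three bonded generators.
person = Generator("person", [1.0, 0.2, 0.0])
pour = Generator("pour", [0.9, 0.3, 0.1])
water = Generator("water", [0.8, 0.1, 0.2])
energy = interpretation_energy([person, pour, water], [(0, 1), (1, 2)])
```

In the full approach, such energies would also incorporate compatibility with a commonsense knowledge base, and inference would search over alternative graph structures rather than score a single fixed one.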
Project Outcomes
Journal articles: 1
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
IS-GGT: Iterative Scene Graph Generation with Generative Transformers
- DOI: 10.1109/cvpr52729.2023.00609
- Publication date: 2022-11-30
- Journal:
- Impact factor: 0
- Authors: Sanjoy Kundu; Sathyanarayanan N. Aakur
- Corresponding author: Sathyanarayanan N. Aakur
Other Publications by Sathyanarayanan Aakur
Other Grants by Sathyanarayanan Aakur
CAREER: Towards Causal Multi-Modal Understanding with Event Partonomy and Active Perception
- Award number: 2348690
- Fiscal year: 2023
- Amount: $285,100
- Award type: Continuing Grant

CAREER: Towards Causal Multi-Modal Understanding with Event Partonomy and Active Perception
- Award number: 2143150
- Fiscal year: 2022
- Amount: $285,100
- Award type: Continuing Grant

Collaborative Research: RI:Medium:Understanding Events from Streaming Video - Joint Deep and Graph Representations, Commonsense Priors, and Predictive Learning
- Award number: 1955230
- Fiscal year: 2020
- Amount: $285,100
- Award type: Continuing Grant
Similar NSFC (National Natural Science Foundation of China) Grants
Study of the transmembrane protein LRP5 extracellular domain regulating the membrane receptor TβRI to promote homing and differentiation of BMSCs on titanium surfaces
- Award number: 82301120
- Award year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund

Mechanistic study of Dectin-2 exacerbating asthma attacks by promoting FcεRI aggregation and mast cell activation
- Award number: 82300022
- Award year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund

Discovery of β-carboline alkaloid TβRI inhibitors from the Tibetan medicine Arenaria kansuensis and study of their anti-pulmonary-fibrosis mechanism
- Award number:
- Award year: 2022
- Amount: CNY 300,000
- Award type: Young Scientists Fund

Role and mechanism of UFMylation of TβRI in regulating the TGF-β signaling pathway and breast cancer metastasis
- Award number:
- Award year: 2022
- Amount: CNY 300,000
- Award type: Young Scientists Fund

Study of inner-core convective activity and cloud microphysical processes in the rapid intensification (RI) of typhoons making landfall in China
- Award number:
- Award year: 2021
- Amount: CNY 580,000
- Award type:
Similar Overseas Grants
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award number: 2232298
- Fiscal year: 2023
- Amount: $285,100
- Award type: Standard Grant

Collaborative Research: RI: Medium: RUI: Automated Decision Making for Open Multiagent Systems
- Award number: 2312657
- Fiscal year: 2023
- Amount: $285,100
- Award type: Standard Grant

Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $285,100
- Award type: Standard Grant

Collaborative Research: RI: Medium: Multilingual Long-form QA with Retrieval-Augmented Language Models
- Award number: 2312948
- Fiscal year: 2023
- Amount: $285,100
- Award type: Standard Grant

Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
- Award number: 2312956
- Fiscal year: 2023
- Amount: $285,100
- Award type: Standard Grant