CPS: Medium: Sufficient Statistics for Learning Multi-Agent Interactions
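Basic Information
- Award number: 2125511
- Principal investigator:
- Amount: $1.1142 million
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-09-15 to 2025-08-31
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract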
Multi-agent coordination and collaboration is a core challenge of future cyber-physical systems as they start having more complex interactions with each other or with humans in homes or cities. One of the key challenges is that agents must be able to reason about and learn the behavior of other agents in order to make decisions. This is particularly challenging because state-of-the-art approaches, such as recursive belief modeling over partner policies, often do not scale. However, humans are very effective at coordinating and collaborating with each other without any expensive recursive belief modeling. One hypothesis is that humans can effectively capture the sufficient representations required for coordinating on tasks. Similar to humans, the agents in a multi-agent setting can look for the sufficient statistics needed for coordination and collaboration. This project is about learning and approximating such sufficient statistics to enable effective collaboration and coordination. In addition, the investigators will study teaching and learning in settings where the agents have partial observation over the world and need to teach and learn from each other in order to achieve a collaborative task.
Important successful demonstrations of reinforcement learning for single agents have spurred the drive to determine whether such methods can extend to multiple agents. There have also been notable developments in the area of multi-agent systems, both in understanding the structure of the resulting interacting dynamics and in the development of practical reinforcement learning algorithms. The core objectives of this project are: 1) the development of learning methods that approximate the well-known concept of sufficient statistics in multi-agent interactions; 2) the development of a reinforcement learning algorithm that leverages the representations of sufficient statistics for more effective planning, coordination, and collaboration in multi-agent settings; and 3) the development of algorithms that use the representations of sufficient statistics to enable teaching and learning in multi-agent settings under partial observation over the environment. The overall outcome of this project will be a new formalism along with algorithms, tools, and techniques that enhance multi-agent learning and control. The investigators will ground this in two main applications: 1) collaborative search and exploration and 2) collaborative transport of objects.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Conditional Imitation Learning for Multi-Agent Games
- DOI: 10.1109/hri53351.2022.9889671
- Published: 2022-01
- Journal:
- Impact factor: 0
- Authors: Andy Shih; Stefano Ermon; Dorsa Sadigh
- Corresponding authors: Andy Shih; Stefano Ermon; Dorsa Sadigh
Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Erdem Bıyık; Anusha Lalitha
- Corresponding authors: Erdem Bıyık; Anusha Lalitha
Reward Design with Language Models
- DOI: 10.48550/arxiv.2303.00001
- Published: 2023-02
- Journal:
- Impact factor: 0
- Authors: Minae Kwon; Sang Michael Xie; Kalesha Bullard; Dorsa Sadigh
- Corresponding authors: Minae Kwon; Sang Michael Xie; Kalesha Bullard; Dorsa Sadigh
Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction
- DOI: 10.48550/arxiv.2203.04421
- Published: 2022-03
- Journal:
- Impact factor: 0
- Authors: Zhangjie Cao; Erdem Biyik; G. Rosman; Dorsa Sadigh
- Corresponding authors: Zhangjie Cao; Erdem Biyik; G. Rosman; Dorsa Sadigh
Influencing Towards Stable Multi-Agent Interactions
- DOI:
- Published: 2021-10
- Journal:
- Impact factor: 0
- Authors: Woodrow Z. Wang; Andy Shih; Annie Xie; Dorsa Sadigh
- Corresponding authors: Woodrow Z. Wang; Andy Shih; Annie Xie; Dorsa Sadigh
Other publications by Dorsa Sadigh
Repeated Interactions Convention Dependence HighLow ρi ρ 2 ρ 3 Rule representation Convention representation 4 player chess Friendly Rock Paper Scissors time gt gp
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Andy Shih; Arjun Sawhney; J. Kondic; Stefano Ermon; Dorsa Sadigh
- Corresponding author: Dorsa Sadigh
Altruistic Autonomy: Beating Congestion on Shared Roads
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Erdem Biyik; Daniel A. Lazar; Ramtin Pedarsani; Dorsa Sadigh
- Corresponding author: Dorsa Sadigh
Shared Autonomy for Robotic Manipulation with Language Corrections
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Siddharth Karamcheti; Raj Palleti; Yuchen Cui; Percy Liang; Dorsa Sadigh
- Corresponding author: Dorsa Sadigh
Deep Local Trajectory Replanning and Control for Robot Navigation
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Ashwini Pokle; Roberto Martín; P. Goebel; Vincent Chow; H. Ewald; Junwei Yang; Zhenkai Wang; Amir Sadeghian; Dorsa Sadigh; S. Savarese; Marynel Vázquez
- Corresponding author: Marynel Vázquez
Human-robot interaction for truck platooning using hierarchical dynamic games
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Elis Stefansson; J. Fisac; Dorsa Sadigh; S. S. Sastry; Karl H. Johansson
- Corresponding author: Karl H. Johansson
Other grants of Dorsa Sadigh
Collaborative Research: CPS: Small: Risk-Aware Planning and Control for Safety-Critical Human-CPS
- Award number: 2218760
- Fiscal year: 2022
- Funding amount: $1.1142 million
- Project type: Standard Grant
NRI/Collaborative Research: Robot-Assisted Feeding: Towards Efficient, Safe, and Personalized Caregiving Robots
- Award number: 2132847
- Fiscal year: 2022
- Funding amount: $1.1142 million
- Project type: Standard Grant
Collaborative Research: Mixed-Autonomy Traffic Networks: Routing Games and Learning Human Choice Models
- Award number: 1953032
- Fiscal year: 2020
- Funding amount: $1.1142 million
- Project type: Standard Grant
CHS: Small: Learning and Leveraging Conventions in Human-Robot Interaction
- Award number: 2006388
- Fiscal year: 2020
- Funding amount: $1.1142 million
- Project type: Standard Grant
CAREER: Safe and Influencing Interactions for Human-Robot Systems
- Award number: 1941722
- Fiscal year: 2020
- Funding amount: $1.1142 million
- Project type: Continuing Grant
CRII: RI: Active Learning of Preferences for Human-Aware Autonomy
- Award number: 1849952
- Fiscal year: 2019
- Funding amount: $1.1142 million
- Project type: Standard Grant
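Similar grants from the National Natural Science Foundation of China
Study of plasmon-enhanced optical response in composite low-dimensional topological materials
- Award number: 12374288
- Year approved: 2023
- Funding amount: CNY 520,000
- Project type: General Program
Disappearing medium-sized enterprises from the perspective of the managerial market and intervention in the division of labor: stylized facts, internal mechanisms, and optimization paths
- Award number: 72374217
- Year approved: 2023
- Funding amount: CNY 410,000
- Project type: General Program
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
- Award number: 12371432
- Year approved: 2023
- Funding amount: CNY 435,000
- Project type: General Program
Dark matter distribution near intermediate-mass black holes and gravitational-wave echo detection in IMRI systems
- Award number: 12365008
- Year approved: 2023
- Funding amount: CNY 320,000
- Project type: Regional Science Fund Project
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
- Award number: 42305004
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund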
Similar overseas grants
RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
- Award number: 2327438
- Fiscal year: 2024
- Funding amount: $1.1142 million
- Project type: Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
- Award number: 2344489
- Fiscal year: 2024
- Funding amount: $1.1142 million
- Project type: Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
- Award number: 2402836
- Fiscal year: 2024
- Funding amount: $1.1142 million
- Project type: Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
- Award number: 2402851
- Fiscal year: 2024
- Funding amount: $1.1142 million
- Project type: Continuing Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Funding amount: $1.1142 million
- Project type: Standard Grant