CPS: Medium: Collaborative Research: Provably Safe and Robust Multi-Agent Reinforcement Learning with Applications in Urban Air Mobility

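Basic Information

  • Award number:
    2312094
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Project period:
    2023-06-01 to 2026-05-31
  • Project status:
    Ongoing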

Project Abstract

This Cyber-Physical Systems (CPS) project aims to design theories and algorithms for scalable multi-agent planning and control that support safety-critical autonomous eVTOL aircraft in high-throughput, uncertain, and dynamic environments. Urban Air Mobility (UAM) is an emerging air transportation mode in which electric vertical take-off and landing (eVTOL) aircraft will safely and efficiently transport passengers and cargo within urban areas. Guidance from the White House, the National Academy of Engineering, and the US Congress has encouraged fundamental research in UAM to maintain US global leadership in this field. The success of UAM will depend on safe and robust multi-agent autonomy that can scale operations up to high-throughput urban air traffic. Learning-based techniques such as deep reinforcement learning and multi-agent reinforcement learning are being developed to support planning and control for these eVTOL vehicles. However, providing theoretical safety and robustness guarantees for these learning-based, neural-network-in-the-loop models in multi-agent autonomous UAM applications remains a major challenge. In this project, the researchers will collaborate with committed government and industry partners on use-case-inspired fundamental research, with a focus on promoting the safety and reliability of AI, machine learning, and autonomy among students with diverse backgrounds.
The technical objectives of this project are: (1) Safety and Robustness of Single-Agent Reinforcement Learning: to address the “safety critical” UAM challenge, the PIs plan min-max optimization for single-agent reinforcement learning to formally build a sufficient safety margin, constrained reinforcement learning to formulate safety as physical constraints in the state and action spaces, and a novel cautious reinforcement learning method that uses variational policy gradient to plan the safest aircraft trajectory with minimum distributional risk; (2) Safety and Robustness of Multi-Agent Reinforcement Learning: to address the “heterogeneous agents and scalability” challenge, the PIs plan a novel federated reinforcement learning framework in which a central agent coordinates with decentralized safe agents to improve traffic throughput while guaranteeing safety, together with a scaling mechanism to accommodate a varying number of decentralized aircraft; (3) Safety and Robustness from Simulations to the Real World: to address the “high-dimensionality and environment uncertainty” challenge, the researchers will focus on the robustness of the agents' policies under distribution shift and on fast adaptation from simulation to the real world. Specifically, they plan value-targeted model learning that incorporates domain knowledge such as aircraft and environment physics, and a safe adaptation mechanism for use after the RL model is deployed online for flight testing or execution.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
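As a rough illustration of the first technical objective, the min-max and constrained formulations it names are often written in the following textbook form. This is only a sketch under stated assumptions, not the project's actual formulation: the symbols (policy \pi, an uncertainty set \mathcal{P} of transition models, reward r, safety cost c, discount factor \gamma, and cost budget d) are introduced here for illustration and do not appear in the abstract.

```latex
% Robust (min-max) RL: maximize the return under the worst-case
% transition model P drawn from an assumed uncertainty set \mathcal{P}.
\max_{\pi} \; \min_{P \in \mathcal{P}} \;
  \mathbb{E}_{\pi, P}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]

% Constrained RL: maximize the return while keeping the expected
% discounted safety cost below an assumed budget d.
\max_{\pi} \; \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

In the first form the safety margin comes from planning against the worst model in \mathcal{P}; in the second, safety is an explicit constraint rather than a penalty folded into the reward.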

Project Outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits

Other publications by Quanquan Gu

Mean-Field Analysis of Two-Layer Neural Networks: Non-Asymptotic Rates and Generalization Bounds
  • DOI:
  • Publication date:
    2020-02-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zixiang Chen; Yuan Cao; Quanquan Gu; Tong Zhang
  • Corresponding author:
    Tong Zhang
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
  • DOI:
  • Publication date:
    2022-02-28
  • Journal:
  • Impact factor:
    0
  • Authors:
    Heyang Zhao; Dongruo Zhou; Jiafan He; Quanquan Gu
  • Corresponding author:
    Quanquan Gu
Pure Exploration in Asynchronous Federated Bandits
  • DOI:
    10.48550/arxiv.2310.11015
  • Publication date:
    2023-10-17
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zichen Wang; Chuanhao Li; Chenyu Song; Lianghui Wang; Quanquan Gu; Huazheng Wang
  • Corresponding author:
    Huazheng Wang
Differentially Private Hypothesis Transfer Learning
  • DOI:
    10.1007/978-3-030-10928-8_48
  • Publication date:
    2018-09-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yang Wang; Quanquan Gu; Donald E. Brown
  • Corresponding author:
    Donald E. Brown
Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow
  • DOI:
  • Publication date:
    2018-03-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiao Zhang; S. Du; Quanquan Gu
  • Corresponding author:
    Quanquan Gu

Other grants by Quanquan Gu

Collaborative Research: Towards the Foundation of Approximate Sampling-Based Exploration in Sequential Decision Making
  • Award number:
    2323113
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
III: Small: Towards the Foundations of Training Deep Neural Networks: New Theory and Algorithms
  • Award number:
    2008981
  • Fiscal year:
    2020
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant
CIF: Small: Collaborative Research: Rank Aggregation with Heterogeneous Information Sources: Efficient Algorithms and Fundamental Limits
  • Award number:
    1911168
  • Fiscal year:
    2019
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Taming Big Networks via Embedding
  • Award number:
    1741342
  • Fiscal year:
    2018
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
BIGDATA: F: Collaborative Research: Taming Big Networks via Embedding
  • Award number:
    1855099
  • Fiscal year:
    2018
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
CAREER: Scaling Up Knowledge Discovery in High-Dimensional Data Via Nonconvex Statistical Optimization
  • Award number:
    1906169
  • Fiscal year:
    2018
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant
III: Small: Collaborative Research: High-Dimensional Machine Learning Methods for Personalized Cancer Genomics
  • Award number:
    1903202
  • Fiscal year:
    2018
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant
III: Small: Collaborative Learning with Incomplete and Noisy Knowledge
  • Award number:
    1904183
  • Fiscal year:
    2018
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
CAREER: Scaling Up Knowledge Discovery in High-Dimensional Data Via Nonconvex Statistical Optimization
  • Award number:
    1652539
  • Fiscal year:
    2017
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant
III: Small: Collaborative Research: High-Dimensional Machine Learning Methods for Personalized Cancer Genomics
  • Award number:
    1717206
  • Fiscal year:
    2017
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant

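Similar NSFC (National Natural Science Foundation of China) grants

Quantum surface plasmons of medium-sized metal nanoparticles studied with machine learning and classical electrodynamics
  • Award number:
    22373002
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Project type:
    General Program
Development of a source apportionment method for atmospheric semi-/intermediate-volatility organic compounds based on volatility distribution and oxidation correction
  • Award number:
    42377095
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Project type:
    General Program
Dark matter distribution near intermediate-mass black holes and detection of gravitational-wave echoes from their IMRI systems
  • Award number:
    12365008
  • Approval year:
    2023
  • Award amount:
    CNY 320,000
  • Project type:
    Regional Science Fund Program
Plasmon-enhanced optical response in composite low-dimensional topological materials
  • Award number:
    12374288
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Project type:
    General Program
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
  • Award number:
    42305004
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund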

Similar overseas grants

Collaborative Research: CPS: Medium: Automating Complex Therapeutic Loops with Conflicts in Medical Cyber-Physical Systems
  • Award number:
    2322534
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CPS: Medium: Automating Complex Therapeutic Loops with Conflicts in Medical Cyber-Physical Systems
  • Award number:
    2322533
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CPS: Medium: Sensor Attack Detection and Recovery in Cyber-Physical Systems
  • Award number:
    2333980
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CPS: Medium: Co-Designed Control and Scheduling Adaptation for Assured Cyber-Physical System Safety and Performance
  • Award number:
    2229136
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CPS: Medium: Co-Designed Control and Scheduling Adaptation for Assured Cyber-Physical System Safety and Performance
  • Award number:
    2229290
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Project type:
    Standard Grant