Distributionally Robust Adaptive Control: Enabling Safe and Robust Reinforcement Learning

Project Abstract

Data-driven algorithms can autonomously control complex systems like autonomous cars and drones. However, the use of such powerful algorithms remains relegated primarily to controlled laboratory environments. The main reason for the minimal adoption of data-driven methods for safety-critical systems is the difficulty one encounters when attempting to establish safety and predictability guarantees as one would do with well-established control theoretical methods. This award supports fundamental research to identify the best methodologies to consolidate data-driven and control-theoretic tools so that the overall methodology is safe, robust, and high-performing. The new approach lifts control tools to speak the same language as the data-driven methods. In doing so, the performance of the data-driven methods is not compromised, and yet, the safety guarantees of control-theoretic tools can be constructed. Safe and predictable autonomous operation of complex systems can bring immense socio-economic benefits through its application in medical robotics, autonomous logistics, transportation, and extra-terrestrial exploration, to name a few. This research involves multiple disciplines, including robotics, control theory, statistical learning, and mathematics. The cross-disciplinary nature will assist underrepresented groups' broader participation in STEM and impact engineering education.

To adopt data-driven methods that rely on reinforcement learning (RL) algorithms in safety-critical systems, we need guarantees on safety and robustness. Robust and adaptive control methodologies developed for classical systems with parametric uncertainties cannot be used directly in conjunction with RL because the latter operates on data-driven models for which identifying parametric and deterministic uncertainties is difficult, if not impossible. This research will construct a new class of robust adaptive controllers that are robust to errors in the learned distributions, thus allowing RL algorithms to directly interact with these controllers without further restrictions. Due to robustness at the level of distributions, notions of risk-aware safety can be included in a straightforward manner. This research will first aim to construct controllers that track temporally evolving state distributions with uniform bounds. Then, the epistemic uncertainties will be introduced with a novel adaptive control scheme to quantifiably control the effect of the uncertainties in the space of distributions. The results produced through this effort will bring the two distinct worlds of data-driven control and classical control together at a natural intersection point where trajectories of distributions, not of sample paths, are considered.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions

Other publications by Naira Hovakimyan

Three-dimensional coordinated path-following control for second-order multi-agent networks
  • DOI:
    10.1016/j.jfranklin.2015.01.020
  • Publication year:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zongyu Zuo; Venanzio Cichella; Ming Xu; Naira Hovakimyan
  • Corresponding author:
    Naira Hovakimyan
FlipDyn in Graphs: Resource Takeover Games in Graphs
  • DOI:
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sandeep Banik; Shaunak D. Bopardikar; Naira Hovakimyan
  • Corresponding author:
    Naira Hovakimyan

Other grants awarded to Naira Hovakimyan

Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
  • Award number:
    2331878
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
NSF-AoF: RI: Small: Safe Reinforcement Learning in Non-Stationary Environments With Fast Adaptation and Disturbance Prediction
  • Award number:
    2133656
  • Fiscal year:
    2021
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
NRI: INT: COLLAB: Synergetic Drone Delivery Network in Metropolis
  • Award number:
    1830639
  • Fiscal year:
    2018
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
CPS: Medium: Collaborative Research: Against Coordinated Cyber and Physical Attacks: Unified Theory and Technologies
  • Award number:
    1739732
  • Fiscal year:
    2017
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
NRI: Collaborative Research: ASPIRE: Automation Supporting Prolonged Independent Residence for the Elderly
  • Award number:
    1528036
  • Fiscal year:
    2015
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
EAGER: Human centered robotic system design
  • Award number:
    1548409
  • Fiscal year:
    2015
  • Award amount:
    $375,000
  • Award type:
    Standard Grant

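Similar NSFC grants

Molecular mechanism of the algae-promoting effect produced by phosphonate degradation by symbiotic bacteria of Amphidinium carterae
  • Award number:
    42306167
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Research on a new method for covert active underwater detection based on composite coded pulse trains
  • Award number:
    61271414
  • Approval year:
    2012
  • Award amount:
    CNY 600,000
  • Award type:
    General Program
Research on semidefinite relaxation and nonconvex quadratically constrained quadratic programming
  • Award number:
    11271243
  • Approval year:
    2012
  • Award amount:
    CNY 600,000
  • Award type:
    General Program
Analysis and design of efficient and robust message authentication codes
  • Award number:
    61202422
  • Approval year:
    2012
  • Award amount:
    CNY 230,000
  • Award type:
    Young Scientists Fund
Research on several problems in network revenue management for civil aviation passenger transport
  • Award number:
    60776817
  • Approval year:
    2007
  • Award amount:
    CNY 200,000
  • Award type:
    Joint Fund Program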

Similar overseas grants

VIPAuto: Robust and Adaptive Visual Perception for Automated Vehicles in Complex Dynamic Scenes
  • Award number:
    EP/Y015878/1
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Award type:
    Fellowship
CAREER: Enabling Robust and Adaptive Architectures through a Decoupled Security-Centric Hardware/Software Stack
  • Award number:
    2238548
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Continuing Grant
Developing robust and scalable genomics tools and databases to analyze immune receptor repertoires across diverse populations
  • Award number:
    10910354
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
CAREER: Risk-Based Methods for Robust, Adaptive, and Equitable Flood Risk Management in a Changing Climate
  • Award number:
    2238060
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Standard Grant
Exploiting Geometries of Learning for Fast, Adaptive and Robust AI
  • Award number:
    DP230101176
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Award type:
    Discovery Projects