Distributionally Robust Adaptive Control: Enabling Safe and Robust Reinforcement Learning
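Basic Information
- Award number: 2135925
- Principal investigator:
- Amount: $375,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-07-01 to 2025-06-30
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract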
Data-driven algorithms can autonomously control complex systems like autonomous cars and drones. However, the use of such powerful algorithms remains relegated primarily to controlled laboratory environments. The main reason for the minimal adoption of data-driven methods for safety-critical systems is the difficulty one encounters when attempting to establish safety and predictability guarantees as one would do with well-established control theoretical methods. This award supports fundamental research to identify the best methodologies to consolidate data-driven and control-theoretic tools so that the overall methodology is safe, robust, and high-performing. The new approach lifts control tools to speak the same language as the data-driven methods. In doing so, the performance of the data-driven methods is not compromised, and yet, the safety guarantees of control-theoretic tools can be constructed. Safe and predictable autonomous operation of complex systems can bring immense socio-economic benefits through its application in medical robotics, autonomous logistics, transportation, and extra-terrestrial exploration, to name a few. This research involves multiple disciplines, including robotics, control theory, statistical learning, and mathematics. The cross-disciplinary nature will assist underrepresented groups' broader participation in STEM and impact engineering education.
To adopt data-driven methods that rely on reinforcement learning (RL) algorithms in safety-critical systems, we need guarantees on safety and robustness. Robust and adaptive control methodologies developed for classical systems with parametric uncertainties cannot be used directly in conjunction with RL because the latter operates on data-driven models for which identifying parametric and deterministic uncertainties is difficult, if not impossible. This research will construct a new class of robust adaptive controllers that are robust to errors in the learned distributions, thus allowing RL algorithms to directly interact with these controllers without further restrictions. Due to robustness at the level of distributions, notions of risk-aware safety can be included in a straightforward manner. This research will first aim to construct controllers that track temporally evolving state distributions with uniform bounds. Then, the epistemic uncertainties will be introduced with a novel adaptive control scheme to quantifiably control the effect of the uncertainties in the space of distributions. The results produced through this effort will bring the two distinct worlds of data-driven control and classical control together at a natural intersection point where trajectories of distributions, not of sample paths, are considered.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Safe and Efficient Reinforcement Learning using Disturbance-Observer-Based Control Barrier Functions
- DOI:
- Publication date: 2022-11
- Journal:
- Impact factor: 0
- Authors: Yikun Cheng; Pan Zhao; N. Hovakimyan
- Corresponding authors: Yikun Cheng; Pan Zhao; N. Hovakimyan
Other publications by Naira Hovakimyan
Three-dimensional coordinated path-following control for second-order multi-agent networks
- DOI: 10.1016/j.jfranklin.2015.01.020
- Publication date: 2015-09
- Journal:
- Impact factor: 0
- Authors: Zongyu Zuo; Venanzio Cichella; Ming Xu; Naira Hovakimyan
- Corresponding author: Naira Hovakimyan
FlipDyn in Graphs: Resource Takeover Games in Graphs
- DOI:
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Sandeep Banik; Shaunak D. Bopardikar; Naira Hovakimyan
- Corresponding author: Naira Hovakimyan
Other grants awarded to Naira Hovakimyan
Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
- Award number: 2331878
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Standard Grant
NSF-AoF: RI: Small: Safe Reinforcement Learning in Non-Stationary Environments With Fast Adaptation and Disturbance Prediction
- Award number: 2133656
- Fiscal year: 2021
- Funding amount: $375,000
- Project type: Standard Grant
NRI: INT: COLLAB: Synergetic Drone Delivery Network in Metropolis
- Award number: 1830639
- Fiscal year: 2018
- Funding amount: $375,000
- Project type: Standard Grant
CPS: Medium: Collaborative Research: Against Coordinated Cyber and Physical Attacks: Unified Theory and Technologies
- Award number: 1739732
- Fiscal year: 2017
- Funding amount: $375,000
- Project type: Standard Grant
NRI: Collaborative Research: ASPIRE: Automation Supporting Prolonged Independent Residence for the Elderly
- Award number: 1528036
- Fiscal year: 2015
- Funding amount: $375,000
- Project type: Standard Grant
EAGER: Human centered robotic system design
- Award number: 1548409
- Fiscal year: 2015
- Funding amount: $375,000
- Project type: Standard Grant
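Similar NSFC (National Natural Science Foundation of China) Grants
Molecular mechanism of the algae-promoting effect produced by phosphonate degradation by symbiotic bacteria of Amphidinium carterae
- Award number: 42306167
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Analysis and design of efficient and robust message authentication codes
- Award number: 61202422
- Approval year: 2012
- Funding amount: CNY 230,000
- Project type: Young Scientists Fund
Research on semidefinite relaxation and nonconvex quadratically constrained quadratic programming
- Award number: 11271243
- Approval year: 2012
- Funding amount: CNY 600,000
- Project type: General Program
Research on new methods for underwater active covert detection based on composite coded pulse trains
- Award number: 61271414
- Approval year: 2012
- Funding amount: CNY 600,000
- Project type: General Program
Research on several problems in network revenue management for civil aviation passenger transport
- Award number: 60776817
- Approval year: 2007
- Funding amount: CNY 200,000
- Project type: Joint Fund Project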
Similar Overseas Grants
VIPAuto: Robust and Adaptive Visual Perception for Automated Vehicles in Complex Dynamic Scenes
- Award number: EP/Y015878/1
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Fellowship
CAREER: Risk-Based Methods for Robust, Adaptive, and Equitable Flood Risk Management in a Changing Climate
- Award number: 2238060
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Standard Grant
CAREER: Enabling Robust and Adaptive Architectures through a Decoupled Security-Centric Hardware/Software Stack
- Award number: 2238548
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Continuing Grant
Commensal bacteria as vehicles for robust mucosal vaccination against lung pathogens
- Award number: 10749817
- Fiscal year: 2023
- Funding amount: $375,000
- Project type:
Exploiting Geometries of Learning for Fast, Adaptive and Robust AI
- Award number: DP230101176
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Discovery Projects