Collaborative Research: SLES: Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms, and Experiments

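Basic information

  • Award number:
    2331782
  • Principal investigator:
  • Amount:
    $375,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Period:
    2023-10-01 to 2027-09-30
  • Project status:
    Ongoing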

Project abstract

Reinforcement learning (RL), with its success in automation and robotics, has been widely viewed as one of the most important technologies for next-generation, learning-enabled systems. For example, 6G networking systems, autonomous driving, digital healthcare, and smart cities are all enabled by RL. However, despite significant advances over the last few decades, a major obstacle to applying RL in practice is the lack of “safety” guarantees such as robustness, resilience to tail risks, and operational constraints. This is because traditional RL aims only at maximizing cumulative reward. While it is possible to add penalties to the rewards in a traditional RL algorithm to discourage unsafe actions, many safety constraints, such as chance constraints, cannot simply be treated as penalties. This project develops foundational technologies for safe RL-enabled systems based on Distributional Reinforcement Learning (DRL), which learns the optimal policy. While developing the foundation of DRL for safe learning-enabled systems, the team integrates research and education by incorporating new theories and algorithms developed in this project into its graduate-level courses. All team members regularly supervise undergraduate students and students from underrepresented groups. The team continues to leverage Women's Place at Ohio State University and the Women in Science and Engineering Program at Arizona State University to broaden the participation of women students and researchers.
This project focuses on a comprehensive approach to the end-to-end safety of DRL-enabled systems. End-to-end safety includes (i) policy safety: learning a safe policy that avoids catastrophic outcomes (corresponds to risk-sensitive RL); (ii) exploration safety: learning a safe policy safely by avoiding dangerous actions during exploration and learning (corresponds to online RL); and (iii) environmental safety: learning a policy that is robust to parametric uncertainty (environment change). The project includes four thrusts. Thrust 1 (Foundation of constrained DRL) aims to establish theoretical foundations of risk-sensitive constrained DRL and focuses on policy and environmental safety. Thrust 2 (Online constrained DRL) considers safe online learning and decision-making and focuses on exploration safety and environmental safety when learning a safe DRL policy. Thrust 3 (Physics-Enhanced constrained DRL) exploits physics to enhance end-to-end safety. These three foundational thrusts are interdependent, but each focuses on a unique aspect of safe RL-enabled systems and addresses multiple safety notions. The fourth thrust will provide comprehensive validation with both high-fidelity simulations and real-world experiments using unmanned aerial vehicles.
This research is supported by a partnership between the National Science Foundation and Open Philanthropy.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Xian Yu

VCApather: A Network as a Service Solution for Video Conference Applications
Electroencephalogram signal analysis based on the improved k-nearest neighbor network
Characterisation of a novel extended-spectrum beta-lactamase, SHV-70, from a clinical isolate of Enterobacter cloacae in China.
Treatment of hepatitis C virus infection in people with opioid use disorder: a real-world study of elbasvir/grazoprevir in a US Department of Veterans Affairs population
Study Advances and Prospect of High-quality Bean Sprouts

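Similar grants from the National Natural Science Foundation of China (NSFC)

Mechanism by which lipid synthesis regulates cell membrane homeostasis to mediate ethanol stress tolerance in Bacillus velezensis
  • Award number:
    32372284
  • Approval year:
    2023
  • Amount:
    CNY 500,000
  • Award type:
    General Program
Mechanism by which a forward mutant strain of Trichoderma harzianum cooperates with Bacillus velezensis SQR9 in plant growth promotion and biocontrol
  • Award number:
    32302679
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Mechanism by which root iron leakage, mediated by the Bacillus velezensis type VII secretion system protein YukE, promotes its rhizosphere colonization
  • Award number:
    32370135
  • Approval year:
    2023
  • Amount:
    CNY 500,000
  • Award type:
    General Program
Mechanism by which mtrB gene regulation of tryptophan synthesis mediates ethanol stress tolerance in Bacillus velezensis
  • Award number:
    32302023
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Mechanism by which exogenous siderophores mediate Fur protein repression of endogenous siderophore synthesis in Bacillus velezensis
  • Award number:
    32272624
  • Approval year:
    2022
  • Amount:
    CNY 540,000
  • Award type:
    General Program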

Similar overseas grants

Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
  • Award number:
    2331878
  • Fiscal year:
    2024
  • Amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
  • Award number:
    2331879
  • Fiscal year:
    2024
  • Amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms, and Experiments
  • Award number:
    2331781
  • Fiscal year:
    2023
  • Amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Foundations of Qualitative and Quantitative Safety Assessment of Learning-enabled Systems
  • Award number:
    2331938
  • Fiscal year:
    2023
  • Amount:
    $375,000
  • Award type:
    Standard Grant
Collaborative Research: SLES: Bridging offline design and online adaptation in safe learning-enabled systems
  • Award number:
    2331880
  • Fiscal year:
    2023
  • Amount:
    $375,000
  • Award type:
    Standard Grant