CAREER: Stochasticity and Resilience in Reinforcement Learning: From Single to Multiple Agents
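Basic information
- Award number: 2339794
- Principal investigator:
- Amount: $532,900
- Institution:
- Institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-03-01 to 2029-02-28
- Status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract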
Reinforcement Learning (RL) has emerged as a promising data-driven paradigm for learning to control unknown and complex systems. It has achieved impressive success in simulated environments such as games. However, for applications in real-world engineering systems, existing RL algorithms and theory fall short of addressing three fundamental challenges: high stochasticity, long-horizon regimes and vulnerability to model uncertainty. These challenges are exacerbated in systems with multiple strategic agents. The goal of this CAREER project is to advance the algorithmic and theoretical foundations of RL by addressing these challenges, and enable efficient and resilient RL-based control in engineering systems. This project will particularly focus on applications in computer and communication networks, which will guide the problem formulation, methodology development and evaluation. The project is enhanced by an education plan that aims to offer students from K–12 to college a pathway to obtain experience and training in RL and broadly machine learning, as well as in their applications in engineering systems. This project will also support a mentoring program for students from underrepresented groups in STEM.
The research work in this project will address the aforementioned challenges via three technical thrusts. Thrust 1 studies finite-time convergence of various iterative algorithms that arise in RL through the unified variational inequality framework, by leveraging tools from modern Markov chain theory. In Thrust 2, we will develop techniques to tame the high stochasticity in long-horizon problems, and further develop RL algorithms that provably learn a stable and near-optimal policy. Thrust 3 studies scalable multi-agent RL through the framework of mean-field game and graphon game, as well as the game theoretical foundation of robust Markov games under model uncertainty. The developed RL algorithms will be implemented and evaluated in a broad profile of decision-making problems in computer and communication networks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Qiaomin Xie
On Reinforcement Learning Using Monte Carlo Tree Search with Supervised Learning: Non-Asymptotic Analysis
- DOI:
- Year: 2019
- Journal:
- Impact factor: 0
- Authors: Devavrat Shah; Qiaomin Xie; Zhi Xu
- Corresponding author: Zhi Xu
On Faking a Nash Equilibrium
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Young Wu; Jeremy McMahan; Xiaojin Zhu; Qiaomin Xie
- Corresponding author: Qiaomin Xie
Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Yixuan Zhang; D. Huo; Yudong Chen; Qiaomin Xie
- Corresponding author: Qiaomin Xie
Optimal Attack and Defense for Reinforcement Learning
- DOI: 10.1609/aaai.v38i13.29346
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Jeremy McMahan; Young Wu; Xiaojin Zhu; Qiaomin Xie
- Corresponding author: Qiaomin Xie
Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption
- DOI:
- Year:
- Journal:
- Impact factor: 0
- Authors: Yiding Chen; Xuezhou Zhang; Qiaomin Xie; Xiaojin Zhu; UW
- Corresponding author: UW
Other grants by Qiaomin Xie
Travel: Student Travel Grant for the 2024 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
- Award number: 2412676
- Fiscal year: 2024
- Amount: $532,900
- Award type: Standard Grant
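Similar National Natural Science Foundation of China (NSFC) grants
Fracture mechanics models of biological materials accounting for microstructural randomness, and their biomimetic applications
- Award number: 12372325
- Year approved: 2023
- Amount: CNY 530,000
- Award type: General Program
Research on several fundamental problems in random functional analysis
- Award number: 12371141
- Year approved: 2023
- Amount: CNY 440,000
- Award type: General Program
Research on entropy-source randomness and nonlinear feedback in diffusive memristors for high-speed random number generation
- Award number: 62304015
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Research on a new non-intrusive stochastic finite element method for functionally graded plate structures accounting for multi-source randomness
- Award number: 12302255
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Theoretical research on problems related to irreversibility, noncommutativity, and randomness in Hamiltonian systems
- Award number: 12231010
- Year approved: 2022
- Amount: CNY 2,350,000
- Award type: Key Program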
Similar overseas grants
eMB: Collaborative Research: Stochasticity in ovarian aging and biotechnologies for menopause delay
- Award number: 2325259
- Fiscal year: 2023
- Amount: $532,900
- Award type: Standard Grant
Collaborative Research: BoCP-Design: US-Sao Paulo: The roles of stochasticity and spatial context in dynamics of functional diversity under global change
- Award number: 2225096
- Fiscal year: 2023
- Amount: $532,900
- Award type: Standard Grant
Collaborative Research: BoCP-Design: US-Sao Paulo: The roles of stochasticity and spatial context in dynamics of functional diversity under global change
- Award number: 2225098
- Fiscal year: 2023
- Amount: $532,900
- Award type: Standard Grant
eMB: Collaborative Research: Stochasticity in ovarian aging and biotechnologies for menopause delay
- Award number: 2325258
- Fiscal year: 2023
- Amount: $532,900
- Award type: Standard Grant
Integrative analysis of the stochasticity of single-cell omics data for predicting pioneerness of transcription factors
- Award number: 23K14165
- Fiscal year: 2023
- Amount: $532,900
- Award type: Grant-in-Aid for Early-Career Scientists