CAREER: Reinforcement Learning-Based Control of Heterogeneous Multi-Agent Systems in Structured Environments: Algorithms and Complexity

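Basic Information

Award number:
2237830
Principal investigator:
Yi Zhou
Amount:
Approx. $541,000
Host institution:
University of Utah
Host institution country:
United States
Project type:
Continuing Grant
Fiscal year:
2023
Funding country:
United States
Start and end dates:
2023-07-01 to 2028-06-30
Project status:
Ongoing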

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2237830&HistoricalAwards=false
Keywords:
CAREER Reinforcement Learning Based Control

Project Abstract

Reinforcement learning (RL) is a popular framework for learning optimal decision-making in complex environments, and many RL algorithms have been developed to improve the decision-making of a single agent in normal environments. However, modern large-scale distributed learning applications usually involve multiple heterogeneous agents that interact with complex environments, making optimal decision-making fundamentally more challenging to learn. For example, when navigating multiple drones in an open area, the drones need to cooperate properly with each other and take environmental uncertainty into account. As another example, in distributed wireless networks, the interactions of the agents (e.g., base stations or mobile phones) are subject to heterogeneous constraints such as limits on power and bandwidth. This project aims to develop a resilient RL framework for managing heterogeneous multi-agent systems in complex environments, and to systematically design efficient multi-agent RL algorithms with comprehensive convergence and complexity analysis. The project will produce RL algorithm packages that are fully accessible to the public. The research activities will also generate positive educational impacts on undergraduate and graduate students. The materials developed by this project will be integrated into courses on machine learning and optimization, and will benefit interdisciplinary students majoring in electrical and computer engineering, statistics, and computer science. The project will actively involve underrepresented students and integrate research with education for undergraduate and graduate students in STEM. It will also produce introductory materials for K-12 students to be used in engineering summer research programs.

The overarching goal of this project is to develop a resilient RL framework for managing multi-agent systems that involve heterogeneous agents in complex and structured environments, and to systematically design scalable and computation-efficient RL algorithms with rigorous and comprehensive convergence and complexity analysis for managing such systems. The proposed research includes three major thrusts. First, to manage cooperative agents with heterogeneous constraints in various types of structured environments (e.g., environments with homogeneity or uncertainty), the environment model structure will be leveraged to develop fully decentralized policy optimization algorithms with convergence and complexity analysis. Second, to manage competitive agents with heterogeneous constraints in uncertain environments, new tractable notions of constrained and robust equilibrium will be proposed. Their fundamental structures and properties will be studied, based on which fully decentralized primal-dual type policy optimization algorithms and robust value-based algorithms with convergence guarantees will be developed. Lastly, to improve the generalizability of agents' policies across heterogeneous environments, a new assistive RL framework will be developed that can substantially enhance generalizability using a few rounds of information exchange without data sharing. These RL algorithms will be applied to learn resilient and optimal control policies for interference management in wireless networks and energy control in power networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Yi Zhou

A New Sequential Block Partial Update Normalized Least Mean M-Estimate Algorithm and Its Convergence Performance Analysis

DOI:
10.1109/isspit.2007.4458180
Year published:
2007
Venue:
2007 IEEE International Symposium on Signal Processing and Information Technology
Impact factor:
0
Authors:
S. Chan; Yi Zhou; K. Ho
Corresponding author:
K. Ho

Superoscillation focusing with suppressed sidebands by destructive interference

DOI:
10.1364/oe.474346
Year published:
2022
Journal:
Optics Express
Impact factor:
3.8
Authors:
Kun Zhang; Fengliang Dong; Shaokui Yan; Lihua Xu; Haifeng Hu; Zhiwei Song; Zhengguo Shang; Yi Zhou; Yufei Liu; Zhongquan Wen; Luru Dai; Weiguo Chu; Gang Chen
Corresponding author:
Gang Chen