Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
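Basic information
- Award number: 2106778
- Principal investigator:
- Amount: $800,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-10-01 to 2025-09-30
- Project status: Ongoing
- Source:
- Keywords: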
Project abstract
As a data-driven paradigm for sequential decision making in unknown environments, Reinforcement Learning (RL) has received significant interest in recent years owing to its potential ability to solve difficult problems associated with future societal and scientific developments. However, the explosion of both model dimensionality and complexity in current and emerging applications exacerbates the challenge of achieving efficient RL in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes, e.g., in clinical trials, online advertising, and autonomous systems. As a result, understanding and improving the sample and computational efficiencies of RL algorithms, sometimes under additional resource and system-level constraints, are rightly understood as critical to the successful deployment of RL in the future. In this project, the PIs are involving students at all levels and with diverse backgrounds in Electrical and Computer Engineering and in Statistics, developing education modules on RL to enrich the curriculum, and co-organizing workshops and outreach activities to enable the broader dissemination of the project outcomes.
Despite decades-long research efforts, the statistical and computational underpinnings of RL are still far from being well understood, especially when it comes to finite-sample and finite-time issues, which are of crucial operational value. This research project is bridging the theory-practice gap of modern algorithmic approaches to RL. It is doing so by (i) characterizing fundamental limits for the sample and computational complexities in various RL settings, (ii) developing performance guarantees and uncertainty quantification schemes, and (iii) designing new computationally efficient algorithms that are provably near-optimal in terms of sample complexity in both single-agent and multi-agent settings. The expected outcomes will enable the trustworthy adoption of RL algorithms in sample-starved environments. The complementary expertise of the research team is being leveraged to enrich the statistical and algorithmic foundations of RL through model-, policy-, and value-based approaches. New efficient algorithms that rely on function approximation schemes are being developed in order to address the curse of dimensionality; the resulting techniques are intended to lead to non-asymptotic analysis tools that deal with the complicated statistical dependencies present in RL. This rich research agenda is expected to foster multidisciplinary efforts at the intersection of high-dimensional statistics, non-convex optimization, control theory, information theory, and machine learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (18)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning
- DOI:
- Published: 2021-02
- Journal:
- Impact factor: 0
- Authors: Gen Li; Changxiao Cai; Yuxin Chen; Yuantao Gu; Yuting Wei; Yuejie Chi
- Corresponding authors: Gen Li; Changxiao Cai; Yuxin Chen; Yuantao Gu; Yuting Wei; Yuejie Chi
Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization
- DOI:
- Published: 2021-05
- Journal:
- Impact factor: 0
- Authors: Shicong Cen; Yuting Wei; Yuejie Chi
- Corresponding authors: Shicong Cen; Yuting Wei; Yuejie Chi
Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization
- DOI: 10.1287/opre.2021.2151
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Shicong Cen; Chen Cheng; Yuxin Chen; Yuting Wei; Yuejie Chi
- Corresponding authors: Shicong Cen; Chen Cheng; Yuxin Chen; Yuting Wei; Yuejie Chi
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
- DOI: 10.1137/21m1456789
- Published: 2021-05
- Journal:
- Impact factor: 0
- Authors: Wenhao Zhan; Shicong Cen; Baihe Huang; Yuxin Chen; Jason D. Lee; Yuejie Chi
- Corresponding authors: Wenhao Zhan; Shicong Cen; Baihe Huang; Yuxin Chen; Jason D. Lee; Yuejie Chi
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
- DOI: 10.48550/arxiv.2210.01050
- Published: 2022-10
- Journal:
- Impact factor: 0
- Authors: Shicong Cen; Yuejie Chi; S. Du; Lin Xiao
- Corresponding authors: Shicong Cen; Yuejie Chi; S. Du; Lin Xiao
Other publications by Yuejie Chi
Settling the Sample Complexity of Model-Based Offline Reinforcement Learning
- DOI: 10.48550/arxiv.2204.05275
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Gen Li; Laixi Shi; Yuxin Chen; Yuejie Chi; Yuting Wei
- Corresponding author: Yuting Wei
Regularized blind detection for MIMO communications
- DOI: 10.1109/isit.2010.5513407
- Published: 2010
- Journal:
- Impact factor: 0
- Authors: Yuejie Chi; Yiyue Wu; A. Calderbank
- Corresponding author: A. Calderbank
Memory-Limited Stochastic Approximation for Poisson Subspace Tracking
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Liming Wang; Yuejie Chi
- Corresponding author: Yuejie Chi
Principal subspace estimation for low-rank Toeplitz covariance matrices with binary sensing
- DOI: 10.1109/acssc.2016.7869594
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: H. Fu; Yuejie Chi
- Corresponding author: Yuejie Chi
Support Stability of Spike Deconvolution via Total Variation Minimization
- DOI: 10.1109/ciss48834.2020.1570627765
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Maxime Ferreira Da Costa; Yuejie Chi
- Corresponding author: Yuejie Chi
Other grants held by Yuejie Chi
Federated Optimization over Bandwidth-Limited Heterogeneous Networks
- Award number: 2318441
- Fiscal year: 2023
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: Towards a Theoretic Foundation for Optimal Deep Graph Learning
- Award number: 2134080
- Fiscal year: 2022
- Amount: $800,000
- Project type: Continuing Grant
NSF Student Travel Grant for the Fifth Conference on Machine Learning and Systems (MLSys 2022)
- Award number: 2219655
- Fiscal year: 2022
- Amount: $800,000
- Project type: Standard Grant
Taming Nonlinear Inverse Problems: Theory and Algorithms
- Award number: 2126634
- Fiscal year: 2021
- Amount: $800,000
- Project type: Standard Grant
CIF: Small: Resource-Efficient Statistical Inference in Networked Environments
- Award number: 2007911
- Fiscal year: 2020
- Amount: $800,000
- Project type: Standard Grant
CIF: Medium: Collaborative Research: Theory of Optimization Geometry and Algorithms for Neural Networks
- Award number: 1901199
- Fiscal year: 2019
- Amount: $800,000
- Project type: Standard Grant
EAGER-DynamicData: Subspace Learning From Binary Sensing
- Award number: 1833553
- Fiscal year: 2018
- Amount: $800,000
- Project type: Standard Grant
CIF: Small: Inverse Methods for Parametric Mixture Models
- Award number: 1826519
- Fiscal year: 2018
- Amount: $800,000
- Project type: Standard Grant
CIF: Medium: Collaborative Research: Nonconvex Optimization for High-Dimensional Signal Estimation: Theory and Fast Algorithms
- Award number: 1806154
- Fiscal year: 2018
- Amount: $800,000
- Project type: Continuing Grant
CAREER: Robust Methods for High-Dimensional Signal Processing under Geometric Constraints
- Award number: 1818571
- Fiscal year: 2018
- Amount: $800,000
- Project type: Standard Grant
Similar NSFC (National Natural Science Foundation of China) grants
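High-precision single-molecule measurement methods based on FRET acceptor rise time
- Award number: 22304184
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund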
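Construction of lipopolyplex mRNA nanovaccines and their use in antitumor therapy
- Award number: 52373161
- Approval year: 2023
- Amount: CNY 500,000
- Project type: General Program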
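A barrier-penetrating in situ mitochondrial gene delivery system for treating Leber hereditary optic neuropathy
- Award number: 82304416
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund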
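Mechanism by which cell stiffness mediates mechanical crosstalk between oral squamous cell carcinoma cells and CD8+ T cells to regulate immune killing
- Award number: 82373255
- Approval year: 2023
- Amount: CNY 480,000
- Project type: General Program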
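Mechanism by which Acinetobacter calcoaceticus upregulates DUOX2 to activate PERK/ATF4 endoplasmic reticulum stress in inflammatory bowel disease
- Award number: 82300623
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund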
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Amount: $800,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
- Award number: 2326905
- Fiscal year: 2024
- Amount: $800,000
- Project type: Standard Grant