Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning

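Basic Information

  • Award number:
    2106739
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Start and end dates:
    2021-10-01 to 2022-03-31
  • Project status:
    Completed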

Project Abstract

As a data-driven paradigm for sequential decision making in unknown environments, Reinforcement Learning (RL) has received significant interest in recent years owing to its potential to solve difficult problems associated with future societal and scientific developments. However, the explosion of both model dimensionality and complexity in current and emerging applications exacerbates the challenge of achieving efficient RL in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes, e.g., in clinical trials, online advertising, and autonomous systems. As a result, understanding and improving the sample and computational efficiencies of RL algorithms, sometimes under additional resource and system-level constraints, are rightly understood as critical to the successful deployment of RL in the future. In this project the PIs are involving students at all levels with diverse backgrounds in Electrical and Computer Engineering and in Statistics, are developing education modules on RL to enrich the curriculum, and are co-organizing workshops and outreach activities to enable the broader dissemination of the project outcomes.

Despite decades-long research efforts, the statistical and computational underpinnings of RL are still far from being well understood, especially when it comes to finite-sample and finite-time issues, which are of crucial operational value. This research project is bridging the theory-practice gap of modern algorithmic approaches to RL. It is doing so by (i) characterizing fundamental limits for the sample and computational complexities in various RL settings, (ii) developing performance guarantees and uncertainty quantification schemes, and (iii) designing new computationally efficient algorithms that are provably near-optimal in terms of sample complexity in both single-agent and multi-agent settings. The expected outcomes will enable the trustworthy adoption of RL algorithms in sample-starved environments. The complementary expertise of the research team is being leveraged to enrich the statistical and algorithmic foundations of RL through model-, policy-, and value-based approaches. New efficient algorithms that rely on function approximation schemes are being developed in order to address the curse of dimensionality; the resulting techniques are intended to lead to non-asymptotic analysis tools that deal with the complicated statistical dependencies present in RL. This rich research agenda is expected to foster multidisciplinary efforts at the intersection of high-dimensional statistics, non-convex optimization, control theory, information theory, and machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Tackling Small Eigen-Gaps: Fine-Grained Eigenvector Estimation and Inference Under Heteroscedastic Noise
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
  • DOI:
    10.1109/tit.2021.3120096
  • Published:
    2022-01
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Li, Gen; Wei, Yuting; Chi, Yuejie; Gu, Yuantao; Chen, Yuxin
  • Corresponding author:
    Chen, Yuxin
Softmax policy gradient methods can take exponential time to converge
  • DOI:
    10.1007/s10107-022-01920-6
  • Published:
    2023-01
  • Journal:
  • Impact factor:
    2.7
  • Authors:
    Li, Gen; Wei, Yuting; Chi, Yuejie; Chen, Yuxin
  • Corresponding author:
    Chen, Yuxin
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
  • DOI:
    10.1137/21m1456789
  • Published:
    2023-06
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Zhan, Wenhao; Cen, Shicong; Huang, Baihe; Chen, Yuxin; Lee, Jason D.; Chi, Yuejie
  • Corresponding author:
    Chi, Yuejie
Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning

Other publications by Yuxin Chen

Class-wise Thresholding for Detecting Out-of-Distribution Data
  • DOI:
  • Published:
    2024-09-14
  • Journal:
  • Impact factor:
    0
  • Authors:
    Matteo Guarrera; Baihong Jin; Tung; Maria A. Zuluaga; Yuxin Chen; A. Sangiovanni
  • Corresponding author:
    A. Sangiovanni
Secret Image Sharing Based on Error-Correcting Codes
Research on the effect and mechanism of antimicrobial peptides HPRP-A1/A2 work against Toxoplasma gondii infection
  • DOI:
    10.1111/pim.12619
  • Published:
    2019-03-12
  • Journal:
  • Impact factor:
    2.2
  • Authors:
    Ran Liu; Yangyue Ni; Jingwei Song; Zhipeng Xu; J. Qiu; Lijuan Wang; Yuxiao Zhu; Yibing Huang; M. Ji; Yuxin Chen
  • Corresponding author:
    Yuxin Chen
DNA Bloom Filter enables anti-contamination and file version control for DNA-based data storage
  • DOI:
    10.1093/bib/bbae125
  • Published:
    2024-03-27
  • Journal:
  • Impact factor:
    9.5
  • Authors:
    Yiming Li; Haoling Zhang; Yuxin Chen; Yue Shen; Zhi Ping
  • Corresponding author:
    Zhi Ping
Machine learning models to predict the tunnel wall convergence
  • DOI:
    10.1016/j.trgeo.2023.101022
  • Published:
    2023-05-01
  • Journal:
  • Impact factor:
    5.3
  • Authors:
    Jian Zhou; Yuxin Chen; Chuanqi Li; Y. Qiu; Shuai Huang; Mingli Tao
  • Corresponding author:
    Mingli Tao

Other grants by Yuxin Chen

Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
  • Award number:
    2313131
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Award number:
    2218773
  • Fiscal year:
    2022
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
  • Award number:
    2218713
  • Fiscal year:
    2022
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
  • Award number:
    2221009
  • Fiscal year:
    2022
  • Award amount:
    $400,000
  • Project type:
    Continuing Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Award number:
    2100158
  • Fiscal year:
    2021
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: Fine-Grained Statistical Inference in High Dimension: Actionable Information, Bias Reduction, and Optimality
  • Award number:
    2014279
  • Fiscal year:
    2020
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
CIF: Small: Taming Nonconvexity in High-Dimensional Statistical Estimation
  • Award number:
    1907661
  • Fiscal year:
    2019
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
  • Award number:
    1900140
  • Fiscal year:
    2019
  • Award amount:
    $400,000
  • Project type:
    Standard Grant

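Similar NSFC (National Natural Science Foundation of China) grants

Mechanism by which IGF-1R regulation of HIF-1α promotes Th17 cell differentiation in the pathogenesis of thyroid eye disease
  • Award number:
    82301258
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Mechanism by which CTCFL regulates IL-10 to suppress bystander activation of CD4+ CTLs and promote resistance to neoadjuvant immunotherapy in oral squamous cell carcinoma
  • Award number:
    82373325
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Project type:
    General Program
Mechanism by which mutations in the RNA splicing factor PRPF31 cause human retinitis pigmentosa
  • Award number:
    82301216
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Role and mechanism of vascular endothelial cells regulating macrophage activation via the E2F1/NF-kB/IL-6 axis in orbital venous malformation
  • Award number:
    82301257
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Matrix cluster regulation and strengthening mechanisms in aluminum alloys based on multi-element interatomic interactions
  • Award number:
    52371115
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Project type:
    General Program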

Similar overseas grants

Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
  • Award number:
    2326622
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
  • Award number:
    2326621
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
  • Award number:
    2343600
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
  • Award number:
    2403123
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Project type:
    Standard Grant