Collaborative Research: CIF: Medium: MoDL: Toward a Mathematical Foundation of Deep Reinforcement Learning
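Basic Information
- Award number: 2212262
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-10-01 to 2026-09-30
- Status: Ongoing
- Source:
- Keywords:
Project Abstract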
Deep Reinforcement Learning (DRL), which uses neural networks to solve sequential decision-making problems, has made breakthroughs in real-world applications, such as robotics, gaming, healthcare, and transportation systems. However, current theoretical work on reinforcement learning is restricted to problems with a small number of states; as these results do not cover neural networks, they cannot be used to satisfactorily explain the empirical successes of DRL. This project seeks to bridge this gap by building a mathematical foundation for DRL that leverages ideas from approximation theory, control theory, and optimization theory. This will allow the computational and statistical complexity of DRL to be systematically characterized, and will help with designing more efficient and reliable empirical methods. Education and outreach plans are integrated into this project. Specifically, the investigators will mentor graduate and undergraduate students (some through the STARS program for underrepresented groups at the University of Washington), develop new courses and monographs, organize research workshops, and develop course materials for a high school data science and artificial intelligence curriculum.
This project has three major components. The first thrust identifies which types of guarantees are achievable by policies for different reinforcement learning problem instances. Concretely, this requires investigating how increasingly structured problem instances enable stronger guarantees for policies; this will be done by using, and further developing, tools from non-convex optimization to describe policies that achieve stationary points, local maxima, and global maxima of the reward function. The second thrust takes the perspective of approximation theory and capacity control to investigate how the neural network complexity can be gradually increased to eventually find the most complex sub-family of neural networks that permits sample-efficient algorithms. The third thrust builds upon the knowledge gained in the first two thrusts, and is devoted to the design of computationally efficient algorithms; this will be done by leveraging tools from optimization theory and by making connections with control theory.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Jason Lee
The difluoromethylenesulfonic acid group as a monoanionic phosphate surrogate for obtaining PTP1B inhibitors.
- DOI: 10.1016/s0968-0896(02)00062-7
- Year: 2002
- Journal:
- Impact factor: 3.5
- Authors: Carmen Leung;J. Grzyb;Jason Lee;Natalie Meyer;G. Hum;Chenguo Jia;Shifeng Liu;Scott D. Taylor
- Corresponding author: Scott D. Taylor
Horizontal muon track identification with neural networks in HAWC
- DOI: 10.22323/1.395.1036
- Year: 2021
- Journal:
- Impact factor: 0
- Authors: J. R. A. Camacho;A. Abeysekara;A. Albert;R. Alfaro;C. Álvarez;Juan de Dios Álvarez Romero;J. Velazquez;Arun Babu Kollamparambil;D. Rojas;H. A. Solares;R. Babu;V. Baghmanyan;A. Barber;J. González;E. Belmont;S. BenZvi;D. Berley;C. Brisbois;K. Mora;T. Capistrán;A. Carramiñana;S. Casanova;O. Chaparro;U. Cotti;J. Cotzomi;S. León;E. D. L. Fuente;C. D. León;Lorenzo Diaz;R. D. Hernandez;J. Vélez;B. Dingus;M. Durocher;M. DuVernois;R. Ellsworth;K. Engel;María Catalina Espinoza Hernández;Jason Fan;K. Fang;M. F. Alonso;B. Fick;H. Fleischhack;J. L. Flores;N. Fraija;Diego Garcia Aguilar;J. A. García;J. L. García;G. Garcia;F. Garfias;G. Giacinti;H. Goksu;M. González;J. Goodman;J. P. Harding;S. H. Cadena;I. Herzog;J. Hinton;B. Hona;Dezhi Huang;F. Hueyotl;M. Hui;B. Humensky;P. Hüntemeyer;A. Iriarte;A. Jardin;H. Jhee;V. Joshi;D. Kieda;G. Kunde;S. Kunwar;A. Lara;Jason Lee;W. Lee;D. Lennarz;H. L. Vargas;J. Linnemann;A. Longinotti;R. López;G. Luis;J. Lundeen;K. Malone;V. Marandon;O. Martinez;I. Castellanos;Humberto Martínez Huerta;J. Martínez;J. Matthews;J. Mcenery;P. Miranda;Jorge Antonio Morales Soto;E. M. Barbosa;M. Mostafá;A. Nayerhoda;L. Nellen;M. Newbold;M. Nisa;R. Noriega;L. Olivera;N. Omodei;A. Peisker;Y. P. Araujo;E. Pérez;C. Rho;C. Rivière;D. Rosa;E. Ruiz;J. Ryan;H. Salazar;F. Greus;A. Sandoval;Michael Schneider;H. Schoorlemmer;J. Serna;G. Sinnis;A. Smith;W. Springer;P. Surajbali;I. Taboada;M. Tanner;K. Tollefson;I. Torres;Ramiro Torres Escobedo;Rhiannon M. Turner;F. Ureña;Luis Villaseñor;Xiaojie Wang;I. Watson;T. Weisgarber;Felix Werner;E. Willox;Joshua R. Wood;G. Yodh;A. Zepeda;Hao Zhou;Hawc
- Corresponding author: Hawc
Symbiotic HW Cache and SW DTLB Prefetching for DRAM/NVM Hybrid Memory
- DOI: 10.1109/mascots50786.2020.9285963
- Year: 2020
- Journal:
- Impact factor: 0
- Authors: Onkar Patil;F. Mueller;Latchesar Ionkov;Jason Lee;M. Lang
- Corresponding author: M. Lang
Bilateral Atypical Femoral Fracture in a Bisphosphonate-Naïve Patient with Prior Long-Term Denosumab Therapy: A Case Report of the Management Strategy and a Literature Review
- DOI:
- Year: 2024
- Journal:
- Impact factor: 3.9
- Authors: Kyle Auger;Jason Lee;Ian S. Hong;Jaclyn M. Jankowski;Frank A. Liporace;Richard S. Yoon
- Corresponding author: Richard S. Yoon
MOTIVES FOR GOING PUBLIC AND UNDERPRICING: NEW FINDINGS FROM KOREA
- DOI: 10.1111/j.1468-5957.1993.tb00659.x
- Year: 1993
- Journal:
- Impact factor: 2.9
- Authors: Jeong‐Bon Kim;I. Krinsky;Jason Lee
- Corresponding author: Jason Lee
Other grants by Jason Lee
CAREER: Towards a Theory of Deep Learning
- Award number: 2144994
- Fiscal year: 2022
- Amount: $300,000
- Award type: Continuing Grant
CIF: Medium: Collaborative Research: Theory of Optimization Geometry and Algorithms for Neural Networks
- Award number: 2002272
- Fiscal year: 2019
- Amount: $300,000
- Award type: Standard Grant
CIF: Medium: Collaborative Research: Theory of Optimization Geometry and Algorithms for Neural Networks
- Award number: 1856549
- Fiscal year: 2019
- Amount: $300,000
- Award type: Standard Grant
REU Site: Interdisciplinary Nanotechnology Traineeship for Next-Generation Energy, Health, Information, and Manufacturing
- Award number: 1560098
- Fiscal year: 2016
- Amount: $300,000
- Award type: Standard Grant
Preparing African American Males for Energy & Education (PAAMEE)
- Award number: 1614741
- Fiscal year: 2016
- Amount: $300,000
- Award type: Standard Grant
PURSE: Promoting Underrepresented Girls Involvement in Research, Science, and Energy
- Award number: 0929728
- Fiscal year: 2009
- Amount: $300,000
- Award type: Standard Grant
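Similar grants from the National Natural Science Foundation of China (NSFC)
Seepage-stress-chemical coupling mechanism in ion-adsorption rare earths and optimization of leaching-based mining
- Award number: 52364012
- Year approved: 2023
- Amount: CNY 320,000
- Award type: Regional Science Fund Project
Molecular mechanism by which cyclophilins regulate crop-aphid interactions
- Award number: 32301770
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund Project
Interface regulation and electromagnetic response mechanisms of multiphase microwave absorbers derived from metal-polyphenol networks
- Award number: 52302362
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund Project
Outcomes and feedback effects of workplace cyberloafing: an integrated study from the actor and observer perspectives
- Award number: 72302108
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund Project
Molecular mechanism by which EIF6 negatively regulates Dicer activity to promote EV71 replication
- Award number: 32300133
- Year approved: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund Project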
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Amount: $300,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Amount: $300,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Amount: $300,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Amount: $300,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
- Award number: 2326905
- Fiscal year: 2024
- Amount: $300,000
- Award type: Standard Grant