CAREER: Principled Deep Reinforcement Learning for Societal Systems

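Basic Information

  • Award number:
    2048075
  • Principal investigator:
  • Amount:
    $500K
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-02-01 to 2026-01-31
  • Project status:
    Not yet completed

Project Abstract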

The recent breakthrough in deep reinforcement learning (RL), especially its superhuman-level performance in board and video games, e.g., Go, Atari, Dota, and StarCraft, opens up new avenues for controlling many complex and unknown systems via learning. However, for practical purposes beyond game playing, deep RL still suffers from a lack of efficiency and trustworthiness. In terms of efficiency, the empirical success of deep RL requires millions to billions of data points and days to weeks of running time. In terms of trustworthiness, the empirical success of deep RL is only measured by the received reward, which does not account for safety and robustness. Such a lack of efficiency and trustworthiness is further exacerbated when we scale up deep RL to design and optimize societal systems in critical domains, e.g., healthcare, transportation, power grid, financial network, and supply chain.

This CAREER proposal addresses these challenges by establishing a theoretical framework for analyzing the computational efficiency and sample efficiency of single-agent deep RL and an algorithmic framework for achieving such efficiencies. Moreover, it leads to a stochastic game framework for achieving safety, robustness, scalability, fairness, risk-awareness, and incentivization in social systems via multi-agent deep RL. The research plan emphasizes connecting deep RL with multiple fields, e.g., nonconvex optimization, nonparametric statistics, causal inference, stochastic game, and social science. The education plan emphasizes teaching data-driven decision making as a fundamental skill for future generations, especially for future leaders, in societal contexts. In particular, it aims to promote the idea of data-driven social leadership and support underrepresented minority researchers and students, who personally experience pressing challenges in societal systems, from K-12 education to graduate training. In order to cope with the ongoing pandemic, the outreach plan involves organizing online seminars on data science and artificial intelligence, mentoring remote interns by integrating research and education, and engaging remote students via DataFest and Client Project Challenge.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Publications by Zhaoran Wang

An Efficient Trajectory Planning Approach for Autonomous Ground Vehicles Using Improved Artificial Potential Field
  • DOI:
    10.3390/sym16010106
  • Publication date:
    2024-01-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xianjian Jin;Zhiwei Li;Nonsly Valerienne Opinat Ikiela;Xiongkui He;Zhaoran Wang;Yinchen Tao;Huaizhen Lv
  • Corresponding author:
    Huaizhen Lv
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics
  • DOI:
  • Publication date:
    2024-09-14
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sirui Zheng;Lingxiao Wang;Shuang Qiu;Zuyue Fu;Zhuoran Yang;Csaba Szepesvari;Zhaoran Wang
  • Corresponding author:
    Zhaoran Wang
Gap-Dependent Bounds for Two-Player Markov Games
  • DOI:
  • Publication date:
    2021-07-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zehao Dou;Zhuoran Yang;Zhaoran Wang;S. Du
  • Corresponding author:
    S. Du
Research on Task Scheduling Optimization of Data Center Under Double Carbon Target
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
  • DOI:
  • Publication date:
    2022-02-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Han Zhong;Wei Xiong;Jiyuan Tan;Liwei Wang;Tong Zhang;Zhaoran Wang;Zhuoran Yang
  • Corresponding author:
    Zhuoran Yang

Other Grants Held by Zhaoran Wang

Collaborative Research: CIF: Medium: Learning to Control from Data: from Theory to Practice
  • Award number:
    2211210
  • Fiscal year:
    2022
  • Award amount:
    $500K
  • Project type:
    Standard Grant
Collaborative Research: CIF: Small: A Unified Framework of Distributional Optimization via Variational Transport
  • Award number:
    2008827
  • Fiscal year:
    2020
  • Award amount:
    $500K
  • Project type:
    Standard Grant
Collaborative Research: High-Dimensional Decision Making and Inference with Applications for Personalized Medicine
  • Award number:
    2015568
  • Fiscal year:
    2020
  • Award amount:
    $500K
  • Project type:
    Continuing Grant

Similar Overseas Grants

CAREER: Principled yet practical observability for a microservices-based cloud
  • Award number:
    2340128
  • Fiscal year:
    2024
  • Award amount:
    $500K
  • Project type:
    Continuing Grant
CAREER: Principled Unsupervised Learning via Minimum Volume Polytopic Embedding
  • Award number:
    2237640
  • Fiscal year:
    2023
  • Award amount:
    $500K
  • Project type:
    Continuing Grant
Principled phylogenomic analysis without gene tree estimation
  • Award number:
    2308495
  • Fiscal year:
    2023
  • Award amount:
    $500K
  • Project type:
    Standard Grant
Principled Reasoning about Dynamical Systems
  • Award number:
    RGPIN-2020-05031
  • Fiscal year:
    2022
  • Award amount:
    $500K
  • Project type:
    Discovery Grants Program - Individual
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10708986
  • Fiscal year:
    2022
  • Award amount:
    $500K
  • Project type: