Exploring Causality in Reinforcement Learning for Robust Decision Making

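Basic information

  • Grant number:
    EP/Y003187/1
  • Principal investigator:
  • Amount:
    $209,700
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2023
  • Funding country:
    United Kingdom
  • Project period:
    2023 to (no data)
  • Project status:
    Not yet concluded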

Project abstract

Reinforcement learning (RL) has seen significant development in recent years and has demonstrated impressive capabilities in decision-making tasks, such as games (AlphaStar, OpenAI Five), chatbots (ChatGPT), and recommendation systems (Microsoft). The techniques of RL can also be applied to many fields, such as transportation, network communications, autonomous driving, sequential treatment in healthcare, robotics, and control. Unlike traditional supervised learning, RL focuses on making a sequence of decisions to achieve a long-term goal, which makes it particularly well suited to solving complex problems. However, while RL has the potential to be highly effective, challenges must be addressed to make it practical for real-world applications, where changing factors such as traffic regulations, weather, and clouds cannot be fully accounted for when training the agent. To enable RL algorithms to be deployed in a range of real applications, we need to evaluate and improve the robustness of RL when facing complex real-world changes and task shifts.

In this project, we aim to develop robust and generalisable reinforcement learning techniques from a causal modelling perspective. The first thrust focuses on utilising causal model learning to create compact and robust representations of tasks. Such a representation can greatly benefit the overall performance of the RL agent by reducing the complexity of the problem and making the agent's decision-making process more efficient. As a result, the agent can learn faster and generalise better to unseen tasks, which is especially important in real-world scenarios where data is scarce and the complexity of tasks can vary greatly.

The second research thrust focuses on the development of efficient and generalisable algorithms for task assignment transfer. This can enable the RL agent to adapt to new tasks more quickly and effectively and to generalise the learned knowledge to different but related tasks. This is crucial for real-world scenarios where the agent needs to operate in different environments or the task requirements change over time.

One example of an application that would benefit from these contributions is autonomous driving in an industrial setting. While RL agents are usually trained in simulators, they may not perform well in real-world road scenarios and can be easily distracted by task-irrelevant information. For example, the visual images that autonomous cars observe contain predominantly task-irrelevant information, like cloud shapes and architectural details, which should not influence driving decisions.

In this project, we aim to enable the agent to learn a compact and robust representation of the task, so that it retains only the state information relevant to the task, adapts safely to changing driving scenarios, and generalises its knowledge to related tasks such as adapting to the different driving rules in the United States (driving on the right).

A causal understanding can help identify the minimal sufficient representations that are essential for policy learning and transfer, and achieve safe and controllable exploration by leveraging causal structures and counterfactual reasoning. It can mitigate the issues suffered by most existing RL approaches, such as being data-hungry and lacking interpretability and generalisability. The outcome of this project can greatly improve the scalability and adaptability of RL agents, making them more suitable for real-world applications.

Project outputs

Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available

Data last updated: 2024-06-01

Other publications by Yali Du

Reinforcement Learning With Multiple Relational Attention for Solving Vehicle Routing Problems
  • DOI:
    10.1109/tcyb.2021.3089179
  • Publication date:
    2021-07
  • Journal:
  • Impact factor:
    11.8
  • Authors:
    Yunqiu Xu;Meng Fang;Ling Chen;Gangyan Xu;Yali Du;Chengqi Zhang
  • Corresponding author:
    Chengqi Zhang
Comparison between Ologen implant and Mitomycin C in trabeculectomy: A Systematic Review and Meta-Analysis
Learning the Expected Core of Strictly Convex Stochastic Cooperative Games
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Nam Phuong Tran;T. Ta;Shuqing Shi;Debmalya Mandal;Yali Du;Long Tran
  • Corresponding author:
    Long Tran
Persistence of severe global inequalities in the burden of blindness and vision loss from 1990 to 2019: findings from the Global Burden of Disease Study 2019
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    4.1
  • Authors:
    Yuancun Li;Hongxi Wang;Zhiqiang Guan;Cheng;P. Guo;Yali Du;Shengjie Yin;Binyao Chen;Jiao Jiang;Yueting Ma;Liu Jing;Yingzi Huang;Ke Zheng;Qian Ma;Ruiqing Zhou;Min Chen;N. Congdon;K. Qiu;Mingzhi Zhang
  • Corresponding author:
    Mingzhi Zhang
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
25 entries in total

Similar grants from the National Natural Science Foundation of China

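Research on federated causal relationship inference algorithms for privacy-protected data
  • Grant number:
    62376087
  • Approval year:
    2023
  • Amount:
    CNY 510,000
  • Project type:
    General Program
Research on compound-protein affinity prediction methods based on causal deep reinforcement learning
  • Grant number:
    62302339
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on the theory and methods of inferring causal relationships among latent variables for discrete data
  • Grant number:
    62306019
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on methods for discovering interpretable cognitive evolution paths based on causality computation, and their applications
  • Grant number:
    62377024
  • Approval year:
    2023
  • Amount:
    CNY 500,000
  • Project type:
    General Program
Investigating the causal relationship between peripheral blood leukocyte count and Parkinson's disease using large-scale genetic data: a Mendelian randomization study and genetic risk score analysis
  • Grant number:
    82301434
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund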

Similar overseas grants

Rapid, Scalable, and Joint Assessment of Seismic Multi-Hazards and Impacts: From Satellite Images to Causality-Informed Deep Bayesian Networks
  • Grant number:
    2242590
  • Fiscal year:
    2024
  • Amount:
    $209,700
  • Project type:
    Standard Grant
CHAI - EPSRC AI Hub for Causality in Healthcare AI with Real Data
  • Grant number:
    EP/Y028856/1
  • Fiscal year:
    2024
  • Amount:
    $209,700
  • Project type:
    Research Grant
Development of a Causality Analysis Method for Point Processes Based on Nonlinear Dynamical Systems Theory and Elucidation of the Representation of Information Processing in the Brain
  • Grant number:
    22KJ2815
  • Fiscal year:
    2023
  • Amount:
    $209,700
  • Project type:
    Grant-in-Aid for JSPS Fellows
Collaborative Research: Learning and forecasting high-dimensional extremes: sparsity, causality, privacy
  • Grant number:
    2310974
  • Fiscal year:
    2023
  • Amount:
    $209,700
  • Project type:
    Standard Grant
EAGER: North American Monsoon Prediction Using Causality Informed Machine Learning
  • Grant number:
    2313689
  • Fiscal year:
    2023
  • Amount:
    $209,700
  • Project type:
    Standard Grant