An Abstraction-based Technique for Safe Reinforcement Learning
Basic Information
- Grant number: EP/X015823/1
- Principal investigator:
- Amount: $384,900
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2023
- Funding country: United Kingdom
- Duration: 2023 to (no data)
- Status: ongoing (not concluded)
- Source:
- Keywords:
Project Summary
Autonomous agents learning to act in unknown environments have attracted research interest both for their wider implications for AI and for their applications in complex domains, including robotics, network optimisation, and resource allocation. Currently, one of the most successful approaches is reinforcement learning (RL). However, to learn how to act, agents must explore the environment, which in safety-critical scenarios means they might take dangerous actions, possibly harming themselves or even putting human lives at risk. Consequently, reinforcement learning is still rarely used in real-world applications, where multiple safety-critical constraints need to be satisfied simultaneously.

To alleviate this problem, RL algorithms are being combined with formal verification techniques to ensure safety during learning. Indeed, formal methods are nowadays routinely applied to the specification, design, and verification of complex systems, as they make it possible to obtain proof-like certification of correct and safe behaviour that is intelligible to system engineers and human users alike. These desirable features have motivated the adoption of formal methods for the verification of general AI systems, an endeavour variously called safe, verifiable, or trustworthy AI [1]. Still, applying formal methods to AI systems raises significant new challenges, including the "black-box" nature of most machine learning algorithms in use today. Specific to the application of formal methods to RL, we identify two main shortcomings of current approaches, which will be tackled in this project:

- Most current verification methodologies do not scale well as the complexity of the application increases. This state-explosion problem is particularly acute in RL scenarios, where agents may have to choose among a huge number of action/state transitions (e.g., autonomous cars).
- Systems with multiple learning agents are comparatively less explored, and therefore less understood, than single-agent settings, partly because of the high dimensionality of their state space and their non-stationarity. Yet multi-agent settings are key for applications such as platooning for autonomous vehicles and robot swarms.

To tackle both problems, we put forward an abstraction-based approach to verification, which is meant to reduce the state space, also by leveraging symmetries of the system, while preserving all of its safety-related features, thus leading to guaranteed and scalable safe behaviour. The research envisaged in this project is timely and fits the current portfolio of EPSRC-funded research, as it aligns with the theme of AI and robotics, in particular the key strategic investment in trustworthy autonomous systems. The present proposal aims to develop a verifiably safe RL methodology, which is meant to have a positive societal impact on the trust of the general public in deployed AI solutions, and to facilitate their adoption within society at large.
Project Outcomes
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other Publications by Francesco Belardinelli
On the Stability of Learning in Network Games with Many Players
- DOI: 10.48550/arxiv.2403.15848
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: A. Hussain; D.G. Leonte; Francesco Belardinelli; G. Piliouras
- Corresponding author: G. Piliouras
The Reasons that Agents Act: Intention and Instrumental Goals
- DOI: 10.48550/arxiv.2402.07221
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Francis Rhys Ward; Matt MacDermott; Francesco Belardinelli; Francesca Toni; Tom Everitt
- Corresponding author: Tom Everitt
Stability of Multi-Agent Learning in Competitive Networks: Delaying the Onset of Chaos
- DOI: 10.48550/arxiv.2312.11943
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: A. Hussain; Francesco Belardinelli
- Corresponding author: Francesco Belardinelli
Other Grants of Francesco Belardinelli
Strategy Logics for the Verification of Security Protocols
- Grant number: EP/V009214/1
- Fiscal year: 2021
- Funding amount: $384,900
- Project type: Research Grant
The Third International Workshop on Formal Methods in Artificial Intelligence
- Grant number: EP/V008013/1
- Fiscal year: 2021
- Funding amount: $384,900
- Project type: Research Grant
Similar NSFC Grants

Research on binary software vulnerability discovery techniques based on program analysis and testing
- Grant number: 61702540
- Approval year: 2017
- Funding amount: ¥270,000
- Project type: Young Scientists Fund

Research on key techniques of network-abstraction-based SDN programming methods
- Grant number: 61602264
- Approval year: 2016
- Funding amount: ¥200,000
- Project type: Young Scientists Fund

Research on fast security decision methods for access control policies based on predicate abstraction
- Grant number: 61300228
- Approval year: 2013
- Funding amount: ¥230,000
- Project type: Young Scientists Fund

Research on construction techniques for typical network security incident scenarios based on collaborative event and environment sensing
- Grant number: 61370215
- Approval year: 2013
- Funding amount: ¥750,000
- Project type: General Program

Research on "information diagram construction" strategies based on diagrammatic abstraction mechanisms in innovative architectural design methods
- Grant number: 51308393
- Approval year: 2013
- Funding amount: ¥220,000
- Project type: Young Scientists Fund
Similar Overseas Grants

A novel damage characterization technique based on adaptive deconvolution extraction algorithm of multivariate AE signals for accurate diagnosis of osteoarthritic knees
- Grant number: 24K07389
- Fiscal year: 2024
- Funding amount: $384,900
- Project type: Grant-in-Aid for Scientific Research (C)

Testing a Memory-Based Hypothesis for Anhedonia
- Grant number: 10598974
- Fiscal year: 2023
- Funding amount: $384,900
- Project type:

Innovating anti-tuberculosis drug susceptibility testing with a novel and rapid non-culture based phenotypic test using MPT64 biomarker
- Grant number: 10663034
- Fiscal year: 2023
- Funding amount: $384,900
- Project type:

Vector Flow Velocity Imaging of Human Placenta using Angle-resolved Ultrasound and Deep Learning
- Grant number: 10886180
- Fiscal year: 2023
- Funding amount: $384,900
- Project type:

Harmony AI: State of the Art Natural Language Processing for Genetic Engineering
- Grant number: 10698805
- Fiscal year: 2023
- Funding amount: $384,900
- Project type: