S&AS: FND: COLLAB: Learning from Stories: Practical Value Alignment and Taskability for Autonomous Systems
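Basic information
- Award number: 1849231
- Principal investigator:
- Amount: $291,300
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-06-01 to 2023-05-31
- Project status: Completed
- Source:
- Keywords:
Project abstract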
In the near future we are likely to see increasingly capable autonomous systems operating in proximity to humans and immersed in society. As these systems become more sophisticated, they will interact increasingly with humans. With this increased human-agent interaction comes an increased obligation to ensure that autonomous systems do not cause even unintentional harm to a human. Creating systems that cannot intentionally or unintentionally harm humans is not an easy task. This is because there are infinitely many undesirable outcomes that can be achieved in an open world, making it impossible to instruct these systems to avoid each one. If the desired behavior cannot be directly specified, then it must be learned. Past approaches to learning these types of behaviors have focused on learning from human examples, but these methods are unlikely to scale. This research uses natural language explanations of behavior as a scalable alternative for training autonomous agents for safe operation. Naturalistic descriptions contain vast amounts of information about sociocultural norms, which makes them rich sources for such training. Enabling systems to better understand and learn from such descriptions will enable human operators to more naturally specify goals or tasks for the agent to complete.
This research explores the concept of learning via natural language descriptions of desired behavior. This technique uses procedural knowledge contained in natural language explanations to help train autonomous agents. Concretely, this approach learns utility functions that can be used to guide autonomous agents towards behaviors that are aligned with the description used for training. To accomplish this, researchers will create computational models capable of extracting both knowledge about sociocultural norms and procedural knowledge from naturally occurring corpora. These models will then be used to create behavior policies that are both aligned with sociocultural norms and procedurally plausible. To further ensure that these models can be practically deployed, researchers will enable their models to incorporate a "human in the loop" to provide online feedback about the quality of these learned behavior policies in terms of their social acceptability and appropriateness. Safeguards will also be investigated to protect the learned behavior policies against the effects of adversarial or malicious training examples.
This award is jointly funded by the Division of Information and Intelligent Systems in the Directorate for Computer & Information Science & Engineering and the Established Program to Stimulate Competitive Research (EPSCoR) in the Office of Integrative Activities.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
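The idea of a learned utility function that steers an agent toward norm-aligned behavior can be illustrated with a minimal sketch. It is not the project's method: the toy corpus, the bag-of-words log-odds scorer, and the names `train_norm_model`, `norm_utility`, and `choose_action` are all assumptions made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus: action descriptions labeled as
# norm-conforming (1) or norm-violating (0). Illustrative only.
CORPUS = [
    ("wait in line at the pharmacy", 1),
    ("pay for the medicine", 1),
    ("thank the pharmacist", 1),
    ("steal the medicine", 0),
    ("push past the other customers", 0),
    ("shout at the pharmacist", 0),
]


def train_norm_model(corpus):
    """Count word occurrences per label (a naive bag-of-words model)."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in corpus:
        counts[label].update(text.split())
    vocab = set(counts[0]) | set(counts[1])
    return counts, vocab


def norm_utility(text, model):
    """Sum of per-word log-odds; positive means 'looks norm-conforming'."""
    counts, vocab = model
    totals = {label: sum(c.values()) for label, c in counts.items()}
    score = 0.0
    for word in text.split():
        if word not in vocab:
            continue  # unseen words carry no evidence either way
        # Laplace (add-one) smoothing avoids division by zero.
        p_good = (counts[1][word] + 1) / (totals[1] + len(vocab))
        p_bad = (counts[0][word] + 1) / (totals[0] + len(vocab))
        score += math.log(p_good / p_bad)
    return score


def choose_action(candidates, model, weight=1.0):
    """Pick the (description, task_reward) pair that maximizes
    task reward plus the weighted norm utility."""
    return max(candidates, key=lambda a: a[1] + weight * norm_utility(a[0], model))


model = train_norm_model(CORPUS)
candidates = [("steal the medicine", 1.0), ("pay for the medicine", 0.8)]
print(choose_action(candidates, model)[0])
```

With `weight=0` the agent follows task reward alone and picks the higher-reward but norm-violating action; a positive weight lets the learned utility override it. A real system would replace the word-count scorer with a model trained on naturally occurring text.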
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Influencing Reinforcement Learning through Natural Language Guidance
- DOI: 10.32473/flairs.v34i1.128472
- Published: 2021-04
- Journal:
- Impact factor: 0
- Authors: Tasmia Tasrin; Md Sultan Al Nahian; Habarakadage Perera; Brent Harrison
- Corresponding authors: Tasmia Tasrin; Md Sultan Al Nahian; Habarakadage Perera; Brent Harrison
Learning Norms from Stories: A Prior for Value Aligned Agents
- DOI: 10.1145/3375627.3375825
- Published: 2020-01-01
- Journal:
- Impact factor: 0
- Authors: Al Nahian, Md Sultan; Frazier, Spencer; Harrison, Brent
- Corresponding author: Harrison, Brent
Other publications by Brent Harrison
Designing Inclusive AI Certifications
- DOI: 10.1609/aaaiss.v3i1.31269
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Kathleen Timmerman; Judy Goldsmith; Brent Harrison; Zongming Fei
- Corresponding author: Zongming Fei
Reducing the qubit requirement of Jordan-Wigner encodings of $N$-mode, $K$-fermion systems from $N$ to $\lceil \log_2 {N \choose K} \rceil$
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Brent Harrison; D. Nelson; D. Adamiak; J. Whitfield
- Corresponding author: J. Whitfield
Learning From Explanations Using Sentiment and Advice in RL
- DOI: 10.1109/tcds.2016.2628365
- Published: 2017
- Journal:
- Impact factor: 5
- Authors: Samantha Krening; Brent Harrison; K. Feigh; C. Isbell; Mark O. Riedl; A. Thomaz
- Corresponding author: A. Thomaz
Learning to Generate Natural Language Rationales for Game Playing Agents
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Upol Ehsan; Pradyumna Tambwekar; Larry Chan; Brent Harrison; Mark O. Riedl
- Corresponding author: Mark O. Riedl
Multi-agent Reinforcement Learning for Decentralized Stable Matching
- DOI: 10.1007/978-3-030-87756-9_24
- Published: 2020
- Journal:
- Impact factor: 4.5
- Authors: Kshitija Taywade; J. Goldsmith; Brent Harrison
- Corresponding author: Brent Harrison
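Similar grants from the National Natural Science Foundation of China
Mechanism by which Qizhi Weitong Formula intervenes in mechanical-stimulation-induced visceral hypersensitivity of the gastric mucosa in FD, studied through Piezo2 rapid signal transduction
- Award number: 82305136
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Targeted "starvation" therapy for brain metastases of triple-negative breast cancer using the dual-display engineered phage fd388-BH-WV
- Award number:
- Year approved: 2022
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Mechanism by which abdominal tuina massage regulates gastric motility in FD, studied through Piezo-protein-mediated promotion of interstitial cell of Cajal proliferation via the SCF/c-kit-JAK-STAT signaling pathway
- Award number:
- Year approved: 2022
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Study of the mechanism by which Chaihu Shugan San regulates gut microbiota-mitochondria crosstalk in FD, inhibiting ICC oxeiptosis via the KEAP1/PGAM5/AIFM1 pathway to promote gastric motility
- Award number:
- Year approved: 2021
- Funding amount: CNY 550,000
- Project type: General Program
Exploring the role and mechanism of LncRNA NEAT1 in the progression of atherosclerosis using FD-OCT combined with CMR imaging
- Award number: 82072030
- Year approved: 2020
- Funding amount: CNY 540,000
- Project type: General Program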
Similar overseas grants
S&AS:FND:COLLAB: Planning Coordinated Event Observation for Structured Narratives
- Award number: 2313929
- Fiscal year: 2022
- Funding amount: $291,300
- Project type: Standard Grant
S&AS: FND: COLLAB: Planning and Control of Heterogeneous Robot Teams for Ocean Monitoring
- Award number: 2311967
- Fiscal year: 2022
- Funding amount: $291,300
- Project type: Standard Grant
S&AS: FND: COLLAB: Adaptable Vehicular Sensing and Control for Fleet-Oriented Systems in Smart Cities
- Award number: 1849238
- Fiscal year: 2019
- Funding amount: $291,300
- Project type: Standard Grant
S&AS:FND:COLLAB:Unsupervised Rare Event Learning - With Applications on Autonomous Vehicles
- Award number: 1849304
- Fiscal year: 2019
- Funding amount: $291,300
- Project type: Standard Grant
S&AS: FND: COLLAB: Learning from Stories: Practical Value Alignment and Taskability for Autonomous Systems
- Award number: 1849262
- Fiscal year: 2019
- Funding amount: $291,300
- Project type: Standard Grant