Collaborative Research: HCC: Medium: Aligning Robot Representations with Humans

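Basic information

  • Grant number:
    2310757
  • Principal investigator:
  • Amount:
    $420,500
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Project period:
    2023-08-15 to 2026-07-31
  • Project status:
    Ongoing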

Project abstract

This project seeks to make robots more robust and aligned with human preferences and values. Traditionally, robot behaviors and objectives were trained to include a set of hand-crafted features (i.e., variables represented in the data) that reflect task-relevant aspects of the environment. Using well-chosen features is very data-efficient, but it is unrealistic for human engineers to identify and write code ahead of time for all the features that could matter. Training modern high-capacity models from a lot of data is a great alternative, as long as we do not probe the learned models on novel (out-of-distribution) inputs. The reason these models fail to generalize to out-of-distribution inputs is that they will generally fail to learn the correct representation, comprising the features that matter, and instead pick up on spurious patterns in the data. The central goal of this project is to enable robots to arrive at the underlying correct representation for objectives (and, hence, behaviors). And since learning the objective function (what the human user wants) is fundamentally about humans, this work proposes that only the human can determine what actually matters vs. what is spurious. The research will introduce the problem of aligning robot representations to humans. The key observation behind the project is that traditional input used in learning, such as demonstrations or comparisons, which is designed to teach the robot the full task, is not ideal for aligning the robot’s representation. With representation alignment defined as a problem, there is the opportunity to design new types of human feedback that help the robot explicitly isolate the right representation. The project will develop new types of human feedback and algorithms for efficiently learning from them to arrive at an aligned representation.
Preliminary work leveraged this observation to introduce feature traces, a novel type of human input through which users can teach the robot about specific features they care about. The project will pursue four objectives that together tackle the aspects of aligning robot representations with humans: (1) Teaching one feature at a time, beyond feature traces: It will investigate new input types for aligning robot representations with users, contribute active learning algorithms that help the human teacher provide the most informative input, and build transparency tools that enable robots to teach back to the user their current understanding of the representation. (2) Extracting features all at once from new, representation-specific human input: It will investigate new human input types that teach the full representation all at once by combining self-supervised representation learning methods with human-centric representation learning. (3) Using a correct representation in the right way: Given a new task, the robot needs to learn which features matter and in which contexts. (4) Extending earlier work to policy learning: It will extend new tools to the policy learning setting and use the lens of human-aligned representations to enable better policy generalization to new users and to improve goal mis-generalization in reinforcement learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available

Data last updated: 2024-06-01

Other publications by Anca Dragan

Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
  • DOI:
  • Year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Vivek Myers;Chongyi Zheng;Anca Dragan;Sergey Levine;Benjamin Eysenbach
  • Corresponding author:
    Benjamin Eysenbach
When Your AIs Deceive You: Challenges with Partial Observability of Human Evaluators in Reward Learning
  • DOI:
    10.48550/arxiv.2402.17747
  • Year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Leon Lang;Davis Foote;Stuart J. Russell;Anca Dragan;Erik Jenner;Scott Emmons
  • Corresponding author:
    Scott Emmons
Adversaries Can Misuse Combinations of Safe Models
  • DOI:
  • Year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Erik Jones;Anca Dragan;Jacob Steinhardt
  • Corresponding author:
    Jacob Steinhardt

Other grants by Anca Dragan

CAREER: Towards Autonomously Generating Robot Behavior for Coordination with Humans -- Accounting for Effects on Human Actions
  • Grant number:
    1652083
  • Fiscal year:
    2017
  • Funding amount:
    $420,500
  • Project type:
    Continuing Grant

Similar NSFC (National Natural Science Foundation of China) grants

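Study of the mechanism by which iRGD-conjugated nanocarriers dual-targeting ITGA5/NRP-1 inhibit HCC invasion and cell spatial interactions
  • Grant number:
    82360569
  • Approval year:
    2023
  • Funding amount:
    CNY 320,000
  • Project type:
    Regional Science Fund Project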
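Mechanism of 90Y combined with Flt3L as an in situ vaccine, together with ICIs, in the treatment of HBV-related HCC
  • Grant number:
    82372067
  • Approval year:
    2023
  • Funding amount:
    CNY 550,000
  • Project type:
    General Program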
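Mechanism by which tRNAMet promotes HCC development by regulating protein translation of AUG-codon-rich genes
  • Grant number:
    82373963
  • Approval year:
    2023
  • Funding amount:
    CNY 480,000
  • Project type:
    General Program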
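Mechanism by which ACSM3 resists NAFLD-HCC progression by enhancing mitochondrial metabolic pathways
  • Grant number:
    82303234
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund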
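Mechanism by which tumor-targeting anti-miRNAs upregulate the JAK1/STAT1 pathway to increase HCC sensitivity to combination immunotherapy
  • Grant number:
    82373257
  • Approval year:
    2023
  • Funding amount:
    CNY 480,000
  • Project type:
    General Program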

Similar overseas grants

Collaborative Research: HCC: Small: End-User Guided Search and Optimization for Accessible Product Customization and Design
  • Grant number:
    2327136
  • Fiscal year:
    2023
  • Funding amount:
    $420,500
  • Project type:
    Standard Grant
Collaborative Research: HCC: Small: Bridging Research and Visualization Design Practice via a Sustainable Knowledge Platform
  • Grant number:
    2147044
  • Fiscal year:
    2023
  • Funding amount:
    $420,500
  • Project type:
    Standard Grant
Collaborative Research: HCC: Small: Computational Design and Application of Wearable Haptic Knits
  • Grant number:
    2301355
  • Fiscal year:
    2023
  • Funding amount:
    $420,500
  • Project type:
    Standard Grant
Collaborative Research: NSF-CSIRO: HCC: Small: Understanding Bias in AI Models for the Prediction of Infectious Disease Spread
  • Grant number:
    2302969
  • Fiscal year:
    2023
  • Funding amount:
    $420,500
  • Project type:
    Standard Grant
Collaborative Research: HCC: Small: Understanding Online-to-Offline Sexual Violence through Data Donation from Users
  • Grant number:
    2401775
  • Fiscal year:
    2023
  • Funding amount:
    $420,500
  • Project type:
    Standard Grant