RI: Small: Extracting Knowledge from Language Models for Decision Making

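Basic Information

Award number:
2246811
Principal investigator:
Sergey Levine
Amount:
$600,000
Host institution:
University of California-Berkeley
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Start and end dates:
2023-09-15 to 2026-08-31
Project status:
Ongoing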

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2246811&HistoricalAwards=false
Keywords:
RI Small Extracting Knowledge Language

Project Abstract

This project aims to integrate semantic knowledge from large language models into automated decision-making and control systems while retaining reliability and robustness. The main principle of the proposed approach is to use language models to generate proposals and guidance, but still make the final decision or plan based on principled and robust planning and control methods, such that the language models are used when their semantic predictions are useful but not relied upon to always yield the correct answer. Large language models, such as ChatGPT, have garnered considerable attention in recent years due to their ability to respond to complex user queries and fulfill elaborate requests, such as writing code, composing stories, or providing educational explanations. Because of this, there is considerable interest in using them directly as decision-making systems (for example, if a language model can give “how to” instructions for repairing a car, perhaps it can also issue commands to a robot that actually repairs a car). However, there are also numerous concerns that such models might be too unreliable or too prone to generate false predictions to be useful as decision-making systems on their own. Therefore, this project aims to integrate these models into principled methods for planning and control to leverage the semantic knowledge in these models while providing a degree of robustness. This research has significant ramifications for automated decision-making systems that need to interact with complex real-world environments, where both semantic reasoning and intelligent planning are important. This includes robotic systems, including autonomous vehicles and service robots, intelligent assistants, decision support systems, and a range of automation technologies.
The technical approach in this project will be based around a probabilistic formulation that ties together the ungrounded semantic predictions from language models with grounded but non-semantic predictions from learned dynamics models. In this way, probabilistic inference machinery can be used to derive algorithms that make decisions that have a high likelihood of being semantically good according to the language model and a high likelihood of being physically (dynamically) optimal according to the learned dynamics model. In practice, this principle can be instantiated in the context of both model-based and model-free reinforcement-learning systems, learned prediction systems, and planning algorithms (by formulating planning as inference). The project will explore applications of this concept to prediction, planning and control, and exploration in reinforcement learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
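
The abstract's idea of choosing decisions that are likely under both a language model and a learned dynamics model can be illustrated with a minimal sketch. This is not the project's algorithm: the function name select_plan, the candidate plans, and all probabilities below are hypothetical stand-ins chosen for illustration, and the two models are replaced by fixed lookup tables. The sketch assumes the two likelihoods are combined as a product, so their log-probabilities are added and the highest-scoring candidate is selected.

```python
import math

def select_plan(candidates, lm_log_prob, dynamics_log_prob):
    """Pick the candidate with the highest joint log-likelihood.

    Assumption: the joint score is the product of the two likelihoods,
    i.e. the sum of their log-probabilities.
    """
    def joint_score(plan):
        return lm_log_prob(plan) + dynamics_log_prob(plan)
    return max(candidates, key=joint_score)

# Hypothetical toy numbers standing in for real model outputs.
# "Semantic" likelihood: how sensible the plan sounds to a language model.
LM_PROBS = {
    "walk through the wall": 0.5,
    "open the door, then walk through": 0.4,
    "stand still": 0.1,
}
# "Physical" likelihood: how feasible the plan is under a dynamics model.
DYN_PROBS = {
    "walk through the wall": 0.01,
    "open the door, then walk through": 0.8,
    "stand still": 0.99,
}

def lm_log_prob(plan):
    return math.log(LM_PROBS[plan])

def dynamics_log_prob(plan):
    return math.log(DYN_PROBS[plan])

candidates = list(LM_PROBS)
best = select_plan(candidates, lm_log_prob, dynamics_log_prob)
lm_only = max(candidates, key=lm_log_prob)
dynamics_only = max(candidates, key=dynamics_log_prob)
print(best)
```

In this toy setting the language model alone would favor a physically infeasible plan and the dynamics model alone would favor a trivially feasible but useless one; the combined score selects a plan that is acceptable under both.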

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Sergey Levine

Goal-oriented Vision-and-Dialog Navigation through Reinforcement Learning

DOI:
Publication year:
2022
Journal:
Impact factor:
0
Authors:
Peter Anderson;Qi Wu;Damien Teney;Jake Bruce;Mark Johnson;Niko Sünderhauf;Ian D. Reid;F. Bonin;Alberto Ortiz;Angel X. Chang;Angela Dai;T. Funkhouser;Ma;Matthias Niebner;M. Savva;David Chen;Raymond Mooney. 2011;Learning;Howard Chen;Alane Suhr;Dipendra Kumar Misra;T. Kollar;Nicholas Roy;Trajectory;Satwik Kottur;José M. F. Moura;Dhruv Devi Parikh;Sergey Levine;Chelsea Finn;Trevor Darrell;Jianfeng Li;Gao Yun;Chen;Ziming Li;Sungjin Lee;Baolin Peng;Jinchao Li;Julia Kiseleva;M. D. Rijke;Shahin Shayandeh;Weixin Liang;Youzhi Tian;Cheng;Yitao Liang;Marlos C. Machado;Erik Talvitie;Chih;Jiasen Lu;Zuxuan Wu;G. Al
Corresponding author:
G. Al

Is Value Learning Really the Main Bottleneck in Offline RL?

DOI:
Publication year:
2024
Journal:
Impact factor:
0
Authors:
Seohong Park;Kevin Frans;Sergey Levine;Aviral Kumar
Corresponding author:
Aviral Kumar

Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion

DOI:
Publication year:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Laura M. Smith;Yunhao Cao;Sergey Levine
Corresponding author:
Sergey Levine

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

DOI:
Publication year:
2024
Journal:
International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
J. Kuba;Masatoshi Uehara;Pieter Abbeel;Sergey Levine
Corresponding author:
Sergey Levine

REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation

DOI:
10.48550/arxiv.2309.03322
Publication year:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Zheyuan Hu;Aaron Rovinsky;Jianlan Luo;Vikash Kumar;Abhishek Gupta;Sergey Levine
Corresponding author:
Sergey Levine