CAREER: Controllable generation for instruction-following language models

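Basic information

  • Grant number:
    2338866
  • Principal investigator:
  • Amount:
    $531,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2024
  • Funding country:
    United States
  • Project period:
    2024-04-15 to 2029-03-31
  • Project status:
    Active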

Project abstract

Instruction-following language models like ChatGPT are beginning to see widespread development, and the ability to understand these systems and control them is critically important to make sure that they benefit society. Despite the success of these language models in generating fluent and convincing-looking outputs, there has been a growing body of work indicating that these systems can generate outputs that are undesirable to users, model creators, and even society at large. This gap between the ability to create models that imitate humans and the inability to have them fulfill specific desiderata (e.g., refuse to generate incorrect information) shows a major deficiency in the ability to precisely control these systems. This project aims to build principled, transparent, and precise methods for controlling language models.

To achieve these goals, this project views controllable generation as a viable long-term path to creating instruction-following language models that precisely follow our design goals. Controllable generation offers several benefits. First, it defines a precise statistical modeling problem on which it is possible to build principled methods and rigorous evaluations. Second, it separates the control target from the task, improving transparency by allowing users to see exactly what is being optimized by the model designers. Third, it enables much more precise controls via inference-time methods such as rejection sampling, which strictly enforces the control as a constraint. While controllable generation has major long-term benefits for language models, there also remain significant open problems that must be resolved first, including the difficulty of performing discrete search, the need for specialized training, and the lack of realistic benchmarks of control tasks in the wild. We will address these challenges through a combination of new models (such as diffusion-based models), zero-shot and decoder-based control methods, and a broad benchmark of in-the-wild control behaviors.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
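
The abstract names rejection sampling as an inference-time method that enforces a control as a hard constraint. The following is a minimal sketch of that idea, not the project's method: `toy_model`, `avoids_wrong`, the vocabulary, and the `max_tries` budget are all hypothetical stand-ins chosen for illustration, with a uniform word sampler in place of a real language model.

```python
import random
from typing import Callable, Optional


def rejection_sample(
    propose: Callable[[], str],
    satisfies: Callable[[str], bool],
    max_tries: int = 1000,
) -> Optional[str]:
    """Return the first proposal that satisfies the constraint, or None."""
    for _ in range(max_tries):
        candidate = propose()
        if satisfies(candidate):
            return candidate  # accepted: the constraint holds for this output
    return None  # budget exhausted without an acceptable sample


# Hypothetical stand-in for a language model: samples 4 words uniformly.
rng = random.Random(0)
VOCAB = ["the", "model", "answers", "refuses", "wrong", "right"]


def toy_model() -> str:
    return " ".join(rng.choice(VOCAB) for _ in range(4))


def avoids_wrong(text: str) -> bool:
    """Hypothetical control target: the output must not contain 'wrong'."""
    return "wrong" not in text.split()


sample = rejection_sample(toy_model, avoids_wrong)
print(sample)
```

Any returned sample satisfies the constraint by construction, which is the sense in which the control is strictly enforced. The cost is extra sampling: the lower the acceptance rate, the more proposals are discarded, and `None` is returned if the budget runs out.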

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Tatsunori Hashimoto

Benchmarking and Improving Generator-Validator Consistency of Language Models
  • DOI:
    10.48550/arxiv.2310.01846
  • Published:
    2023-10-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiang Lisa Li;Vaishnavi Shrivastava;Siyan Li;Tatsunori Hashimoto;Percy Liang
  • Corresponding author:
    Percy Liang
The Troubling Emergence of Hallucination in Large Language Models - An Extensive Definition, Quantification, and Prescriptive Remediations
  • DOI:
    10.48550/arxiv.2310.04988
  • Published:
    2023-10-08
  • Journal:
  • Impact factor:
    0
  • Authors:
    Vipula Rawte;Swagata Chakraborty;Agnibh Pathak;Anubhav Sarkar;S.M. Towhidul Islam Tonmoy;Aman Chadha;Amit P. Sheth;Amitava Das
  • Corresponding author:
On the Opportunities and Risks of Foundation Models
  • DOI:
    10.48550/arxiv.2108.07258
  • Published:
    2021-08-16
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rishi Bommasani;Drew A. Hudson;E. Adeli;R. Altman;Simran Arora;Sydney von Arx;Michael S. Bernstein;J. Bohg;Antoine Bosselut;E. Brunskill;Erik Brynjolfsson;S. Buch;Dallas Card;Rodrigo Castellon;Niladri S. Chatterji;Annie S. Chen;Kathleen A. Creel;Jared Davis;Dora Demszky;Chris Donahue;Moussa Doumbouya;Esin Durmus;Stefano Ermon;J. Etchemendy;Kawin Ethayarajh;L. Fei-Fei;Chelsea Finn;Trevor Gale;Lauren Gillespie;Karan Goel;Noah D. Goodman;S. Grossman;Neel Guha;Tatsunori Hashimoto;Peter Henderson;John Hewitt;Daniel E. Ho;Jenny Hong;Kyle Hsu;Jing Huang;Thomas F. Icard;Saahil Jain;Dan Jurafsky;Pratyusha Kalluri;Siddharth Karamcheti;G. Keeling;Fereshte Khani;O. Khattab;Pang Wei Koh;M. Krass;Ranjay Krishna;Rohith Kuditipudi;Ananya Kumar;Faisal Ladhak;Mina Lee;Tony Lee;J. Leskovec;Isabelle Levent;Xiang Lisa Li;Xuechen Li;Tengyu Ma;Ali Malik;Christopher D. Manning;Suvir Mirchandani;E. Mitchell;Zanele Munyikwa;Suraj Nair;A. Narayan;D. Narayanan;Benjamin Newman;Allen Nie;Juan Carlos Niebles;H. Nilforoshan;J. Nyarko;Giray Ogut;Laurel J. Orr;Isabel Papadimitriou;J. Park;C. Piech;Eva Portelance;Christopher Potts;Aditi Raghunathan;Robert Reich;Hongyu Ren;Frieda Rong;Yusuf H. Roohani;Camilo Ruiz;Jack Ryan;Christopher Ré;Dorsa Sadigh;Shiori Sagawa;Keshav Santhanam;Andy Shih;K. Srinivasan;Alex Tamkin;Rohan Taori;A. Thomas;Florian Tramèr;Rose E. Wang;William Wang;Bohan Wu;Jiajun Wu;Yuhuai Wu;Sang Michael Xie;Michihiro Yasunaga;Jiaxuan You;M. Zaharia;Michael Zhang;Tianyi Zhang;Xikun Zhang;Yuhui Zhang;Lucia Zheng;Kaitlyn Zhou;Percy Liang
  • Corresponding author:
    Percy Liang
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
  • DOI:
    10.48550/arxiv.2309.15817
  • Published:
    2023-09-25
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yangjun Ruan;Honghua Dong;Andrew Wang;Silviu Pitis;Yongchao Zhou;Jimmy Ba;Yann Dubois;Chris J. Maddison;Tatsunori Hashimoto
  • Corresponding author:
    Tatsunori Hashimoto
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions
  • DOI:
    10.48550/arxiv.2309.07875
  • Published:
    2023-09-14
  • Journal:
  • Impact factor:
    0
  • Authors:
    Federico Bianchi;Mirac Suzgun;Giuseppe Attanasio;Paul Röttger;Dan Jurafsky;Tatsunori Hashimoto;James Zou
  • Corresponding author:
    James Zou

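Similar NSFC grants

Controllable construction and responsive deformation mechanism of knitted-structure actuators based on liquid crystal elastomer fibers
  • Grant number:
    52303147
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Efficient and controllable synthesis of lysergic acid and its intermediates in yeast
  • Grant number:
    82304357
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Controllable preparation of carbonyl covalent organic framework cathode materials and their application in rechargeable aqueous aluminum batteries
  • Grant number:
    22309208
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Accountable, fine-grained, controllable editing and sharing techniques for cloud data
  • Grant number:
    62302152
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Steric-hindrance-assisted controllable construction of conjugated supramolecular hierarchical, cross-scale ordered materials
  • Grant number:
    22372055
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program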

Similar overseas grants

CAREER: An Integrated Framework for Controllable Text Generation
  • Grant number:
    2144493
  • Fiscal year:
    2022
  • Funding amount:
    $531,300
  • Project type:
    Continuing Grant
Controllable text generation: toward non-toxic, unbiased and factual language models for sensitive applications
  • Grant number:
    2902174
  • Fiscal year:
    2022
  • Funding amount:
    $531,300
  • Project type:
    Studentship
Engineering Efficient and Controllable Base Editors
  • Grant number:
    10209723
  • Fiscal year:
    2021
  • Funding amount:
    $531,300
  • Project type:
Bioengineering programmable and drug-controllable synthetic receptors for tunable CAR-T cell behaviors
  • Grant number:
    10617657
  • Fiscal year:
    2021
  • Funding amount:
    $531,300
  • Project type:
Engineering Efficient and Controllable Base Editors
  • Grant number:
    10609857
  • Fiscal year:
    2021
  • Funding amount:
    $531,300
  • Project type: