Collaborative Research: RI: Medium: MoDL: Mathematical and Conceptual Understanding of Large Language Models
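Basic information
- Award number: 2211780
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-10-01 to 2025-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract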
Large language models (LLMs) have achieved unprecedented success in natural language processing (NLP). Because language models are increasingly seen as a cornerstone of artificial intelligence in the near future, there is a need to understand them and to convey that understanding to regulators as well as the general public. These models are based on deep neural networks that are trained on vast quantities of text and have been demonstrated to be highly useful in performing tasks such as question answering, text classification, machine translation and summarization. Despite this huge empirical success, there is little understanding of their inner workings. This project seeks to bridge the gap by developing a conceptual and mathematical understanding of how LLMs are trained and used. The project will also seek to develop and disseminate instructional materials and draw on ideas from the project to impact ongoing programs at the investigators' institutions to help increase participation in computing by individuals from underrepresented groups. The project has three components. (1) We will first build simplified generative models that capture the intrinsic structures of text, and analyze language models that are trained on texts from such generative models. (2) We then analyze why the learned language models can encode useful information that helps a wide range of downstream tasks. (3) Finally, we analyze and design new adaptation methods for downstream tasks with quantitative sample and computational efficiency guarantees.
Education and outreach plans are integrated into this project: the investigators will develop a new introductory course in machine learning and disseminate instructional materials, mentor graduate and undergraduate students from underrepresented groups (through Princeton Freshman Scholars Institute, Stanford Summer Teacher Research Program, REUs) and organize research workshops to promote conversations between the theoretical machine learning and NLP communities. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Tengyu Ma
Decomposing Overcomplete 3rd Order Tensors using Sum-of-Squares Algorithms
- DOI: 10.4230/lipics.approx-random.2015.829
- Year published: 2015
- Journal:
- Impact factor: 0
- Authors: Rong Ge; Tengyu Ma
- Corresponding author: Tengyu Ma
On the Performance of Thompson Sampling on Logistic Bandits
- DOI:
- Year published: 2019
- Journal:
- Impact factor: 0
- Authors: Shi Dong; Tengyu Ma; Benjamin Van Roy
- Corresponding author: Benjamin Van Roy
Learning Over-Parametrized Two-Layer Neural Networks beyond NTK
- DOI:
- Year published: 2020
- Journal:
- Impact factor: 0
- Authors: Yuanzhi Li; Tengyu Ma; Hongyang Zhang
- Corresponding author: Hongyang Zhang
Non-convex Optimization for Machine Learning: Design, Analysis, and Understanding
- DOI:
- Year published: 2017
- Journal:
- Impact factor: 0
- Authors: Tengyu Ma
- Corresponding author: Tengyu Ma
Earthquake prediction: nine major earthquakes in China (1966-1976)
- DOI:
- Date published: 1990-07
- Journal:
- Impact factor: 0
- Authors: Tengyu Ma
- Corresponding author: Tengyu Ma
Other grants by Tengyu Ma
Collaborative Research: CIF: Medium: MoDL: Toward a Mathematical Foundation of Deep Reinforcement Learning
- Award number: 2212263
- Fiscal year: 2022
- Amount: $400,000
- Award type: Standard Grant
CAREER: Toward a Comprehensive Generalization Theory for Deep Learning
- Award number: 2045685
- Fiscal year: 2021
- Amount: $400,000
- Award type: Continuing Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
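Study of how the extracellular domain of the transmembrane protein LRP5 regulates the membrane receptor TβRI to promote homing and differentiation of BMSCs on titanium surfaces
- Award number: 82301120
- Year approved: 2023
- Amount: 300,000 yuan
- Award type: Young Scientists Fund project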
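Mechanism by which Dectin-2 exacerbates asthma attacks by promoting FcεRI aggregation and mast cell activation
- Award number: 82300022
- Year approved: 2023
- Amount: 300,000 yuan
- Award type: Young Scientists Fund project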
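Role and mechanism of UFMylation of TβRI in regulating the TGF-β signaling pathway and breast cancer metastasis
- Award number: 32200568
- Year approved: 2022
- Amount: 300,000 yuan
- Award type: Young Scientists Fund project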
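Discovery of β-carboline alkaloid TβRI inhibitors from the Tibetan medicine Arenaria kansuensis and the mechanism of their anti-pulmonary-fibrosis effect
- Award number:
- Year approved: 2022
- Amount: 300,000 yuan
- Award type: Young Scientists Fund project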
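Discovery of β-carboline alkaloid TβRI inhibitors from the Tibetan medicine Arenaria kansuensis and the mechanism of their anti-pulmonary-fibrosis effect
- Award number: 82204762
- Year approved: 2022
- Amount: 300,000 yuan
- Award type: Young Scientists Fund project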
Similar overseas grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $400,000
- Award type: Continuing Grant
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award number: 2232298
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant