CAREER: Random Neural Nets and Random Matrix Products
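Basic information
- Award number: 2143754
- Principal investigator:
- Amount: $577,200
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-06-01 to 2027-05-31
- Status: Ongoing
- Source:
- Keywords:
Project abstract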
We live in an era of big data and inexpensive computation. Vast stores of information can efficiently be analyzed for underlying patterns by machine learning algorithms, leading to remarkable progress in applications ranging from self-driving cars to automatic drug discovery and machine translation. Underpinning many of these exciting practical developments is a class of computational models called neural networks. Originally developed in the 1940s and 1950s, the neural nets used today are as complex as they are powerful. The purpose of this project is to develop a range of principled techniques for understanding key aspects of how neural networks work in practice and how to make them better. The approach taken by this project is probabilistic and statistical in nature. Just as the ideal gas law accurately describes the large-scale properties of a gas directly through pressure, volume, and temperature without the need to specify the state of each individual gas molecule, this project will explore and identify emergent statistical behaviors of large neural networks that provably explain many of their key properties observed in practice. The project will also provide research training and educational opportunities through organization of summer schools in machine learning for graduate students.
At a high level, a neural network is a family of functions given by composing affine transformations with elementary non-linear operations. The simplest important kind of neural networks are roughly described by two parameters called depth and width. Depth is the number of compositions, and width is the dimension of the spaces on which the affine transformations act. The technical heart of this project is to understand the statistical behavior of such networks when the affine transformations are chosen at random. The starting point is an analytically tractable regime in which the network width is sent to infinity at fixed depth. In this infinite width limit, random networks converge to Gaussian processes and optimization of network parameters from their randomly chosen starting points reduces to a kernel method. Unfortunately, this concise description cannot capture what is perhaps the most important empirical property of neural networks, namely their ability to learn data-dependent features. Understanding how feature learning occurs is at the core of this project and requires new probabilistic and analytic tools for studying random neural networks at finite width. The basic idea is to perform perturbation theory around the infinite width limit, treating the reciprocal of the network width as a small parameter. The goal is then to obtain, to all orders in this reciprocal, expressions for the joint distribution of the values and derivatives (with respect to both model inputs and model parameters) of a random neural network. Such formulas have practical consequences for understanding the numerical stability of neural network training, suggesting principled settings for optimization hyper-parameters, and quantifying feature learning. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
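The infinite-width statement in the abstract can be probed numerically. The following is a minimal sketch, not the project's method: it assumes a fully connected ReLU network with zero biases, hidden weights drawn from N(0, 2/fan_in), and output weights drawn from N(0, 1/fan_in), and it uses the excess kurtosis of the scalar output (which is 0 for a Gaussian) as a crude measure of non-Gaussianity. The function names `random_relu_net_output`, `excess_kurtosis`, and `kurtosis_at_width` are hypothetical and chosen only for illustration.

```python
import math
import random


def random_relu_net_output(x, width, depth, rng):
    """Evaluate one freshly sampled random ReLU network at the input x.

    Assumed setup (not taken from the abstract): `depth` hidden layers of
    size `width`, zero biases, hidden weights ~ N(0, 2 / fan_in), and a
    scalar output layer with weights ~ N(0, 1 / fan_in).
    """
    h = list(x)
    for _ in range(depth):
        std = math.sqrt(2.0 / len(h))
        # Each new neuron applies ReLU to a random linear combination of h.
        h = [max(0.0, sum(rng.gauss(0.0, std) * v for v in h))
             for _ in range(width)]
    std = math.sqrt(1.0 / len(h))
    return sum(rng.gauss(0.0, std) * v for v in h)


def excess_kurtosis(samples):
    """Sample excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0


def kurtosis_at_width(width, depth=1, samples=2000, seed=0):
    """Estimate the output's excess kurtosis over random initializations."""
    rng = random.Random(seed)
    x = [1.0, -0.5, 2.0]  # arbitrary fixed input, chosen for illustration
    outputs = [random_relu_net_output(x, width, depth, rng)
               for _ in range(samples)]
    return excess_kurtosis(outputs)


if __name__ == "__main__":
    for width in (2, 8, 64):
        print(width, kurtosis_at_width(width))
```

Under these assumptions, and for a single hidden layer, a short calculation gives an excess kurtosis of 15/width, so the printed values should shrink roughly like the reciprocal of the width, up to sampling noise. This is the kind of finite-width correction that the perturbative expansion described in the abstract is meant to capture systematically.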
Project outputs
Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Deep ReLU Networks Preserve Expected Length
- DOI:
- Published: 2022-01
- Journal:
- Impact factor: 0
- Authors: Hanin, Boris; Rolnick, David; Jeong, Ryan
- Corresponding author: Jeong, Ryan
Deep ReLU Networks Preserve Expected Length
- DOI:
- Published: 2021-01
- Journal:
- Impact factor: 0
- Authors: Hanin, B.; Jeong, R.; Rolnick, D.
- Corresponding author: Rolnick, D.
Random neural networks in the infinite width limit as Gaussian processes
- DOI: 10.1214/23-aap1933
- Published: 2023-12
- Journal:
- Impact factor: 0
- Authors: Hanin, Boris
- Corresponding author: Hanin, Boris
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
- DOI: 10.48550/arxiv.2309.16620
- Published: 2023-09-28
- Journal:
- Impact factor: 0
- Authors: Blake Bordelon; Lorenzo Noci; Mufan Bill Li; Boris Hanin; C. Pehlevan
- Corresponding author: C. Pehlevan
Maximal Initial Learning Rates in Deep ReLU Networks
- DOI:
- Published: 2023-07
- Journal:
- Impact factor: 0
- Authors: Iyer, Gaurav; Hanin, Boris; Rolnick, David
- Corresponding author: Rolnick, David
Other publications by Boris Hanin
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
- DOI:
- Published: 2024-03-04
- Journal:
- Impact factor: 0
- Authors: Lingjiao Chen; Jared Quincy Davis; Boris Hanin; Peter D. Bailis; Ion Stoica; Matei Zaharia; James Zou
- Corresponding author: James Zou
Les Houches Lectures on Deep Learning at Large & Infinite Width
- DOI: 10.48550/arxiv.2309.01592
- Published: 2023-09-04
- Journal:
- Impact factor: 0
- Authors: Yasaman Bahri; Boris Hanin; Antonin Brossollet; Vittorio Erba; Christian Keup; Rosalba Pacelli; James B. Simon
- Corresponding author: James B. Simon
Quantitative CLTs in Deep Neural Networks
- DOI: 10.48550/arxiv.2307.06092
- Published: 2023-07-12
- Journal:
- Impact factor: 0
- Authors: Stefano Favaro; Boris Hanin; Domenico Marinucci; I. Nourdin; G. Peccati
- Corresponding author: G. Peccati
Principled Architecture-aware Scaling of Hyperparameters
- DOI: 10.48550/arxiv.2402.17440
- Published: 2024-02-27
- Journal:
- Impact factor: 0
- Authors: Wuyang Chen; Junru Wu; Zhangyang Wang; Boris Hanin
- Corresponding author: Boris Hanin
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
- DOI: 10.48550/arxiv.2403.02419
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Lingjiao Chen; Jared Quincy Davis; Boris Hanin; Peter D. Bailis; Ion Stoica; Matei Zaharia; James Zou
- Corresponding author: James Zou
Other grants held by Boris Hanin
Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications
- Award number: 2133806
- Fiscal year: 2022
- Funding amount: $577,200
- Project type: Standard Grant
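Similar grants from the National Natural Science Foundation of China
Fundamental research on the stochastic dynamics of skyrmions and devices for their application to neuromorphic computing
- Award number: 12304160
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund project
Random vibration of high-dimensional strongly nonlinear systems based on stochastic averaging and neural networks
- Award number: 12302039
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund project
Bifurcations of high-dimensional stochastic dynamics under non-Gaussian noise and their application to neural conduction models
- Award number:
- Approval year: 2022
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund project
Dynamical analysis and synchronization control of memristive neural networks with discontinuous right-hand sides and random perturbations
- Award number:
- Approval year: 2022
- Funding amount: CNY 330,000
- Project type: Regional Science Fund project
A neural network framework for direct control of high-dimensional strongly nonlinear stochastic dynamical systems
- Award number:
- Approval year: 2022
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund project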
Similar overseas grants
Using Artificial Intelligence to Predict Cognitive Training Response in Amnestic Mild Cognitive Impairment
- Award number: 10572105
- Fiscal year: 2023
- Funding amount: $577,200
- Project type:
Training of machine learning algorithms for the classification of accelerometer-measured bednet use and related behaviors associated with malaria risk
- Award number: 10727374
- Fiscal year: 2023
- Funding amount: $577,200
- Project type:
Physical activity over the adult life course and cognitive resilience to Alzheimer's disease and related dementias
- Award number: 10572340
- Fiscal year: 2022
- Funding amount: $577,200
- Project type:
Multi-target repetitive Transcranial Magnetic Stimulation (rTMS) treatment for Major Depressive Disorder (MDD) and comorbid pain
- Award number: 10600978
- Fiscal year: 2022
- Funding amount: $577,200
- Project type:
Data to Clinical Action: Using Predictive Analytics to Improve Care of Veterans with Opioid Use Disorder
- Award number: 10317224
- Fiscal year: 2022
- Funding amount: $577,200
- Project type: