Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications
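Basic information
- Award number: 2133806
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Duration: 2022-01-01 to 2024-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract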
One of the most exciting technical developments of the last decade is the widespread adoption of a family of algorithms called neural networks, used in cutting-edge industrial applications ranging from self-driving cars to predicting the three-dimensional shapes of proteins from their amino acid sequences. The goals of this project are twofold. First, the investigators seek to use tools from mathematics (specifically probability and combinatorics) to better understand how neural networks behave and then to fashion this understanding into new, more efficient, and safer algorithms. This involves a collaborative effort between mathematicians, computer scientists, and electrical engineers. The project team seeks to unravel a fundamental mystery: why is it that neural networks appear to be incredibly complex, yet despite their seeming intricacy, still learn parsimonious and useful ways of making predictions? Put another way, the investigators aim to define and analyze different mathematical notions of neural network complexity and then to use them as theoretically grounded guides in the search for ever more efficient and interpretable algorithms related to neural networks. The second goal is to create a series of educational resources, ranging from videos to course notes, that will enable various segments of society at large (e.g., students, policy makers, scientists, and so on) to engage with and get a usable appreciation for the ideas, challenges, and opportunities surrounding modern neural networks. The research in this project consists of three interconnected parts. The first is a probabilistic analysis of a variety of neural network complexity measures before, during, and after training. Relevant tools come from probability, functional analysis, information theory, and geometry. Key theoretical questions include quantifying implicit bias and bounding generalization error for learning structured functions.
The second is a topological and geometric analysis of both individual ReLU network functions and spaces of ReLU networks. Relevant tools come from Morse Theory and low-dimensional topology. Key theoretical questions hinge on understanding topological implicit bias and topological depth separation. Finally, the investigators seek theory-guided insights for applied deep learning via (i) principled, efficient neural architecture search using average case complexity measures as surrogates for practical expressivity, trainability, and generalization and (ii) novel approaches to model compression and scaling via topological expressivity of ReLU networks. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Maximal Initial Learning Rates in Deep ReLU Networks
- DOI: 10.48550/arxiv.2212.07295
- Published: 2022-12-14
- Journal:
- Impact factor: 0
- Authors: Gaurav M. Iyer; B. Hanin; D. Rolnick
- Corresponding author: D. Rolnick
Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
- DOI: 10.48550/arxiv.2205.05662
- Published: 2022-05-11
- Journal:
- Impact factor: 0
- Authors: Wuyang Chen; Wei Huang; Xinyu Gong; B. Hanin; Zhangyang Wang
- Corresponding author: Zhangyang Wang
Other publications by Boris Hanin
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
- DOI:
- Published: 2024-03-04
- Journal:
- Impact factor: 0
- Authors: Lingjiao Chen; Jared Quincy Davis; Boris Hanin; Peter D. Bailis; Ion Stoica; Matei Zaharia; James Zou
- Corresponding author: James Zou
Quantitative CLTs in Deep Neural Networks
- DOI: 10.48550/arxiv.2307.06092
- Published: 2023-07-12
- Journal:
- Impact factor: 0
- Authors: Stefano Favaro; Boris Hanin; Domenico Marinucci; I. Nourdin; G. Peccati
- Corresponding author: G. Peccati
Les Houches Lectures on Deep Learning at Large & Infinite Width
- DOI: 10.48550/arxiv.2309.01592
- Published: 2023-09-04
- Journal:
- Impact factor: 0
- Authors: Yasaman Bahri; Boris Hanin; Antonin Brossollet; Vittorio Erba; Christian Keup; Rosalba Pacelli; James B. Simon
- Corresponding author: James B. Simon
Principled Architecture-aware Scaling of Hyperparameters
- DOI: 10.48550/arxiv.2402.17440
- Published: 2024-02-27
- Journal:
- Impact factor: 0
- Authors: Wuyang Chen; Junru Wu; Zhangyang Wang; Boris Hanin
- Corresponding author: Boris Hanin
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
- DOI: 10.48550/arxiv.2309.16620
- Published: 2023-09-28
- Journal:
- Impact factor: 0
- Authors: Blake Bordelon; Lorenzo Noci; Mufan Bill Li; Boris Hanin; C. Pehlevan
- Corresponding author: C. Pehlevan
Other grants held by Boris Hanin
CAREER: Random Neural Nets and Random Matrix Products
- Award number: 2143754
- Fiscal year: 2022
- Amount: $500,000
- Project type: Continuing Grant
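Similar grants from the National Natural Science Foundation of China
Applications of free probability methods in the study of quantum structure theory
- Award number: 12171425
- Approval year: 2021
- Amount: 500,000 yuan
- Project type: General Program
Quantum correlations in the framework of generalized probability theory based on operator theory
- Award number: 11901421
- Approval year: 2019
- Amount: 230,000 yuan
- Project type: Young Scientists Fund
Theory of generalized DEA methods in uncertain environments and their application to evaluating economic efficiency in Inner Mongolia
- Award number: 71661027
- Approval year: 2016
- Amount: 275,000 yuan
- Project type: Regional Science Fund
Research on several classes of ABSDEs and other types of BSDEs
- Award number: 11626236
- Approval year: 2016
- Amount: 25,000 yuan
- Project type: Tianyuan Mathematics Fund
Tianyuan Mathematics Fund Graduate Summer School in Statistics 2015
- Award number: 11526007
- Approval year: 2015
- Amount: 510,000 yuan
- Project type: Tianyuan Mathematics Fund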
Similar overseas grants
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
- Award number: 2311295
- Fiscal year: 2023
- Amount: $500,000
- Project type: Continuing Grant
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
- Award number: 2311296
- Fiscal year: 2023
- Amount: $500,000
- Project type: Continuing Grant
RAPID/Collaborative Research: Advancing Probabilistic Fault Displacement Hazard Assessments by Collecting Perishable Data from the 2023 Turkiye Earthquake Sequence
- Award number: 2330153
- Fiscal year: 2023
- Amount: $500,000
- Project type: Standard Grant
RAPID/Collaborative Research: Advancing Probabilistic Fault Displacement Hazard Assessments by Collecting Perishable Data from the 2023 Turkiye Earthquake Sequence
- Award number: 2330152
- Fiscal year: 2023
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications
- Award number: 2133861
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant