Collaborative Research: New Perspectives on Deep Learning: Bridging Approximation, Statistical, and Algorithmic Theories

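Basic Information

Award number:
2134077
Principal investigator:
Guergana Petrova
Amount:
$225,000
Host institution:
Texas A&M University
Host institution country:
United States
Project type:
Standard Grant
Fiscal year:
2021
Funding country:
United States
Start and end dates:
2021-11-01 to 2024-10-31
Project status:
Completed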

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2134077&HistoricalAwards=false
Keywords:
Collaborative Research New Perspectives Deep

Project Abstract

Deep Learning (DL) has led to a renaissance in neural network methods in data-driven science and engineering. The development of DL systems and applications, including computer vision and natural language understanding, has been led primarily by experiments and engineering practice. Mathematical analysis has only begun to provide insights into these complex machine learning systems. The lack of basic understanding has contributed to serious challenges and shortcomings ranging from the fragility and susceptibility to corrupted data to their uninterpretable behaviors. These problems can be traced to fundamental gaps in the mathematical understanding of DL. This project tackles this challenge by bringing approximation, statistical, and algorithmic theories together to develop new mathematical foundations for DL. The goals of the project are to mathematically characterize the strengths and limitations of DL models, and to understand the properties of DL models trained using examples of desired behavior (training data) as well as the tradeoffs between the performance of DL systems and the training dataset size. While DL is already in widespread use, the continued success of DL requires far more complete mathematical understandings and principled approaches to guide its use and reliable application. The project will provide practitioners with clearer guidance on the strengths, limitations, and best approaches to using DL. Broader impacts of the project also include education and mentoring, including the training of graduate students in mathematical fields such as approximation theory, signal processing, statistics, and machine learning and, most importantly, how these fields collectively inform the theory and practice of DL.

DL seeks to learn an unknown function from data using compositions (layers) of linear combinations of simple functions (neurons). The shortcomings of DL can be traced to fundamental gaps in its mathematical theory including the following issues. The function spaces that capture the salient properties of DL applications are poorly understood. The characteristics of functions learned through neural network training are mysterious. The ability of DL models to discriminate between data distributions has not yet been quantified satisfactorily. Understanding of the tradeoffs between accuracy and training set size is lacking. This project tackles these challenges by bringing approximation, statistical, and algorithmic theories together to develop new theoretical foundations for DL. This project builds innovative bridges between approximation theory, nonparametric statistics, learning theory and algorithms to develop new mathematical foundations for DL. This includes the development of new model classes of functions that are naturally suited to characterize the properties, strengths, and limitations of deep neural network architectures and applications; novel approaches to understand the roles of regularization and sparsity in DL; fundamental frameworks to quantify the discrimination power of DL and generalized adversarial networks; and innovative theory to make DL algorithms more data efficient through the use of side-information, partial differential equations, and richer forms of data than the conventional function evaluations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (5)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Lipschitz widths

DOI:
Publication date:
2022
Journal:
Constructive Approximation
Impact factor:
2.7
Authors:
G. Petrova, P. Wojtaszczyk
Corresponding authors:
G. Petrova, P. Wojtaszczyk

On the entropy numbers and the Kolmogorov widths

DOI:
10.48550/arxiv.2203.00605
Publication date:
2022-02
Journal:
arXiv
Impact factor:
0
Authors:
G. Petrova, P. Wojtaszczyk
Corresponding authors:
G. Petrova, P. Wojtaszczyk

Neural Network Approximation of Refinable Functions

DOI:
10.1109/tit.2022.3199601
Publication date:
2022
Journal:
IEEE Transactions on Information Theory
Impact factor:
2.5
Authors:
Daubechies, Ingrid; DeVore, Ronald; Dym, Nadav; Faigenbaum-Golovin, Shira; Kovalsky, Shahar Z.; Lin, Kung-Chin; Park, Josiah; Petrova, Guergana; Sober, Barak
Corresponding author:
Sober, Barak

Optimal Learning