Collaborative Research: SCALE MoDL: Adaptivity of Deep Neural Networks

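Basic Information

Award number:
2134145
Principal investigator:
Yingbin Liang
Amount:
$300,000
Host institution:
Ohio State University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-10-01 to 2024-09-30
Project status:
Completed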

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2134145&HistoricalAwards=false
Keywords:
Collaborative Research SCALE MoDL Adaptivity

Project Abstract

The overarching theme of the project is to systematically expand understanding of how deep neural networks (DNNs) work and why or when they are better than classical methods, through the lens of "adaptivity." Adaptivity refers to the properties of an algorithm that take advantage of favorable structures in the input data without knowing that these structures exist. That is, adaptive algorithms are those that are free of tuning parameters and can automatically configure themselves to adapt to each input dataset. The anticipated outcome of the project includes a new theory that explains and quantifies the adaptivity of popular DNN models such as multi-layer perceptrons, self-attention mechanisms (namely, transformer models), and meta-learning. The theory could result in substantial savings in the statistical and computational complexity of these models, allowing them to be applied in resource-constrained settings and to have a more environmentally friendly energy footprint. This project will also provide opportunities for students and postdocs to explore interdisciplinary research topics related to deep learning.

Specifically, this project investigates (1) the "local adaptivity" of DNNs in estimating functions from noisy data; (2) the "relational adaptivity" of the self-attention mechanism that parses a structured data point (such as an image or a chunk of text); and (3) the "task adaptivity" of multi-task and meta-learning algorithms that learn to share information across multiple tasks. The research covers some of the most popular DNN models. Technically, the project leverages multiple branches of mathematics (such as function classes, nonparametric statistics, statistical learning theory, optimization, and compressed sensing) and involves innovations in the approximation-theoretic understanding, algorithmic insights, and statistical theory of DNNs. The new analytical tools to be developed are also of independent interest to the broader machine learning theory community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
