III: Small: Stochastic Algorithms for Large Scale Data Analysis

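Basic Information

Award number:
2131335
Principal investigator:
Arindam Banerjee
Amount:
$500,000
Host institution:
University of Illinois at Urbana-Champaign
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Start and end dates:
2021-05-01 to 2024-08-31
Project status:
Completed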

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2131335&HistoricalAwards=false
Keywords:
III Small Stochastic Algorithms Large

Project Abstract

Stochastic algorithms such as stochastic gradient descent (SGD) are the workhorse of modern data science. Such algorithms have played an important role in the success of deep learning. In spite of this empirical success, the behavior of SGD on the challenging non-convex optimization problems encountered in deep learning is shrouded in mystery. There is limited understanding of how SGD navigates non-convex loss landscapes, how bad local minima are avoided, and how deep models learned using SGD generalize well on future data. The project focuses on gaining a clearer understanding of SGD dynamics and generalization for non-convex problems arising in the context of deep learning. The project also uses the improved understanding to develop principled approaches to adaptively use validation sets to choose hyper-parameters and avoid overfitting. The insights gained from the technical advances are applied to the challenging scientific problem of sub-seasonal to seasonal (S2S) weather forecasting, which focuses on forecasting weather on a time frame of a few weeks to a few months. Advances in S2S forecasting are critically important to a wide variety of application domains including water resource management, agriculture, energy, aviation, maritime planning, and emergency planning. The project also engages the broader data science community, incorporating the gained insights for curricular enrichment, and broadening participation from underrepresented groups.

The project studies SGD dynamics with primary focus on the over-parameterized setting, i.e., where the number of samples is smaller than the number of parameters, which is typical for deep learning. The dynamics are carefully studied based on two key matrices: the Hessian of the non-convex loss function and the covariance matrix of the stochastic gradients, their eigen-spectra, and the overlap between their principal subspaces. Although the SGD dynamics happen in a high-dimensional space, the principal subspaces of these matrices can be low-dimensional. Tools from high-dimensional geometry and associated stochastic processes are utilized to characterize such low-dimensional dynamics in high-dimensional spaces. Principled approaches to explain the intriguing generalization behavior of deep learning models trained with SGD are also developed based on the properties of these matrices. Further, differential-privacy-based mechanisms are developed for adaptively using validation sets for choosing hyper-parameters and avoiding overfitting in deep learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
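The two matrices named in the abstract, the Hessian of the loss and the covariance of the stochastic gradients, can be illustrated on a toy problem. The sketch below is not the project's method: it assumes a convex least-squares loss (chosen only because its Hessian has a closed form), a hypothetical over-parameterized size of n = 5 samples and d = 20 parameters, and hypothetical helper names (`dot`, `outer_mean`, `top_eigenvector`). It builds both matrices and measures the overlap of their leading eigenvectors with plain power iteration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def outer_mean(vectors):
    """Average of the outer products v v^T over a list of vectors."""
    n, d = len(vectors), len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) / n for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iters=500):
    """Power iteration: approximate unit leading eigenvector of a symmetric PSD matrix."""
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        w = [dot(row, v) for row in matrix]
        norm = dot(w, w) ** 0.5
        if norm == 0.0:  # zero matrix or unlucky start: keep the last iterate
            return v
        v = [x / norm for x in w]
    return v


random.seed(0)
n, d = 5, 20  # assumed toy sizes: fewer samples than parameters
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
w = [random.gauss(0, 1) for _ in range(d)]  # current parameter vector

# Hessian of L(w) = (1/2n) * sum_i (x_i . w - y_i)^2 is (1/n) * sum_i x_i x_i^T.
hessian = outer_mean(X)

# Per-sample gradients g_i = (x_i . w - y_i) * x_i, then their covariance.
grads = [[(dot(x, w) - t) * xi for xi in x] for x, t in zip(X, y)]
mean_grad = [sum(g[j] for g in grads) / n for j in range(d)]
centered = [[gj - mj for gj, mj in zip(g, mean_grad)] for g in grads]
covariance = outer_mean(centered)

# Overlap of the leading eigenvectors: 0 means orthogonal, 1 means aligned.
u_h = top_eigenvector(hessian)
u_c = top_eigenvector(covariance)
overlap = abs(dot(u_h, u_c))
print(f"overlap of leading eigenvectors: {overlap:.3f}")
```

In this toy setting both matrices are built from only n vectors in a d-dimensional space, so each has rank at most n. That is a small instance of the abstract's remark that the principal subspaces can be low-dimensional even though the dynamics take place in a high-dimensional space.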

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Arindam Banerjee

Technology acceptance model and customer engagement: mediating role of customer satisfaction

DOI:
Year published:
2023
Journal:
Journal of Financial Services Marketing
Impact factor:
3
Authors:
R. P. Kumar;Arindam Banerjee;Zahran Al;S. Ananda
Corresponding author:
S. Ananda

AmbientFlow: Invertible generative models from incomplete, noisy measurements

DOI:
Year published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Varun A. Kelkar;Rucha Deshpande;Arindam Banerjee;M. Anastasio
Corresponding author:
M. Anastasio

Mixture Modeling

DOI:
Year published:
2019
Journal:
Encyclopedia of Machine Learning
Impact factor:
0
Authors:
Johannes Fürnkranz;Philip K. Chan;Susan Craw;Claude Sammut;W. Uther;A. Ratnaparkhi;Xin Jin;Jiawei Han;Ying Yang;K. Morik;M. Dorigo;M. Birattari;T. Stützle;P. Brazdil;R. Vilalta;C. Giraud;Carlos Soares;J. Rissanen;R. Baxter;I. Bruha;Geoffrey I. Webb;Luís Torgo;Arindam Banerjee;Hanhuai Shan;Soumya Ray;Prasad Tadepalli;Y. Shoham;Rob Powers;Stephen Scott;H. Blockeel;Luc De Raedt
Corresponding author:
Luc De Raedt