Collaborative Research: SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability


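Basic Information

Award number:
2134223
Principal investigator:
Yi Zhou
Amount:
About $576,100
Host institution:
University of Utah
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Start and end dates:
2021-09-01 to 2024-08-31
Project status:
Completed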

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2134223&HistoricalAwards=false
Keywords:
Collaborative Research SCALE MoDL Advancing

Project Abstract

The past decade has witnessed the great success of deep learning in broad societal and commercial applications. However, conventional deep learning relies on fitting data with neural networks, which is known to produce models that lack resilience. For instance, models used in autonomous driving are vulnerable to malicious attacks, e.g., putting an art sticker on a stop sign can cause the model to classify it as a speed limit sign; models used in facial recognition are known to be biased toward people of a certain race or gender; models in healthcare can be hacked to reconstruct the identities of patients that are used in training those models. The next-generation deep learning paradigm needs to deliver resilient models that promote robustness to malicious attacks, fairness among users, and privacy preservation. This project aims to develop a comprehensive learning theory to enhance the model resilience of deep learning. The project will produce fast algorithms and new diagnostic tools for training, enhancing, visualizing, and interpreting model resilience, all of which can have broad research and societal significance. The research activities will also generate positive educational impacts on undergraduate and graduate students. The materials developed by this project will be integrated into courses on machine learning, statistics, and data visualization and will benefit interdisciplinary students majoring in electrical and computer engineering, statistics, mathematics, and computer science. The project will actively involve underrepresented students and integrate research with education for undergraduate and graduate students in STEM. It will also produce introductory materials for K-12 students to be used in engineering summer camps.

In this project, the investigators will collaboratively develop a comprehensive minimax learning theory that advances the fundamental understanding of minimax deep learning from the perspectives of optimization, resilience, and interpretability. These complementary theoretical developments, in turn, will guide the design of novel minimax learning algorithms with substantially improved computational efficiency, statistical guarantees, and interpretability. The research includes three major thrusts. First, the investigators will develop a principled non-convex minimax optimization theory that supports scalable, fast, and convergent gradient-descent-ascent algorithms for training complex minimax deep learning models. The theory will focus on analyzing the convergence rate and sample complexity of the developed algorithms. Second, the investigators will formulate a measure of vulnerability of deep learning models and study how minimaxity can enhance their resilience against data, model, and task deviations. This theory will focus on the statistical limits of deep learning. Lastly, the investigators will establish the mathematical foundations for a set of novel visual analytics techniques that increase the model interpretability of minimax learning. In particular, the theory will provide guidance on visualizing and interpreting model resilience.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.


Project Outcomes

Journal papers (11)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

DOI:
Publication year:
2022
Journal:
Impact factor:
0
Authors:
Ziyi Chen;Shaocong Ma;Yi Zhou
Corresponding authors:
Ziyi Chen;Shaocong Ma;Yi Zhou

Experimental Observations of the Topology of Convolutional Neural Network Activations


DOI:
Publication year:
2023
Journal:
Proceedings of the AAAI Conference on Artificial Intelligence
Impact factor:
0
Authors:
Purvine, Emilie;Brown, Davis;Jefferson, Brett;Joslyn, Cliff;Praggastis, Brenda;Rathore, Archit;Shapiro, Madelyn;Wang, Bei;Zhou, Youjia
Corresponding author:
Zhou, Youjia

Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry