CAREER: Variational Inference for Resource-Efficient Learning
Basic Information
- Award Number: 2047418
- Principal Investigator:
- Amount: $446.5K
- Institution:
- Institution Country: United States
- Grant Type: Continuing Grant
- Fiscal Year: 2021
- Funding Country: United States
- Project Period: 2021-09-01 to 2026-08-31
- Status: Active
- Source:
- Keywords:
Project Abstract
The power of Deep Learning (DL) comes with enormous energy and storage costs due to massive data needs and parameter-rich models. For example, recent models for natural language generation contain more than a hundred billion parameters and require huge amounts of training data. Training such a model can entail nearly five times the lifetime carbon dioxide emissions of the average American car. This project develops a holistic approach to resource-efficient DL based on a common set of methodologies: DL models and algorithms are viewed through the lens of information theory, making it possible to formally quantify and minimize the required resources. Outcomes of this project include new methods for the compression of both models (neural networks) and data (images and video), as well as new training algorithms for DL that reduce data requirements and improve runtime efficiency. These research activities will also inform summer teaching activities for under-represented students and lead to new open-source software for resource-efficient machine learning, as well as workshops and symposia on neural compression and statistical machine learning. In more detail, the project approaches resource-efficient machine learning from the perspective of variational inference (VI) and contains three thrusts that focus on different inefficiencies: (A) bandwidth inefficiency: a model's inefficient representation of data or parameters, (B) data inefficiency: a model's extensive need for training data, and (C) runtime inefficiency: a learning or inference algorithm's inability to produce desired answers within a given computational time budget. Thrust A draws on the connection between VI and rate-distortion theory to derive new neural compression algorithms with improved compression performance. Thrust B designs informative priors for effective learning with limited data in non-stationary environments.
Finally, thrust C develops highly scalable training algorithms for Bayesian neural networks that hybridize Markov Chain Monte Carlo and VI, trading off precision for convergence speed. The project contains applications from the domains of image and video compression as well as climate science. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
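The connection between VI and rate-distortion theory that thrust A builds on can be made concrete with a minimal numerical sketch. This is illustrative only, not project code: the scalar Gaussian model and all function names here are assumptions chosen for brevity. The negative ELBO splits into a "distortion" term (expected reconstruction error) plus a "rate" term (KL divergence from the approximate posterior to the prior), the same trade-off that governs lossy compression:

```python
import numpy as np

rng = np.random.default_rng(0)

def neg_elbo(x, mu_q, log_sigma_q, beta=1.0, n_samples=1000):
    """distortion + beta * rate for a scalar model with prior z ~ N(0, 1),
    decoder x | z ~ N(z, 1), and variational posterior N(mu_q, sigma_q^2)."""
    sigma_q = np.exp(log_sigma_q)
    z = mu_q + sigma_q * rng.standard_normal(n_samples)  # reparameterized samples
    distortion = np.mean(0.5 * (x - z) ** 2)             # Monte Carlo reconstruction error
    # KL( N(mu_q, sigma_q^2) || N(0, 1) ) in closed form
    rate = 0.5 * (sigma_q ** 2 + mu_q ** 2 - 1.0) - log_sigma_q
    return distortion + beta * rate, distortion, rate

x = 2.0
# A posterior concentrated near the data pays rate but little distortion...
loss_a, d_a, r_a = neg_elbo(x, mu_q=1.8, log_sigma_q=-1.0)
# ...while staying at the prior costs zero rate but high distortion.
loss_b, d_b, r_b = neg_elbo(x, mu_q=0.0, log_sigma_q=0.0)
print(f"near-data posterior: loss={loss_a:.2f} (distortion={d_a:.2f}, rate={r_a:.2f})")
print(f"prior as posterior:  loss={loss_b:.2f} (distortion={d_b:.2f}, rate={r_b:.2f})")
```

Raising `beta` above 1 penalizes the rate term more heavily, pulling the optimum toward the prior; this is the knob by which variational objectives trade bits for reconstruction quality in neural compression.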
Project Outcomes
Journal articles (22)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Supervised Compression for Resource-Constrained Edge Computing Systems
- DOI: 10.1109/wacv51458.2022.00100
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Matsubara, Y.; Yang, R.; Mandt, S.; Levorato, M.
- Corresponding author: Levorato, M.
Neural Transformation Learning for Deep Anomaly Detection Beyond Images
- DOI:
- Publication date: 2021-03
- Journal:
- Impact factor: 0
- Authors: Chen Qiu; Timo Pfrommer; M. Kloft; S. Mandt; Maja R. Rudolph
- Corresponding author: Chen Qiu; Timo Pfrommer; M. Kloft; S. Mandt; Maja R. Rudolph
Towards Empirical Sandwich Bounds on the Rate-Distortion Function
- DOI:
- Publication date: 2021-11
- Journal:
- Impact factor: 0
- Authors: Yibo Yang; S. Mandt
- Corresponding author: Yibo Yang; S. Mandt
Probabilistic Querying of Continuous-Time Event Sequences
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Boyd, Alex; Chang, Yuxin; Mandt, Stephan; Smyth, Padhraic
- Corresponding author: Smyth, Padhraic
Latent Outlier Exposure for Anomaly Detection with Contaminated Data
- DOI:
- Publication date: 2022-02
- Journal:
- Impact factor: 0
- Authors: Chen Qiu; Aodong Li; M. Kloft; Maja R. Rudolph; S. Mandt
- Corresponding author: Chen Qiu; Aodong Li; M. Kloft; Maja R. Rudolph; S. Mandt
Other Publications by Stephan Mandt
Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Justus C. Will; A. M. Jenney; Kara D. Lamb; Michael S. Pritchard; Colleen Kaul; Po; Kyle Pressel; Jacob Shpund; M. Lier; Stephan Mandt
- Corresponding author: Stephan Mandt
Preserving Identity with Variational Score for General-purpose 3D Editing
- DOI:
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Duong H. Le; Tuan Pham; Aniruddha Kembhavi; Stephan Mandt; Wei; Jiasen Lu
- Corresponding author: Jiasen Lu
Understanding precipitation changes through unsupervised machine learning
- DOI: 10.1017/eds.2024.1
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: G. Mooers; T. Beucler; Mike Pritchard; Stephan Mandt
- Corresponding author: Stephan Mandt
Early-Exit Neural Networks with Nested Prediction Sets
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Metod Jazbec; Patrick Forré; Stephan Mandt; Dan Zhang; Eric T. Nalisnick
- Corresponding author: Eric T. Nalisnick
Other Grants by Stephan Mandt
RI: Small: Deep Variational Data Compression
- Award Number: 2007719
- Fiscal Year: 2020
- Amount: $446.5K
- Grant Type: Standard Grant
Similar NSFC Grants
Variational-Inference-Based Full-Waveform Inversion Velocity Modeling and Uncertainty Evaluation in a Bayesian Framework
- Award Number: 42274143
- Year Approved: 2022
- Amount: CNY 560,000
- Grant Type: General Program
Research on Key Problems of Wasserstein Variational Inference
- Award Number: 62006094
- Year Approved: 2020
- Amount: CNY 240,000
- Grant Type: Young Scientists Fund
Research on Key Problems of High-Precision, Stable Black-Box Variational Inference
- Award Number: 61876071
- Year Approved: 2018
- Amount: CNY 630,000
- Grant Type: General Program
Research on Novel Bayesian Sparse Reconstruction Theory and Methods for Wideband Signals
- Award Number: 61871421
- Year Approved: 2018
- Amount: CNY 610,000
- Grant Type: General Program
Similar Overseas Grants
CAREER: Statistical Inference in High Dimensions using Variational Approximations
- Award Number: 2239234
- Fiscal Year: 2023
- Amount: $446.5K
- Grant Type: Continuing Grant
CAREER: Automatic Variational Inference
- Award Number: 2045900
- Fiscal Year: 2021
- Amount: $446.5K
- Grant Type: Continuing Grant
CAREER: Stein Variational Gradient Descent: A New Foundation for Inference
- Award Number: 1846421
- Fiscal Year: 2019
- Amount: $446.5K
- Grant Type: Continuing Grant
CAREER: Computational and Theoretical Investigations of Variational Inference
- Award Number: 1847590
- Fiscal Year: 2019
- Amount: $446.5K
- Grant Type: Continuing Grant
CAREER: Entropy Geometry in Variational Inference Signal Processing
- Award Number: 1053702
- Fiscal Year: 2011
- Amount: $446.5K
- Grant Type: Standard Grant