CAREER: Coding Theory for Robust Large-Scale Machine Learning

Basic Information

Award number:
1844951
Principal investigator:
Dimitrios Papailiopoulos
Amount:
$508,300 (approx.)
Host institution:
University of Wisconsin-Madison
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Project period:
2019-05-01 to 2024-04-30
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1844951&HistoricalAwards=false
Keywords:
CAREER Coding Theory Robust Large

Project Abstract

Coding theory has played a critical role in modern information technology by supporting robustness of information against a backdrop of multifaceted uncertainty. Following recent successes in machine learning, robustness has emerged as a desired principle, but now in the context of large-scale computation. Challenges related to robustness are prevalent when deploying machine learning solutions in real applications and non-curated settings, which are often non-ideal environments. This project aims to address these challenges by developing novel solutions based on coding theory for computation. These solutions offer provable robustness guarantees, can outperform more traditional solutions in practice, and extend to machine learning systems the gains that have transformed communication and storage systems. Existing and new collaborations of the investigator will facilitate industry cooperation and increase the transition to practice for the frameworks and algorithms generated from this project. The research will be strongly coupled with educational developments guided by recent advances in education science, alongside an outreach program within the Wisconsin Institute for Discovery.

This project aims to develop novel coding-theoretic solutions and fundamental trade-offs for robust large-scale machine learning. The research program is centered around three thrusts. The first thrust focuses on robustness during distributed optimization in the presence of delays and straggler nodes, where the speed of convergence is affected by nodes in the system that are significantly slower than average. The second thrust focuses on robustness during distributed optimization in the presence of Byzantine nodes and worst-case failures. Recent studies proposed robust aggregation rules to filter out the effect of worst-case or adversarial failures. This project develops coding-theoretic solutions that can be orders of magnitude faster, and give rise to unexplored trade-offs between computation and Byzantine tolerance. The third thrust focuses on adversarial perturbations during prediction that can force state-of-the-art models to consistently mis-classify events/data. The coding-theoretic approach of this project pursues provable defense mechanisms against adversarial attacks through ensembles of models with inherent redundancy and through data augmentation. The proposed theoretical and algorithmic solutions are afforded by an interdisciplinary mix of tools from information and coding theory, distributed optimization, and machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal papers (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient

DOI:
Published:
2020-06
Journal:
arXiv
Impact factor:
0
Authors:
Ankit Pensia;Shashank Rajput;Alliot Nagle;Harit Vishwakarma;Dimitris Papailiopoulos
Corresponding authors:
Ankit Pensia;Shashank Rajput;Alliot Nagle;Harit Vishwakarma;Dimitris Papailiopoulos

Bad Global Minima Exist and SGD Can Reach Them

DOI:
Published:
2019-05
Journal:
arXiv
Impact factor:
0
Authors:
Shengchao Liu;Dimitris Papailiopoulos;D. Achlioptas
Corresponding authors:
Shengchao Liu;Dimitris Papailiopoulos;D. Achlioptas

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

DOI:
Published:
2020-07
Journal:
arXiv
Impact factor:
0
Authors:
Hongyi Wang;Kartik K. Sreenivasan;Shashank Rajput;Harit Vishwakarma;Saurabh Agarwal;Jy-yong Sohn;
Corresponding authors:
Hongyi Wang;Kartik K. Sreenivasan;Shashank Rajput;Harit Vishwakarma;Saurabh Agarwal;Jy-yong Sohn;

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation

DOI:
Published:
2019-07
Journal:
arXiv
Impact factor:
0
Authors:
Shashank Rajput;Hongyi Wang;Zachary B. Charles;Dimitris Papailiopoulos
Corresponding authors:
Shashank Rajput;Hongyi Wang;Zachary B. Charles;Dimitris Papailiopoulos

Finding Nearly Everything within Random Binary Networks

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
Kartik K. Sreenivasan;Shashank Rajput;Jy-yong Sohn;Dimitris Papailiopoulos
Corresponding authors:
Kartik K. Sreenivasan;Shashank Rajput;Jy-yong Sohn;Dimitris Papailiopoulos