RI: Small: Taming Massive Pre-trained Models under Label Scarcity via an Optimization Lens

RI：小型：通过优化镜头在标签稀缺的情况下驯服大量预训练模型

基本信息

批准号：
2226152
负责人：
Tuo Zhao
金额：
$ 53.99万
依托单位：
Georgia Tech Research Corporation
依托单位国家：
美国
项目类别：
Standard Grant
财政年份：
2022
资助国家：
美国
起止时间：
2022-09-01 至 2025-08-31
项目状态：
未结题

来源：
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2226152&HistoricalAwards=false
关键词：
RI Small Taming Massive Pre

RI Small Taming Massive Pre

项目摘要

Deep transfer learning (DTL) has made significant progress in many real-world applications such as image and speech recognition. Training deep learning models in these applications often requires large amounts of labeled data, (e.g., images with annotated objects). Labelling these data by human labor, however, can be very expensive and time-consuming, which significantly limits the broader adoption of deep learning. Such an issue is more pronounced in certain domains (e.g. biomedical domain), where labeled data are scarce. To address the concern of label scarcity, researchers have resorted to deep transfer learning, where a massive deep learning model is first pre-trained only using unlabeled data and then adapted to the downstream task of our interests with only limited labelled data. Due to the gap between the enormous sizes of the pre-trained models and the limited labeled data, however, such a deep transfer learning approach is prone to overfitting and fail to generalize well on the unseen data, especially when there are noisy labels. Moreover, the enormous model sizes make practical deployment very difficult when there are constraints on storage/memory usage, inference latency and energy consumption, especially on edge devices. This project aims to develop an efficient computational framework to improve the generalization of deep transfer learning and reduce the model sizes by leveraging cutting-edge optimization and machine learning techniques.Specifically, this project aims to develop: (I) new adversarial regularization methods, which can regularize the complexity of deep learning models and prevent overfitting of the training data, (II) new self-training methods robust to noisy labels in the training data, and (III) new optimization methods, which can improve the training of compact deep learning models in deep transfer learning. Moreover, we will develop new generalization and approximation theories for understanding the benefits of our proposed methods in transfer learning. The proposed research will also deliver open-source software in the form of easy-to-use libraries, which facilitate researchers and practitioners to apply DTL in related fields.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

深度转移学习（DTL）在许多现实世界应用（例如图像和语音识别）中取得了重大进展。这些应用程序中的培训深度学习模型通常需要大量的标记数据（例如带有带注释的对象的图像）。但是，通过人工劳动将这些数据标记可能非常昂贵且耗时，这显着限制了更广泛的深度学习。这样的问题在某些领域（例如生物医学领域）中更为明显，其中标记的数据很少。为了解决标签稀缺性的关注，研究人员已诉诸深度转移学习，其中首先仅使用未标记的数据对大规模的深度学习模型进行预培训，然后适应只有有限的标记数据的利益的下游任务。但是，由于预先训练的模型的巨大尺寸与有限的标记数据之间的差距，因此，这种深厚的传输学习方法容易过度拟合，并且无法很好地概括看不见的数据，尤其是在有嘈杂的标签时。此外，当在存储/内存使用情况，推理潜伏期和能耗（尤其是在边缘设备上）有限制时，巨大的模型大小使实际部署非常困难。该项目旨在开发一个有效的计算框架，以通过利用尖端优化和机器学习技术来改善深度转移学习的概括，并减少模型大小。特别是，该项目旨在开发：（i新的对抗正则化方法，可以将深度学习模型的复杂性和训练数据的培训（II）培训（ii）训练（II）训练（II）（ii）训练（II）（ii）的新型（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）（II）优化方法，可以改善深度转移学习中紧凑型深度学习模型的训练。此外，我们将开发新的概括和近似理论，以理解我们提出的方法在转移学习中的好处。拟议的研究还将以易于使用的库的形式提供开源软件，这些软件促进了研究人员和从业人员在相关领域中应用DTL。该奖项反映了NSF的法定任务，并被认为是通过基金会的知识分子优点和更广泛的审查标准通过评估来通过评估来获得支持的。

项目成果

期刊论文数量（4）

专著数量（0）

科研奖励数量（0）

会议论文数量（0）

专利数量（0）

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

DOI：
10.48550/arxiv.2306.11222
发表时间：
2023-06
期刊：
ArXiv
影响因子：
0
作者：
Yixiao Li;Yifan Yu;Qingru Zhang;Chen Liang;Pengcheng He;Weizhu Chen;Tuo Zhao
通讯作者：
Yixiao Li;Yifan Yu;Qingru Zhang;Chen Liang;Pengcheng He;Weizhu Chen;Tuo Zhao

Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

DOI：
10.48550/arxiv.2310.10810
发表时间：
2023-10
期刊：
ArXiv
影响因子：
0
作者：
Alexander W. Bukharin;Yan Li;Yue Yu;Qingru Zhang;Zhehui Chen;Simiao Zuo;Chao Zhang;Songan Zhang;Tuo Zhao
通讯作者：
Alexander W. Bukharin;Yan Li;Yue Yu;Qingru Zhang;Zhehui Chen;Simiao Zuo;Chao Zhang;Songan Zhang;Tuo Zhao

Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

DOI：
10.48550/arxiv.2303.10512
发表时间：
2023
期刊：
ArXiv
影响因子：
0
作者：
Qingru Zhang;Minshuo Chen;Alexander Bukharin;Pengcheng He;Yu Cheng;Weizhu Chen;Tuo Zhao
通讯作者：
Qingru Zhang;Minshuo Chen;Alexander Bukharin;Pengcheng He;Yu Cheng;Weizhu Chen;Tuo Zhao

Machine Learning Force Fields with Data Cost Aware Training

DOI：
10.48550/arxiv.2306.03109
发表时间：
2023-06
期刊：
影响因子：
0
作者：
Alexander W. Bukharin;Tianyi Liu;Sheng Wang;Simiao Zuo;Weihao Gao;Wen Yan;Tuo Zhao
通讯作者：
Alexander W. Bukharin;Tianyi Liu;Sheng Wang;Simiao Zuo;Weihao Gao;Wen Yan;Tuo Zhao

共 4 条前往

页

Tuo Zhao其他文献

Learning explainable task-relevant state representation for model-free deep reinforcement learning

DOI：
10.1016/j.neunet.2024.106741
10.1016/j.neunet.2024.106741
发表时间：
2024-12-01
2024-12-01
期刊：
Research article
Research article
影响因子：
作者：
Tingting Zhao;Guixi Li;Tuo Zhao;Yarui Chen;Ning Xie;Gang Niu;Masashi Sugiyama
Tingting Zhao;Guixi Li;Tuo Zhao;Yarui Chen;Ning Xie;Gang Niu;Masashi Sugiyama
通讯作者：
Masashi Sugiyama
Masashi Sugiyama

Time-frequency kernel-based CNN for speech recognition

基于时频核的 CNN 语音识别

DOI：
10.21437/interspeech.2015-417
10.21437/interspeech.2015-417
发表时间：
2015
2015
期刊：
SEG Technical Program Expanded Abstracts 2019
SEG Technical Program Expanded Abstracts 2019
影响因子：
0
作者：
Tuo Zhao;Yunxin Zhao;Xin Chen
Tuo Zhao;Yunxin Zhao;Xin Chen
通讯作者：
Xin Chen
Xin Chen

Heat transfer and pressure drop in a double pipe exchanger equipped with novel perforated magnetic turbulator (PMT): An experimental study

DOI：
10.1016/j.applthermaleng.2023.121278
10.1016/j.applthermaleng.2023.121278
发表时间：
2023-11-25
2023-11-25
期刊：
Research article
Research article
影响因子：
作者：
Kai Sun;Dong Liu;S.P. Ghoushchi;Tuo Zhao;Xijie Chen;Bashir Salah;Wenqi Zhao
Kai Sun;Dong Liu;S.P. Ghoushchi;Tuo Zhao;Xijie Chen;Bashir Salah;Wenqi Zhao
通讯作者：
Wenqi Zhao
Wenqi Zhao

Ensemble Acoustic Modeling for CD-DNN-HMM Using Random Forests of Phonetic Decision Trees

使用语音决策树随机森林的 CD-DNN-HMM 集成声学建模

DOI：
10.1007/s11265-015-1001-9
10.1007/s11265-015-1001-9
发表时间：
2015
2015
期刊：
Journal of Signal Processing Systems
Journal of Signal Processing Systems
影响因子：
0
作者：
Tuo Zhao;Yunxin Zhao;Xin Chen
Tuo Zhao;Yunxin Zhao;Xin Chen
通讯作者：
Xin Chen
Xin Chen

TDOA Estimation of Speech Source in Noisy Reverberant Environments

噪声混响环境中语音源的 TDOA 估计

DOI：
10.1109/slt54892.2023.10023256
10.1109/slt54892.2023.10023256
发表时间：
2023
2023
期刊：
2022 IEEE Spoken Language Technology Workshop (SLT)
2022 IEEE Spoken Language Technology Workshop (SLT)
影响因子：
0
作者：
Suliang Bu;Tuo Zhao;Yunxin Zhao
Suliang Bu;Tuo Zhao;Yunxin Zhao
通讯作者：
Yunxin Zhao
Yunxin Zhao