Reducing Training Data in Deep Learning
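Basic information
- Grant number: RGPIN-2019-06222
- Principal investigator:
- Amount: $40.1K
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Start and end dates: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract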
Deep neural networks have been highly successful in supervised learning for a variety of AI-related applications, including automatic speech recognition, image classification and face recognition in computer vision, and natural language translation. However, their success relies on huge volumes of labeled training data, which are time-consuming and expensive to obtain. While data is abundant in today's digital world of the Web, mobile devices, and the Internet of Things, unsupervised learning (which does not need labels) has yet to live up to its promise.
In this research program we plan to study machine learning problems that call for human-like capabilities. These problems often have large amounts of unlabeled data and very little labeled data, and abstract concepts and knowledge are learned and accumulated across many tasks. Human learning is also active and interactive. Progress on these problems would lead to new theory and algorithms that not only significantly reduce the amount of labeled data needed in supervised learning, but also advance our understanding of machine learning in solving difficult real-world problems.
We propose a novel deep learning framework in which autoencoders and classifiers are coupled and optimized simultaneously to make maximal usage of both unlabeled and labeled data. The autoencoder networks are trained from a large set of unlabeled data, but only need to recall enough details for the purpose of classifying a small number of labeled examples. The proposed research nicely unifies and integrates supervised and unsupervised learning, feature learning, learning representations, lifelong learning, and few-shot learning.
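As a rough illustration of what "coupled and optimized simultaneously" could mean, the sketch below trains a tiny autoencoder and a classifier that share one encoder, minimizing reconstruction error on all inputs plus a weighted cross-entropy on the few labeled ones. It is not the method of the proposal: the one-layer tanh encoder, the linear decoder, the softmax classifier, the weight `lam`, the toy data, and every function name are assumptions made for this example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_loss_and_grads(params, x_all, x_lab, y_lab, lam=1.0):
    """Return (loss, grads) for reconstruction on x_all + lam * cross-entropy on (x_lab, y_lab)."""
    W_enc, W_dec, W_cls = params
    n, m = x_all.shape[0], x_lab.shape[0]

    # Autoencoder branch: uses every input, labels are not needed.
    h_all = np.tanh(x_all @ W_enc)
    err = h_all @ W_dec - x_all
    recon = (err ** 2).sum() / n

    # Classifier branch: reuses the same encoder on the labeled subset.
    h_lab = np.tanh(x_lab @ W_enc)
    p = softmax(h_lab @ W_cls)
    ce = -np.log(p[np.arange(m), y_lab]).mean()
    loss = recon + lam * ce

    # Backpropagation through the autoencoder branch.
    d_xhat = 2.0 * err / n
    g_dec = h_all.T @ d_xhat
    g_enc = x_all.T @ ((d_xhat @ W_dec.T) * (1.0 - h_all ** 2))

    # Backpropagation through the classifier branch.
    d_logits = p.copy()
    d_logits[np.arange(m), y_lab] -= 1.0
    d_logits *= lam / m
    g_cls = h_lab.T @ d_logits
    # The shared encoder receives gradient from both branches.
    g_enc = g_enc + x_lab.T @ ((d_logits @ W_cls.T) * (1.0 - h_lab ** 2))
    return loss, [g_enc, g_dec, g_cls]

def train(x_all, x_lab, y_lab, n_classes, latent=4, lam=1.0, lr=0.05, steps=300, seed=0):
    """Plain gradient descent on the joint objective; returns (params, loss history)."""
    rng = np.random.default_rng(seed)
    d = x_all.shape[1]
    params = [rng.normal(0, 0.1, (d, latent)),
              rng.normal(0, 0.1, (latent, d)),
              rng.normal(0, 0.1, (latent, n_classes))]
    history = []
    for _ in range(steps):
        loss, grads = joint_loss_and_grads(params, x_all, x_lab, y_lab, lam)
        history.append(loss)
        params = [w - lr * g for w, g in zip(params, grads)]
    return params, history

def make_toy_data(seed=0, n_unlabeled=200, n_labeled=10, dim=6):
    """Hypothetical two-cluster data: many unlabeled points, a handful of labeled ones."""
    rng = np.random.default_rng(seed)
    y_unl = rng.integers(0, 2, n_unlabeled)
    y_lab = np.arange(n_labeled) % 2
    def sample(y):
        return (2 * y - 1)[:, None] * 0.5 + 0.3 * rng.normal(size=(y.size, dim))
    x_lab = sample(y_lab)
    x_all = np.vstack([sample(y_unl), x_lab])  # reconstruction sees all inputs
    return x_all, x_lab, y_lab

if __name__ == "__main__":
    x_all, x_lab, y_lab = make_toy_data()
    params, history = train(x_all, x_lab, y_lab, n_classes=2)
    print("first loss:", history[0], "last loss:", history[-1])
```

The weight `lam` controls how strongly the labeled examples shape the shared representation; with `lam=0` the encoder is trained purely as an autoencoder and the labels have no influence on it.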
The research proposal consists of two long-term objectives, and five short-term objectives, each with clear and feasible methodologies. These will provide ample opportunities for training PhD and MSc students. In total, the proposal will train 4 PhD students and 6 MSc students, as well as one Postdoc, in the next 5 years of the proposed research.
As deep learning in AI is an extremely popular area that attracts both academia and industry, I expect that the highly qualified personnel (HQP) trained in this research will be in high demand and will make an impact in their future research careers in academia and industry.
We expect to make significant contributions not only to the academic research of machine learning and deep learning, but also to various real-world applications. We expect that less than 10% of the training data (or the labeling cost) would be needed to train the deep neural networks without much affecting the predictive accuracy or the computational cost. The savings would be very significant in any real-world application of deep learning.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ling, Charles
GlucoGuide: An Intelligent Type-2 Diabetes Solution Using Data Mining and Mobile Computing
- DOI: 10.1109/icdmw.2014.177
- Publication date: 2014-01-01
- Journal:
- Impact factor: 0
- Authors: Luo, Yan; Ling, Charles; Petrella, Robert
- Corresponding author: Petrella, Robert
Other grants by Ling, Charles
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2022
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2021
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPAS-2019-00084
- Fiscal year: 2020
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Accelerator Supplements
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2019
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPAS-2019-00084
- Fiscal year: 2019
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Accelerator Supplements
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2018
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2017
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Improving Highstreet's Assets Model with Advanced Machine Learning
- Grant number: 501559-2016
- Fiscal year: 2016
- Funding amount: $40.1K
- Project category: Engage Plus Grants Program
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2015
- Funding amount: $40.1K
- Project category: Discovery Grants Program - Individual
Improving customer service with predictive models for IBM London Software Lab
- Grant number: 491890-2015
- Fiscal year: 2015
- Funding amount: $40.1K
- Project category: Engage Grants Program
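Similar grants from the National Natural Science Foundation of China (NSFC)
Research on methods for identifying tumor drivers based on multi-omics data and pre-trained models
- Grant number: 32370712
- Approval year: 2023
- Funding amount: 500,000 yuan
- Project category: General Program
Research on secure training and inference of federated models for computation over heterogeneous encrypted data
- Grant number: 62372350
- Approval year: 2023
- Funding amount: 500,000 yuan
- Project category: General Program
Hybrid optical-electrical data center networks for distributed AI training
- Grant number:
- Approval year: 2022
- Funding amount: 550,000 yuan
- Project category: General Program
Research on functional training and diagnostic methods for upper-limb rehabilitation robots based on artificial intelligence and big data
- Grant number:
- Approval year: 2022
- Funding amount: 530,000 yuan
- Project category: General Program
Research on pre-trained representation learning methods for spatio-temporal trajectory data for traffic prediction
- Grant number:
- Approval year: 2022
- Funding amount: 550,000 yuan
- Project category: General Program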
Similar overseas grants
Reducing stigmatizing attitudes and behaviors of nursing students in simulated clinical visits of patients living with HIV in Iran
- Grant number: 10542953
- Fiscal year: 2023
- Funding amount: $40.1K
- Project category:
Dietary prevention for colorectal cancer: targeting the bile acid/gut microbiome axis
- Grant number: 10723195
- Fiscal year: 2023
- Funding amount: $40.1K
- Project category:
Implementation of the Federal 988 Suicide and Mental Health Crisis Hotline Policy: Determinants and Effects of State Policy Implementation Financing Strategies
- Grant number: 10563424
- Fiscal year: 2023
- Funding amount: $40.1K
- Project category:
Systems Science Approaches for Reducing Youth Obesity Disparities
- Grant number: 10664145
- Fiscal year: 2023
- Funding amount: $40.1K
- Project category:
From Court to the Community: Improving Access to Evidence-Based Treatment for Underserved Justice-Involved Youth At-Risk for Suicide
- Grant number: 10804858
- Fiscal year: 2023
- Funding amount: $40.1K
- Project category: