Distributed Optimization for Machine Learning on Decentralized Data and Features


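Basic information

  • Grant number:
    RGPIN-2019-04998
  • Principal investigator:
  • Amount:
    $29,900
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed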

Project abstract

The need to scale up machine learning amid rapid growth in both data volume and model complexity has sparked broad interest in distributed machine learning systems and parallel optimization algorithms. Most existing studies focus on partitioning computation across a tightly coupled cluster of machines, either in a data-parallel fashion to handle large numbers of training samples or in a model-parallel fashion to handle large models such as deep neural networks. In contrast, the goal of this research program is to glean insights and build models when the dataset used for machine learning (features, samples, labels, or a combination of them) is inherently decentralized and owned by multiple participants or domains. Our long-term vision is to design reliable distributed algorithms and systems that can build models from decentralized data without requiring participants to share original data with each other or with a central site. By effectively leveraging data from other domains, each participant is expected to improve its predictive power over a model built only on its local data, while the joint model built in a decentralized way is expected to approximate the global model that would be obtained if all data were collected centrally. At the same time, the sharing of model parameters among parties should be minimized to preserve privacy and reduce communication overhead. Toward these objectives, we will introduce generic composite model structures that can jointly extract insights from data in different decentralization scenarios: decentralization by features, by samples, by labels, or by a combination of them. We will design theoretically grounded distributed optimization algorithms to solve these problems, develop effective communication compression techniques to reduce overhead, and study implementation issues for specific applications.
Specifically, our algorithms will be inspired by recent advances in the convergence analysis of ADMM, stochastic gradient descent (SGD), and proximal SGD in asynchronous and blockwise settings. Our communication compression techniques will exploit the opportunity to suppress model parameter transfers in flat regions of the optimization objective function, whereas the existing literature mainly considers significance filters for gradients. Finally, we will apply the proposed model architectures and algorithms to real-world problems in cross-domain recommender systems, multitask natural language understanding, and collaborative mobile edge computing. We will design specific model composition and decomposition structures, as well as distributed algorithms, based on the data decentralization pattern inherent in each problem.
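
The idea of suppressing parameter transfers in flat regions can be illustrated with a toy sketch. This is not the project's algorithm; it assumes a single scalar parameter, plain gradient descent, and a hypothetical rule that transmits the parameter only when it has moved more than `threshold` since the last transmission. The function name and all parameters are made up for illustration.

```python
def descend_with_suppressed_transfers(grad, w0, lr=0.1, steps=50, threshold=0.05):
    """Gradient descent on one scalar parameter.

    The parameter is "transmitted" to a hypothetical peer only when it has
    drifted more than `threshold` from the last transmitted value. Where the
    objective is flat, gradients are small, the parameter barely moves, and
    transfers are skipped.
    """
    w = w0
    last_sent = w0
    transfers = 1  # the initial value is assumed to be shared once
    for _ in range(steps):
        w -= lr * grad(w)
        if abs(w - last_sent) > threshold:
            last_sent = w
            transfers += 1
    return w, last_sent, transfers


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
steps = 50
threshold = 0.05
w, peer_view, transfers = descend_with_suppressed_transfers(
    lambda w: 2.0 * (w - 3.0), w0=0.0, steps=steps, threshold=threshold
)
print(f"final w = {w:.5f}, peer sees {peer_view:.5f}, transfers = {transfers}")
```

In this sketch the peer's copy never lags the local parameter by more than `threshold`, and transfers become rare as the iterate approaches the minimum, where the objective flattens out.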

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)


Other publications by Niu, Di

FDML: A Collaborative Machine Learning Framework for Distributed Features
Random Network Coding in Peer-to-Peer Networks: From Theory to Practice
  • DOI:
    10.1109/jproc.2010.2091930
  • Published:
    2011-03-01
  • Journal:
  • Impact factor:
    20.6
  • Authors:
    Li, Baochun;Niu, Di
  • Corresponding author:
    Niu, Di
BLCA prognostic model creation and validation based on immune gene-metabolic gene combination.
  • DOI:
    10.1007/s12672-023-00853-6
  • Published:
    2023-12-16
  • Journal:
  • Impact factor:
    2.2
  • Authors:
    Yue, Shao-Yu;Niu, Di;Liu, Xian-Hong;Li, Wei-Yi;Ding, Ke;Fang, Hong-Ye;Wu, Xin-Dong;Li, Chun;Guan, Yu;Du, He-Xi
  • Corresponding author:
    Du, He-Xi
Experimental and numerical investigation of a microchannel heat sink (MCHS) with micro-scale ribs and grooves for chip cooling
  • DOI:
    10.1016/j.applthermaleng.2015.04.009
  • Published:
    2015-06-25
  • Journal:
  • Impact factor:
    6.4
  • Authors:
    Wang, Guilian;Niu, Di;Ding, Guifu
  • Corresponding author:
    Ding, Guifu
Metabonomic analysis of cerebrospinal fluid in epilepsy.
  • DOI:
    10.21037/atm-22-1219
  • Published:
    2022-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Niu, Di;Sun, Pin;Zhang, Fenghua;Song, Fan
  • Corresponding author:
    Song, Fan


Other grants held by Niu, Di

Distributed Optimization for Machine Learning on Decentralized Data and Features
  • Grant number:
    RGPIN-2019-04998
  • Fiscal year:
    2021
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual
Advanced Malware Detection Techniques based on Artificial Intelligence and Distributed Machine Learning
  • Grant number:
    531722-2018
  • Fiscal year:
    2021
  • Amount:
    $29,900
  • Program:
    Collaborative Research and Development Grants
Advanced Malware Detection Techniques based on Artificial Intelligence and Distributed Machine Learning
  • Grant number:
    531722-2018
  • Fiscal year:
    2020
  • Amount:
    $29,900
  • Program:
    Collaborative Research and Development Grants
Distributed Optimization for Machine Learning on Decentralized Data and Features
  • Grant number:
    RGPIN-2019-04998
  • Fiscal year:
    2020
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual
Distributed Optimization for Machine Learning on Decentralized Data and Features
  • Grant number:
    RGPIN-2019-04998
  • Fiscal year:
    2019
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual
Advanced Malware Detection Techniques based on Artificial Intelligence and Distributed Machine Learning
  • Grant number:
    531722-2018
  • Fiscal year:
    2019
  • Amount:
    $29,900
  • Program:
    Collaborative Research and Development Grants
Intelligent Internet-Scale Multimedia Storage and Delivery
  • Grant number:
    436170-2013
  • Fiscal year:
    2018
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual
Analyzing real estate transaction and pricing data via statistical machine learning
  • Grant number:
    479555-2015
  • Fiscal year:
    2017
  • Amount:
    $29,900
  • Program:
    Collaborative Research and Development Grants
Intelligent Internet-Scale Multimedia Storage and Delivery
  • Grant number:
    436170-2013
  • Fiscal year:
    2017
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual
Intelligent Internet-Scale Multimedia Storage and Delivery
  • Grant number:
    436170-2013
  • Fiscal year:
    2016
  • Amount:
    $29,900
  • Program:
    Discovery Grants Program - Individual

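Similar grants from the National Natural Science Foundation of China (NSFC)

High-performance distributed algorithms for composite nonconvex optimization problems in machine learning
  • Grant number:
    62302068
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Distributed pose graph optimization for multi-robot systems without map merging
  • Grant number:
    62203383
  • Approval year:
    2022
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Dynamic average consensus algorithms based on distributed optimization and cooperative control of multiple robots
  • Grant number:
    62276062
  • Approval year:
    2022
  • Amount:
    CNY 550,000
  • Program:
    General Program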

Similar grants from other countries

Collaborative Research: Consensus and Distributed Optimization in Non-Convex Environments with Applications to Networked Machine Learning
  • Grant number:
    2240789
  • Fiscal year:
    2023
  • Amount:
    $29,900
  • Program:
    Standard Grant
Collaborative Research: Consensus and Distributed Optimization in Non-Convex Environments with Applications to Networked Machine Learning
  • Grant number:
    2240788
  • Fiscal year:
    2023
  • Amount:
    $29,900
  • Program:
    Standard Grant
Accelerated distributed stochastic optimization methods and applications in machine learning
  • Grant number:
    2208394
  • Fiscal year:
    2022
  • Amount:
    $29,900
  • Program:
    Standard Grant
Matrix Decomposition for Scalable Conic Optimization with Applications to Distributed Control and Machine Learning
  • Grant number:
    2154650
  • Fiscal year:
    2022
  • Amount:
    $29,900
  • Program:
    Standard Grant
Consensus-Based Distributed Optimization Algorithms of Low Computational Cost and Their Applications to Machine Learning
  • Grant number:
    21H03510
  • Fiscal year:
    2021
  • Amount:
    $29,900
  • Program:
    Grant-in-Aid for Scientific Research (B)