Numerical Optimization, Formulations and Algorithms, for Machine Learning
Basic Information
- Grant number: RGPIN-2019-04067
- Principal investigator: Fountoulakis, Kimon
- Amount: $24,000
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Duration: 2022-01-01 to 2023-12-31
- Status: Completed
- Source:
- Keywords:
Project Summary
Machine learning (ML) increasingly demands stronger control over optimization formulations and algorithms, as well as over their implementations. Locally-biased graph optimization and second-order methods both exploit finer structure in the data, and applying them at scale requires revisiting theoretical and implementation issues alike. This proposal does both.

Objective 1: Standard graph-based methods are intrinsically biased towards global relationships among nodes, so they struggle to identify small- and meso-scale clusters, which are often the more meaningful ones in practice. This motivates locally-biased formulations. We will develop optimization formulations whose solutions are localized around a target set of nodes, with a large number of zeros away from the target nodes. Beyond clustering, these solutions will be used for personalized ordering of nodes around the targets and for routing mass from the target nodes to other nodes in the graph. We will develop new algorithms for these problems whose running time depends on the number of non-zeros at optimality rather than on the size of the entire graph.

Moreover, modern ML datasets require gigabytes or terabytes of memory, yet ML applications do not necessarily require highly accurate solutions. Rather than pursuing high accuracy, we need methods that scale better with the size of the data and that scale to thousands of processors. Based on these principles, we propose the following objectives.

Objective 2: We will develop new methods that are more efficient than first-order methods on highly non-linear and non-convex problems. Most current research on optimization for ML focuses on first-order methods, but these perform poorly on real-world datasets with high correlation among samples or features. The new methods will exploit curvature information of the objective function at low cost. Our aim is to extend Newton-type coordinate descent frameworks to non-convex problems, improve their iteration complexity, and develop stochastic methods that converge to higher-order stationary points.

Objective 3: We will develop new communication-avoiding optimization algorithms for ML problems and apply them to classification and regression tasks such as logistic, linear, and non-linear regression. The new methods will remain scalable as the number of processors grows, even for data with high correlation among samples or features.

Our research will allow Canada to compete on the global stage in data analysis. This will be realized through papers, implementations, and dissemination of our work at conferences. Moreover, we will train highly employable students, since modern IT-sector industries that rely on data analysis tools demand personnel with exactly this experience.
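To make Objective 1 concrete, the sketch below shows a minimal, illustrative version of a locally-biased formulation: an l1-regularized, seed-personalized quadratic objective on a graph, solved with proximal gradient descent. It is close in spirit to published locally-biased PageRank-style objectives but is not the proposal's exact formulation; the matrix Q, the parameter values, and the function names (`local_l1_pagerank`, `soft_threshold`) are assumptions made for illustration. The l1 penalty drives the solution to exact zeros away from the seed nodes; the dense sketch only shows the formulation, not the output-sensitive running time a careful local implementation would achieve.

```python
# Illustrative sketch, NOT the proposal's formulation: an l1-regularized,
# seed-personalized quadratic objective on a graph, solved by proximal
# gradient descent (ISTA). The l1 penalty zeroes out the solution away
# from the seed nodes, so the recovered cluster stays local.
import numpy as np
import scipy.sparse as sp


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def local_l1_pagerank(A, seeds, alpha=0.15, rho=0.02, iters=200):
    """Minimize 0.5 * x'Qx - alpha * s'x + rho * ||x||_1 with
    Q = D - (1 - alpha) * A (positive definite) and s the seed indicator.
    An illustrative, unnormalized variant of locally-biased PageRank-style
    objectives."""
    n = A.shape[0]
    d = np.asarray(A.sum(axis=1)).ravel()
    Q = sp.diags(d) - (1.0 - alpha) * A
    s = np.zeros(n)
    s[list(seeds)] = 1.0 / len(seeds)

    step = 1.0 / (2.0 * d.max())   # safe step size: below 1 / lambda_max(Q)
    x = np.zeros(n)
    for _ in range(iters):
        grad = Q @ x - alpha * s
        x = soft_threshold(x - step * grad, step * rho)
    return x


if __name__ == "__main__":
    # Toy graph: two 5-node cliques joined by a single bridge edge.
    clique = np.ones((5, 5)) - np.eye(5)
    A = sp.lil_matrix(sp.block_diag([clique, clique]))
    A[4, 5] = A[5, 4] = 1.0
    x = local_l1_pagerank(sp.csr_matrix(A), seeds=[0])
    print("non-zero nodes:", np.flatnonzero(x))  # stays within the seed's clique
```

On this toy graph the recovered support is the seed's 5-node clique; the broader point in the proposal is that such sparse, local solutions can be computed without touching the rest of the graph.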
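Objective 2 refers to Newton-type coordinate descent frameworks that use curvature information cheaply. The following is a hypothetical sketch of that general idea, not the proposal's algorithm: coordinate descent for l2-regularized logistic regression in which each update takes a 1-D Newton step built from the exact per-coordinate second derivative, with a step-halving safeguard. The data generator deliberately produces highly correlated columns, the regime where plain first-order methods tend to crawl. All names (`newton_cd_logreg`, `objective`) and parameter choices are assumptions for illustration.

```python
# Hypothetical sketch of a Newton-type coordinate descent method for
# l2-regularized logistic regression: each update uses the exact second
# derivative of the objective along one coordinate (cheap curvature),
# plus a step-halving safeguard, and maintains the margins X @ w
# incrementally so a coordinate update costs O(n) rather than O(n d).
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def objective(margins, y, w, lam):
    """Mean logistic loss at the given margins plus the l2 penalty."""
    return np.mean(np.log1p(np.exp(-y * margins))) + 0.5 * lam * np.dot(w, w)


def newton_cd_logreg(X, y, lam=0.1, sweeps=30):
    n, d = X.shape
    w = np.zeros(d)
    m = np.zeros(n)                          # margins X @ w, kept up to date
    for _ in range(sweeps):
        for j in range(d):
            p = sigmoid(m)
            # First and second derivative of the objective along coordinate j.
            g = -np.mean(y * X[:, j] * sigmoid(-y * m)) + lam * w[j]
            h = np.mean(X[:, j] ** 2 * p * (1.0 - p)) + lam
            step = -g / h                    # 1-D Newton step
            # Safeguard: halve the step until the objective does not increase.
            f_old = objective(m, y, w, lam)
            while True:
                w_trial = w.copy()
                w_trial[j] += step
                if (objective(m + X[:, j] * step, y, w_trial, lam) <= f_old
                        or abs(step) < 1e-12):
                    break
                step *= 0.5
            w[j] += step
            m += X[:, j] * step
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data with highly correlated columns -- the setting where
    # first-order methods slow down and curvature information helps.
    n, d = 500, 20
    base = rng.standard_normal((n, 1))
    X = base + 0.1 * rng.standard_normal((n, d))
    y = np.sign(X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))
    w = newton_cd_logreg(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```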
Project Outcomes
Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
Other Publications by Fountoulakis, Kimon
Variational perspective on local graph clustering
- DOI: 10.1007/s10107-017-1214-8
- Publication date: 2019-03-01
- Journal:
- Impact factor: 2.7
- Authors: Fountoulakis, Kimon; Roosta-Khorasani, Farbod; Mahoney, Michael W.
- Corresponding author: Mahoney, Michael W.
A Preconditioner for a Primal-Dual Newton Conjugate Gradient Method for Compressed Sensing Problems
- DOI: 10.1137/141002062
- Publication date: 2015-01-01
- Journal:
- Impact factor: 3.1
- Authors: Dassios, Ioannis; Fountoulakis, Kimon; Gondzio, Jacek
- Corresponding author: Gondzio, Jacek
An Optimization Approach to Locally-Biased Graph Algorithms
- DOI: 10.1109/jproc.2016.2637349
- Publication date: 2017-02-01
- Journal:
- Impact factor: 20.6
- Authors: Fountoulakis, Kimon; Gleich, David F.; Mahoney, Michael W.
- Corresponding author: Mahoney, Michael W.
Other Grants by Fountoulakis, Kimon
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2021
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2020
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: DGECR-2019-00147
- Fiscal year: 2019
- Funding amount: $24,000
- Program: Discovery Launch Supplement
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2019
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Similar NSFC Grants
Inference-optimized resource allocation methods for radar networks under uncertainty
- Grant number: 62371379
- Year approved: 2023
- Funding amount: CNY 500,000
- Program: General Program
Research on intelligent image adaptation methods based on adaptive semantic layout optimization
- Grant number: 62302356
- Year approved: 2023
- Funding amount: CNY 300,000
- Program: Young Scientists Fund
Research on optimal allocation methods for marine resources based on a population survival-pressure graph model
- Grant number:
- Year approved: 2021
- Funding amount: CNY 300,000
- Program: Young Scientists Fund
Research on adaptive fusion optimization of heterogeneous features and robust keypoint matching methods
- Grant number: 62176242
- Year approved: 2021
- Funding amount: CNY 570,000
- Program: General Program
Similar Overseas Grants
Clustering and semi-supervised learning on large heterogeneous graphs: Mathematical formulations and numerical optimization algorithms
- Grant number: 569398-2022
- Fiscal year: 2022
- Funding amount: $24,000
- Program: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2021
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2020
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2019
- Funding amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: DGECR-2019-00147
- Fiscal year: 2019
- Funding amount: $24,000
- Program: Discovery Launch Supplement