Numerical Optimization, Formulations and Algorithms, for Machine Learning
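Basic information
- Grant number: RGPIN-2019-04067
- Principal investigator:
- Amount: $24,000
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract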
Machine learning (ML) requires stronger control over optimization formulations and algorithms, as well as their implementations. Local graph optimization formulations and algorithms, and second-order methods, both exploit finer structure, and applying them at scale requires revisiting theoretical and implementation issues. This proposal does both.
Objective 1: Standard graph-based methods are intrinsically biased towards global relationships among nodes, so they struggle to identify small- and meso-scale clusters, which are often more meaningful in practice. This motivates the development of locally-biased formulations. We will develop optimization formulations whose solutions are locally biased around a target set of nodes and have a large number of zeros away from the target nodes. Beyond clustering, the solutions of the new formulations will be used for personalized ordering of nodes around the target nodes and for routing mass from the target nodes to other nodes in the graph. We will develop new algorithms for these problems whose running time depends on the number of non-zeros at optimality rather than on the size of the entire graph.
Moreover, recent ML datasets require gigabytes or terabytes of memory, but ML applications do not necessarily require highly accurate solutions. Rather than seeking highly accurate solutions, we need to develop methods that scale better with the size of the data and that scale to thousands of processors. Based on these principles, we propose the following objectives.
Objective 2: We will develop new methods that are more efficient than first-order methods on highly non-linear and non-convex problems. Currently, most researchers focus on first-order methods for ML. However, these methods perform poorly on real-world datasets with high correlation among samples or features. The new methods will make use of curvature information of the objective function in an inexpensive manner. Our aim is to extend Newton-type coordinate descent frameworks to non-convex problems, improve their iteration complexity, and develop stochastic methods that converge to higher-order stationary points.
Objective 3: We will develop new communication-avoiding optimization algorithms for ML problems. We will apply the new methods to classification and regression problems, such as logistic, linear and non-linear regression. The new methods will remain scalable as the number of processors increases for data with high correlation among samples or features.
Our research will allow Canada to compete on the global stage in data analysis. This will be realized through papers, implementations, and dissemination of our work at conferences. Moreover, we will train highly employable students, since modern IT industries that use data analysis tools demand high-quality personnel with such experience.
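To make the idea in Objective 1 concrete (a sparse solution concentrated around target nodes, computed in time that depends on the non-zeros rather than on the whole graph), here is a minimal sketch of the well-known "push" procedure for approximate personalized PageRank. It is a stand-in illustration, not the formulation or algorithm proposed in this project. The function name `approximate_ppr`, the parameters `alpha` and `eps`, and the adjacency-dictionary input are assumptions made for the example; the graph is assumed undirected, unweighted, and free of isolated nodes.

```python
from collections import defaultdict, deque


def approximate_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank around `seed` using local pushes.

    adj   : dict mapping node -> list of neighbours (hypothetical input format)
    alpha : teleportation probability (assumed parameter)
    eps   : tolerance; a node is pushed while residual >= eps * degree
    Only nodes touched by a push are stored, so the work depends on the
    support of the returned vector, not on the number of nodes in the graph.
    """
    p = defaultdict(float)  # sparse approximate solution
    r = defaultdict(float)  # residual mass not yet distributed
    r[seed] = 1.0
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg_u = len(adj[u])
        if r[u] < eps * deg_u:
            continue  # residual already below the threshold
        mass = r[u]
        p[u] += alpha * mass  # keep a fraction alpha at u
        r[u] = 0.0
        share = (1.0 - alpha) * mass / deg_u
        for v in adj[u]:
            before = r[v]
            r[v] += share  # spread the rest evenly over the neighbours
            threshold = eps * len(adj[v])
            if before < threshold <= r[v]:
                queue.append(v)  # v just became eligible for a push
    return dict(p)


if __name__ == "__main__":
    # Path graph 0 - 1 - ... - 999 with the target node at one end.
    n = 1000
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    scores = approximate_ppr(path, seed=0, eps=1e-3)
    print(len(scores), "of", n, "nodes have a non-zero score")
```

Every push moves at least `alpha * eps * degree` mass into `p`, and the total mass is 1, so the loop terminates after a bounded number of pushes regardless of the graph size.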
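Objective 2 refers to Newton-type coordinate descent, which uses curvature cheaply by taking one-dimensional Newton steps. The sketch below illustrates that idea on the convex case only: L2-regularized logistic regression with a backtracking safeguard. It is not the project's method, and it does not cover the non-convex or stochastic extensions named in the abstract. All names (`newton_coordinate_descent`, `objective`, `lam`, `epochs`) are assumptions chosen for the example, and labels are assumed to lie in {0, 1}.

```python
import math


def _sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def _log1pexp(t):
    # Numerically stable log(1 + exp(t)).
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))


def _loss(z, y):
    # Mean logistic loss for margins z and labels y in {0, 1}.
    return sum(_log1pexp(zi) - yi * zi for zi, yi in zip(z, y)) / len(z)


def objective(X, y, w, lam):
    """Mean logistic loss plus (lam / 2) * ||w||^2."""
    z = [sum(a * b for a, b in zip(row, w)) for row in X]
    return _loss(z, y) + 0.5 * lam * sum(v * v for v in w)


def newton_coordinate_descent(X, y, lam=0.1, epochs=20):
    n, d = len(X), len(X[0])
    w = [0.0] * d
    z = [0.0] * n  # cached margins X @ w, updated after every coordinate step
    for _ in range(epochs):
        for j in range(d):
            col = [row[j] for row in X]
            s = [_sigmoid(zi) for zi in z]
            # First and second partial derivatives with respect to w[j].
            g = sum(c * (si - yi) for c, si, yi in zip(col, s, y)) / n + lam * w[j]
            h = sum(c * c * si * (1.0 - si) for c, si in zip(col, s)) / n + lam
            step = -g / h  # one-dimensional Newton direction
            f0 = _loss(z, y) + 0.5 * lam * w[j] ** 2
            t = 1.0
            while t > 1e-8:
                z_new = [zi + t * step * c for zi, c in zip(z, col)]
                f_new = _loss(z_new, y) + 0.5 * lam * (w[j] + t * step) ** 2
                if f_new <= f0 + 1e-4 * t * g * step:  # sufficient decrease
                    w[j] += t * step
                    z = z_new
                    break
                t *= 0.5  # backtrack; the coordinate is skipped if t gets tiny
    return w


if __name__ == "__main__":
    X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0], [1.0, 0.5], [1.0, -0.5]]
    y = [0, 0, 1, 1, 0, 1]
    w = newton_coordinate_descent(X, y)
    print(w, objective(X, y, w, 0.1))
```

Each coordinate update needs only one column of the data and a scalar second derivative, which is what makes the curvature information inexpensive compared with forming a full Hessian.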
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Fountoulakis, Kimon
Variational perspective on local graph clustering
- DOI: 10.1007/s10107-017-1214-8
- Published: 2019-03-01
- Journal:
- Impact factor: 2.7
- Authors: Fountoulakis, Kimon; Roosta-Khorasani, Farbod; Mahoney, Michael W.
- Corresponding author: Mahoney, Michael W.
A Preconditioner for a Primal-Dual Newton Conjugate Gradient Method for Compressed Sensing Problems
- DOI: 10.1137/141002062
- Published: 2015-01-01
- Journal:
- Impact factor: 3.1
- Authors: Dassios, Ioannis; Fountoulakis, Kimon; Gondzio, Jacek
- Corresponding author: Gondzio, Jacek
An Optimization Approach to Locally-Biased Graph Algorithms
- DOI: 10.1109/jproc.2016.2637349
- Published: 2017-02-01
- Journal:
- Impact factor: 20.6
- Authors: Fountoulakis, Kimon; Gleich, David F.; Mahoney, Michael W.
- Corresponding author: Mahoney, Michael W.
Other grants held by Fountoulakis, Kimon
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2021
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2020
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2019
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: DGECR-2019-00147
- Fiscal year: 2019
- Amount: $24,000
- Program: Discovery Launch Supplement
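Similar grants from the National Natural Science Foundation of China
Research on intelligent image adaptation methods based on adaptive optimization of semantic layout
- Grant number: 62302356
- Approval year: 2023
- Amount: 300,000 yuan
- Program: Young Scientists Fund
Inference-based optimization methods for resource allocation in radar networks under uncertainty
- Grant number: 62371379
- Approval year: 2023
- Amount: 500,000 yuan
- Program: General Program
Research on adaptive fusion optimization of heterogeneous features and robust keypoint matching methods
- Grant number: 62176242
- Approval year: 2021
- Amount: 570,000 yuan
- Program: General Program
Research on optimal allocation of marine resources based on a population survival-pressure graph model
- Grant number: 42106190
- Approval year: 2021
- Amount: 240,000 yuan
- Program: Young Scientists Fund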
Similar overseas grants
Clustering and semi-supervised learning on large heterogeneous graphs: Mathematical formulations and numerical optimization algorithms
- Grant number: 569398-2022
- Fiscal year: 2022
- Amount: $24,000
- Program: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2021
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2020
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: RGPIN-2019-04067
- Fiscal year: 2019
- Amount: $24,000
- Program: Discovery Grants Program - Individual
Numerical Optimization, Formulations and Algorithms, for Machine Learning
- Grant number: DGECR-2019-00147
- Fiscal year: 2019
- Amount: $24,000
- Program: Discovery Launch Supplement