Statistical and Computational Aspects of Geometry- and Topology-Based Machine Learning

Basic Information

Project Abstract

As complex high-dimensional data is generated at large scale across a wide variety of scientific fields, exploratory data analysis is crucial for gaining a better understanding of the data-generating process. Indeed, the primary step in the data analysis pipeline is arguably the use of unsupervised machine learning methods that help the data analyst visualize and understand the data effectively. This project will concentrate on developing a deeper understanding of such methods so that their outputs can be interpreted better. The novel methodology developed in this project will be disseminated to applied fields through the PIs' existing collaborations. Implementations of the developed methodologies will be made available to the wider public via open-source packages. This project will also train graduate students and undergraduate students (from socio-economically disadvantaged backgrounds) for successful careers in statistical data science.

More specifically, the main goal of this project is to develop statistical and computational methods to extract the low-dimensional geometric and topological structure available in high-dimensional datasets. The contributions of this project will hence lie at the intersection of statistical machine learning and geometric and topological data analysis. The PIs will work both on developing a deeper understanding of existing methodology through a geometric lens and on proposing novel methodology for unsupervised machine learning through a topological lens. In the first part, the PIs will study why certain geometric orthogonal cone structures emerge when low-dimensional embeddings of high-dimensional data are constructed with non-linear dimension reduction techniques like kernel principal component analysis and diffusion maps, non-local methods like ISOMAP and Locally Linear Embedding (LLE), and topological methods like Uniform Manifold Approximation and Projection (UMAP). In the second part, the PIs will develop and analyze novel dimension reduction techniques that preserve the topological information available in high-dimensional data. Finally, the PIs will examine the use of topological regularization techniques for regression and classification, from a theoretical and methodological perspective.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Stochastic Zeroth-Order Functional Constrained Optimization: Oracle Complexity and Applications
  • DOI:
    10.1287/ijoo.2022.0085
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Nguyen, Anthony;Balasubramanian, Krishnakumar
  • Corresponding author:
    Balasubramanian, Krishnakumar
Stochastic Zeroth-Order Riemannian Derivative Estimation and Optimization
  • DOI:
    10.1287/moor.2022.1302
  • Publication date:
    2022-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jiaxiang Li;K. Balasubramanian;Shiqian Ma
  • Corresponding authors:
    Jiaxiang Li;K. Balasubramanian;Shiqian Ma
Topologically penalized regression on manifolds
  • DOI:
  • Publication date:
    2021-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Olympio Hacquard;K. Balasubramanian;G. Blanchard;W. Polonik;Clément Levrard
  • Corresponding authors:
    Olympio Hacquard;K. Balasubramanian;G. Blanchard;W. Polonik;Clément Levrard

Similar NSFC Grants

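Application of the particle-field method to computing conservation laws of gyrokinetic systems
  • Grant number:
  • Approval year:
    2020
  • Funding amount:
    CNY 240,000
  • Project type: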
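Applications of supported two-dimensional nanostructured materials in catalysis: physics above and below the overlayer
  • Grant number:
    11874033
  • Approval year:
    2018
  • Funding amount:
    CNY 640,000
  • Project type:
    General Program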
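Prediction of new structures in the Ti-O system and their application in photocatalysis
  • Grant number:
    11604056
  • Approval year:
    2016
  • Funding amount:
    CNY 220,000
  • Project type:
    Young Scientists Fund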
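Theory and methods of statistical inference and their applications in biomedicine, data analysis, and computation
  • Grant number:
    11171001
  • Approval year:
    2011
  • Funding amount:
    CNY 520,000
  • Project type:
    General Program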
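Further development of FDC applications in partial wave analysis for BES physics
  • Grant number:
    10979056
  • Approval year:
    2009
  • Funding amount:
    CNY 320,000
  • Project type:
    Joint Fund Program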

Similar Overseas Grants

Investigating the computational and statistical aspects of star and planet formation
  • Grant number:
    2263733
  • Fiscal year:
    2019
  • Funding amount:
    $324,100
  • Project type:
    Studentship
The birth of modern trends on commutative algebra and convex polytopes with statistical and computational strategies
  • Grant number:
    26220701
  • Fiscal year:
    2014
  • Funding amount:
    $324,100
  • Project type:
    Grant-in-Aid for Scientific Research (S)
Dynamical, computational and statistical aspects of stochastic processes on networks
  • Grant number:
    1208339
  • Fiscal year:
    2012
  • Funding amount:
    $324,100
  • Project type:
    Standard Grant
Mathematical Sciences: Methodological and Computational Aspects of Statistical Modeling and Inference
  • Grant number:
    8807085
  • Fiscal year:
    1988
  • Funding amount:
    $324,100
  • Project type:
    Continuing Grant
Statistical, Computational and Algorithmic Aspects of Kernel Clustering
  • Grant number:
    499522214
  • Fiscal year:
  • Funding amount:
    $324,100
  • Project type:
    Research Grants