CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis


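Basic Information

  • Award number:
    2415445
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2024
  • Funding country:
    United States
  • Start and end dates:
    2024-04-01 to 2025-04-30
  • Project status:
    Not concluded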

Project Abstract

Data analysis can be described as the dual process of extracting information from observations, and of understanding patterns in a principled manner. This process and the deployment of data-centric technologies have recently brought unprecedented advances in many scientific fields, as well as increased global prosperity with the advent of knowledge-based economies and systems. At a high level, this revolution is driven by two thrusts: the modern technologies which allow for the collection of complex data sets, and the theories and algorithms we use to make sense of them. That said, and for all its benefits, extracting actionable knowledge from data is difficult. Observations gathered in uncontrolled environments are often high-dimensional, complex and noisy; and even when controlled experiments are used, the intricate systems that underlie them --- like those from meteorology, chemistry, medicine and biology --- can yield data sets with highly nontrivial underlying topology. This refers to properties such as the number of disconnected pieces (i.e., clusters), the existence of holes or the orientability of the data space. The research funded through this CAREER award will leverage ideas from algebraic topology to address data science questions like visualization and representation of complex data sets, as well as the challenges posed by nontrivial topology when designing learning systems for prediction and classification. This work will be integrated into the educational program of the PI through the creation of an online TDA (Topological Data Analysis) academy, with the dual purpose of lowering the barrier of entry into the field for data scientists and academics, as well as increasing the representation of underserved communities in the field of computational mathematics. 
The project provides research training opportunities for graduate students.

Understanding the set of maps between topological spaces has led to rich and sophisticated mathematics, for it subsumes algebraic invariants like homotopy groups and generalized (co)homology theories. And while several data science questions are discrete versions of mapping space problems --- including nonlinear dimensionality reduction and supervised learning --- the corresponding theoretical and algorithmic treatment is currently lacking. This CAREER award will contribute towards remedying this situation. The work articulated here seeks to launch a novel research program addressing the theory and algorithms of how the underlying topology of a data set can be leveraged for data modeling (e.g., in dimensionality reduction) as well as when learning maps between complex data spaces (e.g., in supervised learning). This work will yield methodologies for the computation of topology-aware and robust multiscale coordinatizations for data via classifying spaces, a computational theory of topological obstructions to the robust extension of maps between data sets, as well as the introduction of modern deep learning paradigms in order to learn maps between non-Euclidean data sets.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Toroidal Coordinates: Decorrelating Circular Coordinates with Lattice Reduction
  • DOI:
    10.4230/lipics.socg.2023.57
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Scoccola, Luis; Gakhar, Hitesh; Bush, Johnathan; Schonsheck, Nikolas; Rask, Tatum; Zhou, Ling; Perea, Jose A.
  • Corresponding author:
    Perea, Jose A.
Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea
FibeRed: Fiberwise Dimensionality Reduction of Topologically Complex Data with Vector Bundles
Persistable: persistent and stable clustering
  • DOI:
    10.21105/joss.05022
  • Publication date:
    2023-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Luis Scoccola; Alexander Rolle
  • Corresponding authors:
    Luis Scoccola; Alexander Rolle
Sliding window persistence of quasiperiodic functions


Other grants by Jose Perea

CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis
  • Award number:
    1943758
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
AF: Small: Bundle-theoretic methods for local-to-global inference
  • Award number:
    2006661
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
CDS&E: Collaborative Research: Machine Learning on Dynamical Systems via Topological Features
  • Award number:
    1622301
  • Fiscal year:
    2016
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant

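Similar NSFC (National Natural Science Foundation of China) grants

Predicting Role Transitions of Scientific Knowledge Based on Interpretable Machine Learning
  • Award number:
    72304108
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
An Imaging-Genetics Machine Learning Model for Predicting the Effect of Transcranial Magnetic Stimulation Treatment on Negative Symptoms of Schizophrenia
  • Award number:
    82371510
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Machine Learning-Based Methods for Alteration Mineral Exploration and Their Application: A Case Study of the Gangjiang Deposit, Tibet
  • Award number:
    42363009
  • Approval year:
    2023
  • Funding amount:
    CNY 320,000
  • Project type:
    Fund for Less Developed Regions
Machine Learning-Driven Robust Optimal Control of Complex Quantum Systems
  • Award number:
    62373342
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
Environmental Behavior and Ecotoxicity of EPFRs in Soil Porous Media: A Machine Learning-Based Study
  • Award number:
    42377385
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program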

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Award number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
  • Award number:
    2339216
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Integrated and end-to-end machine learning pipeline for edge-enabled IoT systems: a resource-aware and QoS-aware perspective
  • Award number:
    2340075
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Gaussian Processes for Scientific Machine Learning: Theoretical Analysis and Computational Algorithms
  • Award number:
    2337678
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Arlene George F32
  • Award number:
    10722238
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type: