CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis

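Basic information

  • Award number:
    2415445
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2024
  • Funding country:
    United States
  • Project period:
    2024-04-01 to 2025-04-30
  • Project status:
    Not yet completed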

Project abstract

Data analysis can be described as the dual process of extracting information from observations, and of understanding patterns in a principled manner. This process and the deployment of data-centric technologies have recently brought unprecedented advances in many scientific fields, as well as increased global prosperity with the advent of knowledge-based economies and systems. At a high level, this revolution is driven by two thrusts: the modern technologies which allow for the collection of complex data sets, and the theories and algorithms we use to make sense of them. That said, and for all its benefits, extracting actionable knowledge from data is difficult. Observations gathered in uncontrolled environments are often high-dimensional, complex and noisy; and even when controlled experiments are used, the intricate systems that underlie them --- like those from meteorology, chemistry, medicine and biology --- can yield data sets with highly nontrivial underlying topology. This refers to properties such as the number of disconnected pieces (i.e., clusters), the existence of holes or the orientability of the data space. The research funded through this CAREER award will leverage ideas from algebraic topology to address data science questions like visualization and representation of complex data sets, as well as the challenges posed by nontrivial topology when designing learning systems for prediction and classification. This work will be integrated into the educational program of the PI through the creation of an online TDA (Topological Data Analysis) academy, with the dual purpose of lowering the barrier of entry into the field for data scientists and academics, as well as increasing the representation of underserved communities in the field of computational mathematics. 
The project provides research training opportunities for graduate students.

Understanding the set of maps between topological spaces has led to rich and sophisticated mathematics, for it subsumes algebraic invariants like homotopy groups and generalized (co)homology theories. And while several data science questions are discrete versions of mapping space problems --- including nonlinear dimensionality reduction and supervised learning --- the corresponding theoretical and algorithmic treatment is currently lacking. This CAREER award will contribute towards remedying this situation. The research articulated here seeks to launch a novel program addressing the theory and algorithms of how the underlying topology of a data set can be leveraged for data modeling (e.g., in dimensionality reduction) as well as when learning maps between complex data spaces (e.g., in supervised learning). This work will yield methodologies for the computation of topology-aware and robust multiscale coordinatizations for data via classifying spaces, a computational theory of topological obstructions to the robust extension of maps between data sets, as well as the introduction of modern deep learning paradigms in order to learn maps between non-Euclidean data sets.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Comparative Study of Machine Learning Methods for Persistence Diagrams
  • DOI:
    10.3389/frai.2021.681174
  • Publication date:
    2021-07-28
  • Journal:
  • Impact factor:
    4
  • Authors:
    D. Barnes;Luis Polanco;Jose A. Perea
  • Corresponding author:
    Jose A. Perea
Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea
Approximate and discrete Euclidean vector bundles
  • DOI:
    10.1017/fms.2023.16
  • Publication date:
    2023-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Scoccola, Luis;Perea, Jose A.
  • Corresponding author:
    Perea, Jose A.
Persistable: persistent and stable clustering
  • DOI:
    10.21105/joss.05022
  • Publication date:
    2023-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Scoccola, Luis;Rolle, Alexander
  • Corresponding author:
    Rolle, Alexander
Toroidal Coordinates: Decorrelating Circular Coordinates with Lattice Reduction
  • DOI:
    10.4230/lipics.socg.2023.57
  • Publication date:
    2023-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Scoccola, Luis;Gakhar, Hitesh;Bush, Johnathan;Schonsheck, Nikolas;Rask, Tatum;Zhou, Ling;Perea, Jose A.
  • Corresponding author:
    Perea, Jose A.

Other publications by Jose Perea

Red Flag Signs and Symptoms for Patients With Early-Onset Colorectal Cancer
  • DOI:
    10.1001/jamanetworkopen.2024.13157
  • Publication date:
    2024-05-01
  • Journal:
  • Impact factor:
    13.8
  • Authors:
    J. Demb;Jennifer M Kolb;Jonathan Dounel;Cassandra D L Fritz;Shailesh M. Advani;Yin Cao;Penny Coppernoll;Andrea J Dwyer;Jose Perea;Karen Heskett;A. Holowatyj;Christopher H Lieu;Siddharth Singh;M. Spaander;F. Vuik;Samir Gupta
  • Corresponding author:
    Samir Gupta

Other grants held by Jose Perea

AF: Small: Bundle-theoretic methods for local-to-global inference
  • Award number:
    2006661
  • Fiscal year:
    2020
  • Amount:
    $400,000
  • Project type:
    Standard Grant
CAREER: Machine learning, Mapping Spaces, and Obstruction Theoretic Methods in Topological Data Analysis
  • Award number:
    1943758
  • Fiscal year:
    2020
  • Amount:
    $400,000
  • Project type:
    Continuing Grant
CDS&E: Collaborative Research: Machine Learning on Dynamical Systems via Topological Features
  • Award number:
    1622301
  • Fiscal year:
    2016
  • Amount:
    $400,000
  • Project type:
    Standard Grant

Similar grants from the National Natural Science Foundation of China (NSFC)

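Co-adaptive learning of contact surfaces and grasping strategies for complex robotic manipulation
  • Award number:
    52305030
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund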
Machine-learning-driven robust optimal control of complex quantum systems
  • Award number:
    62373342
  • Year approved:
    2023
  • Amount:
    CNY 500,000
  • Project type:
    General Program
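Machine-learning-enhanced multiscale prediction of solid electrolyte interphase structure
  • Award number:
    22303058
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund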
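Machine-learning-guided construction of novel electrolyte systems for high-performance low-temperature lithium-ion batteries
  • Award number:
    52303299
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund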
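Machine learning methods for refined processing of massive gravity satellite observation data
  • Award number:
    42374004
  • Year approved:
    2023
  • Amount:
    CNY 510,000
  • Project type:
    General Program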

Similar overseas grants

CAREER: Intelligent Battery Management with Safe, Efficient, Fast-Adaption Reinforcement Learning and Physics-Inspired Machine Learning: From Cells to Packs
  • Award number:
    2340194
  • Fiscal year:
    2024
  • Amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Towards Trustworthy Machine Learning via Learning Trustworthy Representations: An Information-Theoretic Framework
  • Award number:
    2339686
  • Fiscal year:
    2024
  • Amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Integrated and end-to-end machine learning pipeline for edge-enabled IoT systems: a resource-aware and QoS-aware perspective
  • Award number:
    2340075
  • Fiscal year:
    2024
  • Amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Heterogeneous Neuromorphic and Edge Computing Systems for Realtime Machine Learning Technologies
  • Award number:
    2340249
  • Fiscal year:
    2024
  • Amount:
    $400,000
  • Project type:
    Continuing Grant
CAREER: Algorithm-Hardware Co-design of Efficient Large Graph Machine Learning for Electronic Design Automation
  • Award number:
    2340273
  • Fiscal year:
    2024
  • Amount:
    $400,000
  • Project type:
    Continuing Grant