Collaborative Research: Aggregated Monte Carlo: A General Framework for Distributed Bayesian Inference in Massive Spatiotemporal Data


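Basic information

  • Award number:
    2220840
  • Principal investigator:
  • Amount:
    $172,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Period:
    2021-10-01 to 2024-05-31
  • Status:
    Completed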

Project abstract

With tremendous advancements in spatial referencing technologies such as Global Positioning Systems, which can identify geographical coordinates with a simple hand-held device, researchers in various disciplines have gathered an unprecedented variety of geo-coded temporal data. Consequently, modeling spatiotemporal data with flexible statistical models has become an enormously active area of research over the last decade in many disciplines, including the environmental sciences, health sciences and oceanography. In all these applications, researchers require efficient data modeling tools that can adapt to the complexity and size of modern spatiotemporal data, empowering them to quickly fit a variety of scientific models that explain the intricate nature of associations. This research project develops a new class of distributed Bayesian statistical algorithms, Aggregated Monte Carlo (AMC), that enables efficient modeling of massive spatiotemporal data on an unprecedented scale. While the motivation of the PIs comes primarily from complex modeling and uncertainty quantification of massive spatiotemporal data, the proposed algorithm is general enough to contribute to the related literature on machine learning and computer experiments. The overarching goal also includes the development of software toolkits to better serve practitioners in related disciplines.

There has been an explosion in the size, complexity, and availability of spatiotemporally indexed data. This growth has outpaced the development of Bayesian statistical methodology: fitting state-of-the-art methods based on stochastic processes for analyzing spatiotemporal point-referenced and point process data is prohibitively slow unless restrictive assumptions are imposed. The main problem is that the Monte Carlo (MC) computations in Markov chain Monte Carlo (MCMC) methods for fitting these models scale poorly with the size of the data. To solve this problem, the PIs develop a general framework, called Aggregated Monte Carlo (AMC), for scaling MC computations in the stochastic process-based modeling of massive space-time data using a divide-and-conquer technique. AMC has three stages: dividing the data into smaller subsets, obtaining posterior samples of the unknown parameters and latent variables across all the subsets using MCMC, and combining the MCMC samples from all the subsets. AMC is tuned to boost the scalability of any state-of-the-art model based on a stochastic process. Computationally, the main innovations include the development of general division and combination schemes for data with diverse spatiotemporal structures. Theoretically, the project provides bounds on the number of subsets such that the posterior distribution estimated using AMC provides a near-optimal approximation of the full data posterior distribution in terms of decay of the posterior risks and contraction rates. Conceptually, AMC provides a natural extension of the existing results for combination using the barycenter of subset posterior distributions in parametric models to non-parametric models with complex spatiotemporal structures. The most appealing features of AMC are that it exploits parallel computer architecture for efficient and flexible modeling of massive spatiotemporal data and that it provides posterior inference and uncertainty estimates with theoretical guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
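
The three stages named in the abstract (divide, sample on each subset, combine) can be illustrated with a minimal sketch. This is not the PIs' software or their actual division and combination schemes. It assumes a toy model (independent normal observations with known standard deviation and a flat prior on the mean), a random partition, a subset likelihood raised to the power k so that each subset posterior has roughly full-data spread, and quantile averaging as the combination step, which in one dimension gives the Wasserstein-2 barycenter of the subset posteriors. All function names (`split_data`, `subset_mcmc`, `combine`, `run_demo`) are hypothetical, and the data are simulated.

```python
import math
import random
import statistics


def split_data(data, k, rng):
    """Stage 1: randomly partition the data into k disjoint subsets."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def subset_mcmc(subset, k, sigma, n_draws, rng, step=0.05):
    """Stage 2: random-walk Metropolis for mu on one subset.

    Toy model (assumption): y_i ~ N(mu, sigma^2), sigma known, flat prior.
    The subset likelihood is raised to the power k (also an assumption) so
    the subset posterior has roughly the spread of the full-data posterior.
    """
    n = len(subset)
    ybar = statistics.fmean(subset)

    def log_post(mu):
        # Log of the powered subset likelihood, up to an additive constant.
        return -k * n * (mu - ybar) ** 2 / (2.0 * sigma ** 2)

    mu = ybar  # start at the subset mean, so no burn-in is used here
    draws = []
    for _ in range(n_draws):
        proposal = mu + rng.gauss(0.0, step)
        # 1 - random() lies in (0, 1], which keeps log() finite.
        if math.log(1.0 - rng.random()) < log_post(proposal) - log_post(mu):
            mu = proposal
        draws.append(mu)
    return draws


def combine(draws_per_subset):
    """Stage 3: average the sorted draws across subsets.

    With equal numbers of draws per subset, this quantile averaging is the
    one-dimensional Wasserstein-2 barycenter of the empirical distributions.
    """
    sorted_draws = [sorted(d) for d in draws_per_subset]
    return [statistics.fmean(q) for q in zip(*sorted_draws)]


def run_demo(n=2000, k=4, n_draws=2000, seed=0):
    rng = random.Random(seed)
    data = [rng.gauss(2.0, 1.0) for _ in range(n)]  # simulated, not real data
    subsets = split_data(data, k, rng)
    # The subset chains are independent, so they could run in parallel.
    draws = [subset_mcmc(s, k, 1.0, n_draws, rng) for s in subsets]
    return data, combine(draws)


if __name__ == "__main__":
    data, combined = run_demo()
    print("full-data mean:", statistics.fmean(data))
    print("combined posterior mean:", statistics.fmean(combined))
    print("combined posterior sd:", statistics.stdev(combined))
```

In this toy setting the combined draws should center near the full-data mean with a spread close to sigma divided by the square root of n. Spatiotemporal models with latent processes would need division and combination schemes that respect the dependence structure, which this sketch does not attempt.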

Project outcomes

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Bayesian Dynamic Feature Partitioning in High-Dimensional Regression With Big Data
  • DOI:
    10.1080/00401706.2021.1952899
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Gutierrez, Rene;Guhaniyogi, Rajarshi
  • Corresponding author:
    Guhaniyogi, Rajarshi
Distributed Bayesian Varying Coefficient Modeling Using a Gaussian Process Prior
  • DOI:
  • Publication date:
    2020-06
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rajarshi Guhaniyogi;Cheng Li;T. Savitsky;Sanvesh Srivastava
  • Corresponding author:
    Rajarshi Guhaniyogi;Cheng Li;T. Savitsky;Sanvesh Srivastava
Distributed Bayesian Inference in Massive Spatial Data
  • DOI:
    10.1214/22-sts868
  • Publication date:
    2023-01
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Rajarshi Guhaniyogi;Cheng Li;T. Savitsky;Sanvesh Srivastava
  • Corresponding author:
    Rajarshi Guhaniyogi;Cheng Li;T. Savitsky;Sanvesh Srivastava

Other publications by Rajarshi Guhaniyogi

Bayesian Conditional Density Filtering
Bayesian nonparametric areal wombling for small‐scale maps with an application to urinary bladder cancer data from Connecticut
  • DOI:
    10.1002/sim.7408
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    2
  • Authors:
    Rajarshi Guhaniyogi
  • Corresponding author:
    Rajarshi Guhaniyogi
Approximated Bayesian Inference for Massive Streaming Data
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rajarshi Guhaniyogi;R. Willett;D. Dunson
  • Corresponding author:
    D. Dunson
InVA: Integrative Variational Autoencoder for Harmonization of Multi-modal Neuroimaging Data
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Bowen Lei;Rajarshi Guhaniyogi;Krishnendu Chandra;Aaron Scheffler;Bani Mallick
  • Corresponding author:
    Bani Mallick
Multivariate bias adjusted tapered predictive process models
  • DOI:
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rajarshi Guhaniyogi
  • Corresponding author:
    Rajarshi Guhaniyogi


Other grants by Rajarshi Guhaniyogi

Collaborative Research: Use of Random Compression Matrices For Scalable Inference in High Dimensional Structured Regressions
  • Award number:
    2210672
  • Fiscal year:
    2022
  • Award amount:
    $172,000
  • Award type:
    Standard Grant
Collaborative Research: Aggregated Monte Carlo: A General Framework for Distributed Bayesian Inference in Massive Spatiotemporal Data
  • Award number:
    1854662
  • Fiscal year:
    2019
  • Award amount:
    $172,000
  • Award type:
    Standard Grant

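Similar NSFC grants

Research on lysosome-targeted aggregating drug-free antitumor nanoparticles
  • Award number:
    52303170
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Mechanisms and intervention strategies for renal tubular cell senescence induced by prion protein aggregates
  • Award number:
    82330019
  • Approval year:
    2023
  • Award amount:
    CNY 2,200,000
  • Award type:
    Key Program
Clearance of α-synuclein aggregates in Parkinson's disease by black phosphorus nanosheets and its molecular mechanism
  • Award number:
    32360241
  • Approval year:
    2023
  • Award amount:
    CNY 340,000
  • Award type:
    Fund for Less Developed Regions
Mechanism by which mogroside V inhibits abnormal phosphorylation and aggregation of α-synuclein
  • Award number:
    82360241
  • Approval year:
    2023
  • Award amount:
    CNY 322,000
  • Award type:
    Fund for Less Developed Regions
Mechanism by which microglia-mediated pathological aggregation and propagation of Tau contributes to post-radiation cognitive impairment
  • Award number:
    82371408
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Award type:
    General Program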

Similar overseas grants

Collaborative Research: SHF: Small: Tangram: Scaling into the Exascale Era with Reconfigurable Aggregated "Virtual Chips"
  • Award number:
    2245129
  • Fiscal year:
    2022
  • Award amount:
    $172,000
  • Award type:
    Standard Grant
Collaborative Research: SHF: Small: Tangram: Scaling into the Exascale Era with Reconfigurable Aggregated "Virtual Chips"
  • Award number:
    2124525
  • Fiscal year:
    2021
  • Award amount:
    $172,000
  • Award type:
    Standard Grant
Collaborative Research: SHF: Small: Tangram: Scaling into the Exascale Era with Reconfigurable Aggregated "Virtual Chips"
  • Award number:
    2008911
  • Fiscal year:
    2020
  • Award amount:
    $172,000
  • Award type:
    Standard Grant
Collaborative Research: SHF: Small: Tangram: Scaling into the Exascale Era with Reconfigurable Aggregated "Virtual Chips"
  • Award number:
    2007796
  • Fiscal year:
    2020
  • Award amount:
    $172,000
  • Award type:
    Standard Grant
Collaborative Research: SHF: Small: Tangram: Scaling into the Exascale Era with Reconfigurable Aggregated "Virtual Chips"
  • Award number:
    2008477
  • Fiscal year:
    2020
  • Award amount:
    $172,000
  • Award type:
    Standard Grant