Inference in High-Dimensional Statistical Models: Algorithmic Tractability and Computational Barriers

高维统计模型中的推理:算法易处理性和计算障碍

基本信息

  • 批准号:
    2015517
  • 负责人:
  • 金额:
    $ 20万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Standard Grant
  • 财政年份:
    2020
  • 资助国家:
    美国
  • 起止时间:
    2020-09-01 至 2023-08-31
  • 项目状态:
    已结题

项目摘要

Extracting knowledge from data using statistical and machine learning methods often involves computations, which don't scale well with dataset sizes. This is dictated by the necessity of analyzing large scale statistical models, where the scale of the data ever increases due to our unprecedented ability to accumulative massive amounts of it. Often this leads to models where the number of parameters far exceeds the amount of collected data, rendering many classical inference models ill-posed and classical computational methods prohibitively time consuming. Thus the value brought about by the abundance of data comes at the expense of the necessity to develop completely novel computational tools that are capable of dealing with the curse of dimensionality. While there is an abundance of literature devoted to designing efficient computational methods of inference in high-dimensional statistical models, it was discovered that many algorithms hit a certain computational barrier, beyond which seemingly only brute-force and thus computationally prohibitive algorithms can succeed. Not much is known regarding the fundamental computational limitations arising above this barrier, which is popularly dubbed the nformation Theoretic vs Computation gap. What is the origin of this barrier? Does it indeed correspond to the onset of algorithmically intractable problems, or is it just a matter of being more clever about designing faster algorithms? The project also provides research training opportunities for graduate students. In the present project the PI develops a completely novel approach for understanding fundamental computational barriers arising in high dimensional statistical models. The approach is based on powerful and illuminating insights derived from the field of statistical physics, specifically the theory of spin glasses. In particular, the PI intends to establish that the onset of the algorithmic barriers is caused by phase transition in the landscape of the solution space, marking a drastic change in the solution space geometry of underlying inference problems. This change in geometry of the solution space landscape taking the form of the so-called Overlap Gap Property (OGP), can further be used to rule out broad classes of algorithms as potential contenders to bridge the information theoretic and algorithmic gap. These classes of algorithms include algorithms based on local improvements, such as Gradient Descent and Stochastic Gradient Descend algorithms, algorithms based on Markov Chain Monte Carlo Method, algorithms broadly defined as Approximate Message Passing iterations, and algorithms based on constructing low-degree polynomials. The PI in particular intends to investigate the validity of a bold conjecture stating that for most, if not all of the known models exhibiting apparent algorithmic barriers, the onset of this barrier coincides with the onset of the OGP. The PI intends to investigate this conjecture in the context of several widely studied modern models of high dimensional statistics and machine learning fields, including the Stochastic Block Model, the Spiked Tensor Model, and Wide Neural Networks model. All of these models are known to exhibit an apparent algorithmic hardness in some parameter regimes and thus these models offer a valuable framework for investigating the validity of the aforementioned conjecture, as well as algorithmic intractability implications.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
使用统计和机器学习方法从数据中提取知识通常涉及计算,这些计算与数据集大小不太缩小。这是由分析大规模统计模型的必要性决定的,在该模型中,由于我们前所未有的累积大量量的能力,数据的规模始终增加。通常,这会导致模型,其中参数数量远远超过了收集的数据的数量,从而使许多经典的推理模型过时耗时。因此,大量数据带来的价值是为了开发能够应对维度诅咒的完全新颖的计算工具的必要性。尽管有大量文献致力于设计高维统计模型中的有效计算方法,但发现许多算法遇到了某种计算障碍,除此之外,似乎只有蛮力,因此计算效率较高的算法才能成功。关于在此障碍之上产生的基本计算局限性,该障碍的理论与计算差距众所周知。这个障碍的起源是什么?它确实与算法上棘手的问题的发作相对应,还是只是在设计更快的算法方面变得更加聪明?该项目还为研究生提供了研究培训机会。在本项目中,PI开发了一种完全新颖的方法,用于理解在高维统计模型中产生的基本计算障碍。该方法基于统计物理领域(特别是自旋玻璃理论)得出的强大而启发性的见解。特别是,PI打算确定算法屏障的发作是由溶液空间景观中的相变引起的,这标志着基础推理问题的解决方案空间几何形状发生了巨大变化。以所谓的重叠间隙特性(OGP)形式的解决方案空间景观的几何形状变化,可以进一步使用,以排除广泛的算法类别,作为弥合信息理论和算法差距的潜在竞争者。这些类别的算法包括基于局部改进的算法,例如梯度下降和随机梯度降低算法,基于马尔可夫链蒙特卡洛方法的算法,算法广泛定义为近似消息传递和算法,以及基于基于低维多重元素的构建算法。 PI尤其打算研究一个大胆的猜想的有效性,表明对于大多数(如果不是全部)表现出明显的算法屏障的模型,该屏障的开始与OGP的发作相吻合。 PI打算在几个广泛研究的高维统计和机器学习领域的现代模型(包括随机块模型,尖刺张量模型和广泛的神经网络模型)的情况下研究这种猜想。已知所有这些模型都在某些参数方面表现出明显的算法硬度,因此这些模型为研究上述猜想的有效性以及算法的顽固性含义提供了一个宝贵的框架,以及该奖项颁发奖项。该奖项反映了NSF的法定任务,并通过评估了范围的范围,并通过评估了范围的范围。

项目成果

期刊论文数量(8)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Computing the partition function of the Sherrington–Kirkpatrick model is hard on average
平均而言,计算 Sherrington–Kirkpatrick 模型的配分函数很困难
  • DOI:
    10.1214/20-aap1625
  • 发表时间:
    2021
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Gamarnik, David;Kızıldağ, Eren C.
  • 通讯作者:
    Kızıldağ, Eren C.
Algorithms and Barriers in the Symmetric Binary Perceptron Model
The overlap gap property and approximate message passing algorithms for $p$-spin models
$p$-spin 模型的重叠间隙属性和近似消息传递算法
  • DOI:
    10.1214/20-aop1448
  • 发表时间:
    2021
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Gamarnik, David;Jagannath, Aukosh
  • 通讯作者:
    Jagannath, Aukosh
Performance and limitations of the QAOA at constant levels on large sparse hypergraphs and spin glass models
Sharp Thresholds Imply Circuit Lower Bounds: from random 2-SAT to Planted Clique
尖锐的阈值意味着电路下限:从随机 2-SAT 到种植派
  • DOI:
  • 发表时间:
    2023
  • 期刊:
  • 影响因子:
    0
  • 作者:
    David Gamarnik;Elchanan Mossel;Ilias Zadik
  • 通讯作者:
    Ilias Zadik
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

David Gamarnik其他文献

Hamiltonian completions of sparse random graphs
  • DOI:
    10.1016/j.dam.2005.05.001
  • 发表时间:
    2005-11-01
  • 期刊:
  • 影响因子:
  • 作者:
    David Gamarnik;Maxim Sviridenko
  • 通讯作者:
    Maxim Sviridenko
Integrating High-Dimensional Functions Deterministically
确定性地积分高维函数
  • DOI:
    10.48550/arxiv.2402.08232
  • 发表时间:
    2024
  • 期刊:
  • 影响因子:
    0
  • 作者:
    David Gamarnik;Devin Smedira
  • 通讯作者:
    Devin Smedira
Computing the Volume of a Restricted Independent Set Polytope Deterministically
确定性计算受限独立集多面体的体积
  • DOI:
    10.48550/arxiv.2312.03906
  • 发表时间:
    2023
  • 期刊:
  • 影响因子:
    0
  • 作者:
    David Gamarnik;Devin Smedira
  • 通讯作者:
    Devin Smedira
Stationary Points of a Shallow Neural Network with Quadratic Activations and the Global Optimality of the Gradient Descent Algorithm
具有二次激活的浅层神经网络的驻点和梯度下降算法的全局最优性
  • DOI:
    10.1287/moor.2021.0082
  • 发表时间:
    2024
  • 期刊:
  • 影响因子:
    1.7
  • 作者:
    David Gamarnik;Eren C. Kizildag;Ilias Zadik
  • 通讯作者:
    Ilias Zadik

David Gamarnik的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('David Gamarnik', 18)}}的其他基金

AF: Small: Low-Degree Methods for Optimization in Random Structures. Power and Limitations
AF:小:随机结构优化的低度方法。
  • 批准号:
    2233897
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Local Algorithms for Random Networks: Power, Limitations and Applications
随机网络的局部算法:能力、限制和应用
  • 批准号:
    1335155
  • 财政年份:
    2013
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Statistical Physics Methods and Algorithmic Applications in Graphical Games and Combinatorial Optimization
统计物理方法和算法在图形游戏和组合优化中的应用
  • 批准号:
    1031332
  • 财政年份:
    2010
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Stochastic Networks in the Heavy Traffic Regime: Algorithms, Approximations and Applications
高流量情况下的随机网络:算法、近似值和应用
  • 批准号:
    0726733
  • 财政年份:
    2007
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant

相似国自然基金

大尺寸多几何体统计最优近场声全息及声隐身性能评估
  • 批准号:
    51775407
  • 批准年份:
    2017
  • 资助金额:
    60.0 万元
  • 项目类别:
    面上项目
多尺度、非线性电大尺寸电磁工程问题分析的关键技术研究
  • 批准号:
    61571022
  • 批准年份:
    2015
  • 资助金额:
    67.0 万元
  • 项目类别:
    面上项目
混凝土Weibull统计尺寸效应理论模型改进研究
  • 批准号:
    51408127
  • 批准年份:
    2014
  • 资助金额:
    25.0 万元
  • 项目类别:
    青年科学基金项目
拓扑结构高分子在体积排除色谱中的分离:理论模型的建立及淋出曲线的预测
  • 批准号:
    21204061
  • 批准年份:
    2012
  • 资助金额:
    27.0 万元
  • 项目类别:
    青年科学基金项目
混凝土两种尺寸效应对损伤演化诱致灾变破坏的影响研究
  • 批准号:
    11072210
  • 批准年份:
    2010
  • 资助金额:
    36.0 万元
  • 项目类别:
    面上项目

相似海外基金

CAREER: Towards Tight Guarantees of Markov Chain Sampling Algorithms in High Dimensional Statistical Inference
职业:高维统计推断中马尔可夫链采样算法的严格保证
  • 批准号:
    2237322
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
    Continuing Grant
Scalable Computational Methods for Genealogical Inference: from species level to single cells
用于谱系推断的可扩展计算方法:从物种水平到单细胞
  • 批准号:
    10889303
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
Bayesian Modeling and Inference for High-Dimensional Disease Mapping and Boundary Detection"
用于高维疾病绘图和边界检测的贝叶斯建模和推理”
  • 批准号:
    10568797
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
CAREER: Computer-Intensive Statistical Inference on High-Dimensional and Massive Data: From Theoretical Foundations to Practical Computations
职业:高维海量数据的计算机密集统计推断:从理论基础到实际计算
  • 批准号:
    2347760
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
    Continuing Grant
Statistical learning and causal inference in high-dimensional genomics data across multiple information layers
跨多个信息层的高维基因组数据的统计学习和因果推理
  • 批准号:
    DGECR-2022-00445
  • 财政年份:
    2022
  • 资助金额:
    $ 20万
  • 项目类别:
    Discovery Launch Supplement
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了