NSF-BSF: AF: Small: Identifying Functional Structure in Data

NSF-BSF:AF:小:识别数据中的功能结构

基本信息

  • 批准号:
    1909972
  • 负责人:
  • 金额:
    $ 40万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Standard Grant
  • 财政年份:
    2019
  • 资助国家:
    美国
  • 起止时间:
    2019-10-01 至 2023-09-30
  • 项目状态:
    已结题

项目摘要

Algorithms for unsupervised learning tasks are in constant use in science, engineering, medical research, etc. Progress in optimization procedures for these problems has therefore the potential for very wide adoption. One of the primary research topics of this award is unsupervised learning methods for determining causal effect in scenarios where one cannot, for practical or ethical reasons, perform controlled experiments. At present, the theory for deducing causal effects is somewhat brittle, since it relies upon unrealistic accuracy in the mathematical modeling, and, for large systems, upon unrealistic sample sizes. Scientists therefore do not have very reliable ways of advising the public about "what-if" scenarios. This research aims to provide, and characterize the performance of, causal-identification algorithms with the following goals: (a) Algorithms that work under weaker assumptions than current theory makes (in particular regarding the fidelity of the model to reality), yet deliver still-useful guarantees; (b) Algorithms that work under stronger structural assumptions than current theory makes and can use smaller sample sizes as a result. A related research topic concerns approximation algorithms, and inapproximability bounds, for optimization problems such as data clustering and learning of mixture models. The investigator will train students and postdocs in these and related areas of the theory of computation.Causal-inference problems will be addressed through the framework of directed probabilistic graphical models. Existing causal-identification algorithms operate with a naive parameter size because, for N-variate graphical models, the complexity parameter (and sample complexity) is exponential in N. By contrast all algorithmic methods for clustering, for learning mixtures of product distributions, or for topic models regard the dimension (i.e., the number of variables N) as a parameter that should not appear, or at worst appear poly-logarithmically, in the exponent. This project aims to provide algorithms with such complexity; however, this is not only an algorithmic question. The existing characterization of when causal identification is even possible is stated under the assumption that one has access to full knowledge of the joint distribution on the N variables. Therefore a related, statistical but not algorithmic goal is to obtain a characterization of graphical models in which causal inference is possible only from the joint distributions of sets of variables of size small relative to N. A candidate class of such models comes from mixture models, beginning with mixtures of products and extending to mixtures of Markov models. Another question concerns graphical models in which causal identification is not possible, yet the range of causal effects consistent with the data is small. In such cases, an algorithm which bounds the causal effect is an adequate replacement for full identification; providing such guarantees is another goal of this work. Finally, research will focus on clustering with squared Euclidean costs: both on the inapproximability threshold and in improving approximation factors through linear programming relaxations or other methods.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
无监督学习任务的算法在科学、工程、医学研究等领域中不断使用。因此,这些问题的优化程序的进展具有广泛采用的潜力。该奖项的主要研究主题之一是无监督学习方法,用于确定由于实际或道德原因而无法进行受控实验的场景中的因果效应。目前,推断因果效应的理论有些脆弱,因为它依赖于数学建模中不切实际的准确性,并且对于大型系统,依赖于不切实际的样本量。因此,科学家们没有非常可靠的方法来向公众提供有关“假设”情景的建议。这项研究旨在提供因果识别算法并描述其性能,其目标如下:(a) 在比当前理论更弱的假设下工作的算法(特别是关于模型对现实的保真度),但仍然提供- 有用的保证; (b) 在比当前理论更强的结构假设下工作的算法,因此可以使用更小的样本量。一个相关的研究主题涉及数据聚类和混合模型学习等优化问题的逼近算法和不可逼近性界限。研究者将在计算理论的这些和相关领域对学生和博士后进行培训。因果推理问题将通过有向概率图模型的框架来解决。现有的因果识别算法以朴素的参数大小运行,因为对于 N 变量图形模型,复杂性参数(和样本复杂性)是 N 的指数。相比之下,所有用于聚类、学习产品分布混合或主题模型将维度(即变量的数量 N)视为不应出现在指数中的参数,或者最坏的情况下以多对数形式出现在指数中。该项目旨在提供具有如此复杂性的算法;然而,这不仅仅是一个算法问题。因果识别何时可能的现有特征是在人们能够完全了解 N 个变量的联合分布的假设下陈述的。因此,一个相关的统计而非算法目标是获得图模型的表征,其中因果推断只能从相对于 N 而言较小的变量集的联合分布中进行。此类模型的候选类别来自混合模型,从产品的混合物开始,扩展到马尔可夫模型的混合物。另一个问题涉及图形模型,其中不可能识别因果关系,但与数据一致的因果效应范围很小。在这种情况下,限制因果效应的算法足以替代完全识别;提供此类保证是这项工作的另一个目标。最后,研究将集中于具有平方欧几里德成本的聚类:无论是不可逼近性阈值还是通过线性规划松弛或其他方法改进逼近因子。该奖项反映了 NSF 的法定使命,并通过使用基金会的智力价值进行评估,被认为值得支持以及更广泛的影响审查标准。

项目成果

期刊论文数量(9)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Convergence of incentive-driven dynamics in Fisher markets
渔业市场激励驱动动态的融合
  • DOI:
    10.1016/j.geb.2020.11.005
  • 发表时间:
    2020-12
  • 期刊:
  • 影响因子:
    1.1
  • 作者:
    Dvijotham, Krishnamurthy;Rabani, Yuval;Schulman, Leonard J.
  • 通讯作者:
    Schulman, Leonard J.
Condition number bounds for causal inference
因果推理的条件数界限
A refined approximation for Euclidean k-means
欧几里得 k 均值的精确近似
  • DOI:
    10.1016/j.ipl.2022.106251
  • 发表时间:
    2022-06
  • 期刊:
  • 影响因子:
    0.5
  • 作者:
    Grandoni, Fabrizio;Ostrovsky, Rafail;Rabani, Yuval;Schulman, Leonard J.;Venkat, Rakesh
  • 通讯作者:
    Venkat, Rakesh
Condition number bounds for causal inference
因果推理的条件数界限
Hadamard Extensions and the Identification of Mixtures of Product Distributions
Hadamard 扩展和产品分布混合的识别
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Leonard Schulman其他文献

Leonard Schulman的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Leonard Schulman', 18)}}的其他基金

NSF-BSF: AF: Small: Algorithmic and Information-Theoretic Challenges in Causal Inference
NSF-BSF:AF:小:因果推理中的算法和信息论挑战
  • 批准号:
    2321079
  • 财政年份:
    2023
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
AF: Small: Algorithms and Information Theory for Causal Inference
AF:小:因果推理的算法和信息论
  • 批准号:
    1618795
  • 财政年份:
    2016
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
AF: Small: Algorithms for Inference
AF:小:推理算法
  • 批准号:
    1319745
  • 财政年份:
    2013
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
AF: EAGER: Algorithms in Linear Algebra and Optimization
AF:EAGER:线性代数和优化算法
  • 批准号:
    1038578
  • 财政年份:
    2011
  • 资助金额:
    $ 40万
  • 项目类别:
    Continuing Grant
Collaborative Research: EMT/QIS: Quantum Algorithms and Post-Quantum Cryptography
合作研究:EMT/QIS:量子算法和后量子密码学
  • 批准号:
    0829909
  • 财政年份:
    2008
  • 资助金额:
    $ 40万
  • 项目类别:
    Continuing Grant
SGER: Planning for a Cross-Cutting Initiative in Computational Discovery
SGER:规划计算发现的跨领域计划
  • 批准号:
    0652536
  • 财政年份:
    2007
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
QnTM: Collaborative Research: Quantum Algorithms
QnTM:协作研究:量子算法
  • 批准号:
    0524828
  • 财政年份:
    2005
  • 资助金额:
    $ 40万
  • 项目类别:
    Continuing Grant
Algorithms for Data Analysis
数据分析算法
  • 批准号:
    0515342
  • 财政年份:
    2005
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
CAREER: Computation Methods
职业:计算方法
  • 批准号:
    0049092
  • 财政年份:
    2000
  • 资助金额:
    $ 40万
  • 项目类别:
    Continuing Grant
CAREER: Computation Methods
职业:计算方法
  • 批准号:
    9876172
  • 财政年份:
    1999
  • 资助金额:
    $ 40万
  • 项目类别:
    Continuing Grant

相似国自然基金

枯草芽孢杆菌BSF01降解高效氯氰菊酯的种内群体感应机制研究
  • 批准号:
    31871988
  • 批准年份:
    2018
  • 资助金额:
    59.0 万元
  • 项目类别:
    面上项目
基于掺硼直拉单晶硅片的Al-BSF和PERC太阳电池光衰及其抑制的基础研究
  • 批准号:
    61774171
  • 批准年份:
    2017
  • 资助金额:
    63.0 万元
  • 项目类别:
    面上项目
B细胞刺激因子-2(BSF-2)与自身免疫病的关系
  • 批准号:
    38870708
  • 批准年份:
    1988
  • 资助金额:
    3.0 万元
  • 项目类别:
    面上项目

相似海外基金

NSF-BSF: Collaborative Research: AF: Small: Algorithmic Performance through History Independence
NSF-BSF:协作研究:AF:小型:通过历史独立性实现算法性能
  • 批准号:
    2420942
  • 财政年份:
    2024
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
NSF-BSF: AF: Small: Advancing Coding Theory Through the Lens of Pseudorandomness
NSF-BSF:AF:小:通过伪随机性的视角推进编码理论
  • 批准号:
    2231157
  • 财政年份:
    2023
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
NSF-BSF: Collaborative Research: AF: Small: Algorithmic Performance through History Independence
NSF-BSF:协作研究:AF:小型:通过历史独立性实现算法性能
  • 批准号:
    2247576
  • 财政年份:
    2023
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
NSF-BSF: AF: Small: Algorithmic and Information-Theoretic Challenges in Causal Inference
NSF-BSF:AF:小:因果推理中的算法和信息论挑战
  • 批准号:
    2321079
  • 财政年份:
    2023
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
NSF-BSF: AF: Small: Parameter-Free Stochastic Optimization via Trajectory Cues
NSF-BSF:AF:小:通过轨迹线索进行无参数随机优化
  • 批准号:
    2239527
  • 财政年份:
    2023
  • 资助金额:
    $ 40万
  • 项目类别:
    Standard Grant
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了