CAREER: Computational and Statistical Tradeoffs in Massive Data Analysis
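Basic information
- Grant number: 1350590
- Principal investigator:
- Amount: $475,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2014
- Funding country: United States
- Project period: 2014-02-01 to 2020-01-31
- Project status: Completed
- Source:
- Keywords: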
Project abstract
In modern signal processing, one is frequently faced with statistical inference problems involving massive datasets. For example, the experiments at the Large Hadron Collider at CERN generate hundreds of petabytes of data each year, which must be stored and processed efficiently in order to further our understanding of particle physics. Similar challenges also arise in seismic monitoring, where massive amounts of data are acquired over large areas via cellphone accelerometers. Analyzing such large datasets is usually viewed as a substantial computational challenge. However, if data are a signal processor's main resource, then access to more data should be viewed as an asset rather than as a burden, and larger datasets should lead to a reduction in the runtime of data analysis algorithms.
This project blends concepts from computer science and from statistical signal processing to address the challenge of massive datasets by developing "algorithm weakening" frameworks in which a data analysis procedure backs off to simpler methods as the data scale in size, leveraging the growing inferential strength of the data to ensure that a desired level of statistical accuracy is achieved with reduced runtime. The approach is concretely illustrated across a range of statistical estimation tasks, with convex relaxation techniques playing a prominent role as an algorithm weakening mechanism. In seeking a precise characterization of the computational and statistical tradeoffs obtained via convex relaxation, the investigator formalizes and studies new measures for characterizing the quality of approximation of one convex set by another. An interesting feature of this research is that convex relaxations which provide poor performance in combinatorial optimization problems may nonetheless yield useful solutions when employed in problems with inferential objectives.
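As a rough illustration of the algorithm-weakening idea in the abstract, the toy sketch below is not taken from the project itself. It assumes a simple denoising setup: an unknown 1-sparse signal is estimated by projecting the mean of n noisy observations onto a convex set. The l1 ball is the tighter set and its projection requires a sort, while the l2 ball of the same radius contains it (a weaker relaxation) and its projection is a single rescaling. All names (`project_l1_ball`, `project_l2_ball`, `estimate_risk`) and parameter values are hypothetical choices made for this example.

```python
import math
import random


def project_l2_ball(v, radius=1.0):
    """Euclidean projection onto the l2 ball: one rescaling, O(p)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= radius:
        return list(v)
    return [x * radius / norm for x in v]


def project_l1_ball(v, radius=1.0):
    """Euclidean projection onto the l1 ball via sorting, O(p log p)."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    mags = sorted((abs(x) for x in v), reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, m in enumerate(mags, start=1):
        cumulative += m
        candidate = (cumulative - radius) / k
        if m > candidate:
            theta = candidate  # keep the threshold from the largest valid k
    # Soft-threshold every coordinate by theta, preserving signs.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


def estimate_risk(project, n, p=100, trials=50, seed=0):
    """Monte Carlo estimate of E||project(sample mean) - truth||^2."""
    rng = random.Random(seed)
    # Assumed ground truth: a 1-sparse signal on the boundary of both balls.
    truth = [1.0] + [0.0] * (p - 1)
    # The mean of n unit-variance Gaussian samples has standard deviation
    # 1/sqrt(n) per coordinate, so it is drawn directly.
    sigma = 1.0 / math.sqrt(n)
    total = 0.0
    for _ in range(trials):
        mean = [t + rng.gauss(0.0, sigma) for t in truth]
        estimate = project(mean)
        total += sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return total / trials


if __name__ == "__main__":
    for n in (20, 200, 2000):
        risk_l1 = estimate_risk(project_l1_ball, n)
        risk_l2 = estimate_risk(project_l2_ball, n)
        print(f"n={n:5d}  l1 risk={risk_l1:.3f}  l2 risk={risk_l2:.3f}")
```

In this toy setting the cheaper l2 projection is less accurate than the l1 projection at the same sample size, but it can match or beat the accuracy the l1 projection achieved with few samples once n is large enough. That is the kind of time-data tradeoff the abstract describes, although the relaxations studied in the project are more general than this pair.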
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available
Data last updated: 2024-06-01
Other grants by Venkat Chandraseka...
Learning Algorithms for Inverse Problems from Data: Statistical and Computational Foundations
- Grant number: 2113724
- Fiscal year: 2021
- Funding amount: $475,000
- Project type: Standard Grant
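Similar NSFC grants
The 21st National Conference on Condensed Matter Theory and Statistical Physics
- Grant number: 12342018
- Approval year: 2023
- Funding amount: CNY 80,000
- Project type: Special Program
Efficient Computation for High-Dimensional Statistical Models
- Grant number: 12301389
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Unsupervised Statistical Models for Data Heterogeneity in Decentralized Distributed Computing
- Grant number: 12301336
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Performance Measurement Methods for Computational Software on Heterogeneous Systems
- Grant number: 62372428
- Approval year: 2023
- Funding amount: CNY 500,000
- Project type: General Program
Statistical Problems in Computer Experiments with Distributional Inputs
- Grant number: 12301320
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund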
Similar overseas grants
Statistical Methods for Whole-Brain Dynamic Connectivity Analysis
- Grant number: 10594266
- Fiscal year: 2023
- Funding amount: $475,000
- Project type:
Acquiring cognitive maps: how brains learn hidden structure
- Grant number: 10739622
- Fiscal year: 2023
- Funding amount: $475,000
- Project type:
Neurodevelopment of executive function, appetite regulation, and obesity in children and adolescents
- Grant number: 10643633
- Fiscal year: 2023
- Funding amount: $475,000
- Project type:
Bringing the clinic to the lab: the effects of forced and non-forced rehabilitation on functional recovery after spinal cord injury
- Grant number: 10641259
- Fiscal year: 2023
- Funding amount: $475,000
- Project type:
Comprehensive Pediatric Phenotyping for Evidence-Based Diagnosis in Genetic Disease
- Grant number: 10644205
- Fiscal year: 2023
- Funding amount: $475,000
- Project type: