Theory and practice for exploiting the underlying structure of probability models in big data analysis
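Basic information
- Award number: 1622490
- Principal investigator:
- Amount: $250,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-08-01 to 2019-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract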
Ever-increasing use of data-intensive methods in scientific discoveries has led to a paradigm shift in science in recent years. High-throughput scientific experiments, routine use of digital sensors, and intensive computer simulations have created a data deluge, imposing new challenges on scientific communities to find effective and computationally feasible methods for processing and analyzing very large datasets. Despite many attempts, however, the necessary development of theoretical and computational foundations for big data analysis is lagging far behind. Many existing statistical methods are not capable of handling such data-intensive problems in terms of theoretical foundation as well as computational complexity and scalability. For analyzing high-dimensional data with possibly complex structures, this research will offer a set of fundamental solutions using principled statistical methods. The resulting methods will provide a robust framework for big data analysis and allow scientists to use statistical models beyond their current limited applicability. The techniques developed in this project are likely to gain widespread acceptance across a broad spectrum of scientific disciplines, as well as in industry.
The focus of this research is mainly on Bayesian statistics. Many recent methods aim to improve computational efficiency of Bayesian models by approximating the likelihood function using a small subset of data. In contrast, the objective of this research is to explore underlying structures of probability models and exploit these features to design efficient and scalable computational methods and algorithms for Bayesian inference in big data analysis.
To this end, (1) the PIs will define and study the structure of probability distributions in order to develop novel geometrically motivated methods for statistical inference; (2) the PIs will develop efficient and scalable computational methods that accurately approximate probability distributions by exploiting their geometric properties; (3) finally, the PIs will apply these methods to real computationally-intensive problems from biological sciences. Due to its interdisciplinary nature, this research is expected to contribute to several fields, including statistics, machine learning, applied mathematics, and data-intensive computing.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Babak Shahbaba
MP33-06 COMBINED URINE AND PLASMA BIOMARKERS ARE HIGHLY ACCURATE FOR PREDICTING HIGH GRADE PROSTATE CANCER
- DOI: 10.1016/j.juro.2017.02.1002
- Published: 2017-04-01
- Journal:
- Impact factor:
- Authors: Maher Albitar; Wanlong Ma; Lars Lund; Babak Shahbaba; Edward Uchio; Soren Feddersen; Donald Moylan; Kirk Wojno; Neal Shore
- Corresponding author: Neal Shore
Other grants by Babak Shahbaba
Collaborative Research: HDR DSC: Data Science Training and Practices: Preparing a Diverse Workforce via Academic and Industrial Partnership
- Award number: 2123366
- Fiscal year: 2021
- Amount: $250,000
- Project type: Continuing Grant
MODULUS: Data-Driven Mechanistic Modeling of Hierarchical Tissues
- Award number: 1936833
- Fiscal year: 2019
- Amount: $250,000
- Project type: Standard Grant
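Similar grants from the National Natural Science Foundation of China
Research on the evolution mechanism of teachers' practical knowledge based on cognitive process mining
- Award number: 62307017
- Approval year: 2023
- Amount: 200,000 CNY
- Project type: Young Scientists Fund
Regulation and practice of community livelihood space in Yunnan nature reserves: a human-land system adaptability perspective
- Award number: 42361037
- Approval year: 2023
- Amount: 320,000 CNY
- Project type: Regional Science Fund
Research on women's spaces and women's leisure practices from a digital geography perspective
- Award number: 42371241
- Approval year: 2023
- Amount: 460,000 CNY
- Project type: General Program
Research on breaking administrative monopoly, building a unified national market, and corporate financial behavior: a perspective based on policy review and law enforcement practice
- Award number: 72302086
- Approval year: 2023
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
"Immobile" everyday practice and identity "encounters": research on the regularity and predictability of public tourism boycott behavior
- Award number: 42301260
- Approval year: 2023
- Amount: 300,000 CNY
- Project type: Young Scientists Fund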
Similar overseas grants
A Study of Legal Theory About Water Trade System: Theory and Practice for Transferable Water Rights.
- Award number: 26380139
- Fiscal year: 2014
- Amount: $250,000
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Exploiting Antenna Capabilities in Wireless Mesh Networks: Theory, Protocols, and Practice
- Award number: 1441638
- Fiscal year: 2014
- Amount: $250,000
- Project type: Standard Grant
CAREER: Exploiting Antenna Capabilities in Wireless Mesh Networks: Theory, Protocols, and Practice
- Award number: 0747206
- Fiscal year: 2008
- Amount: $250,000
- Project type: Standard Grant
Sustainable Land Use and Agricultural Environmental Policy in WTO scheme : Theory and Practice.
- Award number: 20580239
- Fiscal year: 2008
- Amount: $250,000
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Exploiting Tomography in Network-Aware Protocols: Theory and Practice
- Award number: 0238294
- Fiscal year: 2003
- Amount: $250,000
- Project type: Standard Grant