Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
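Basic information
- Grant number: RGPIN-2014-05193
- Principal investigator:
- Amount: $8,000
- Host institution:
- Host institution country: Canada
- Program type: Discovery Grants Program - Individual
- Fiscal year: 2016
- Funding country: Canada
- Duration: 2016-01-01 to 2017-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract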
High-throughput data, such as those arising in biostatistics (especially in connection with gene expression studies) and in the environmental sciences (for instance, meteorological data transmitted from satellites), must nowadays be analyzed rapidly. Innovative techniques such as those based on sample moments, which the applicant has previously advocated, or those relying on the Bayesian approach, which are discussed in several of his papers, shall be further developed and adapted to mine such massive sets of observations. Since complex data frequently involve several variables, I also plan to extend to the multivariate context the semi-parametric univariate moment-based density estimation techniques that I have introduced. Novel multivariate data visualization techniques suited to certain types of large data sets shall be proposed as well. Extant distributional results on singular quadratic forms in Gaussian and elliptically contoured vectors shall be extended to the Hermitian case and to generalized quadratic expressions, which involve random matrices in lieu of random vectors. The bivariate density estimation technique introduced by the applicant at the last annual meeting of The International Environmetrics Society, which consists in expressing a joint density estimate as the product of the density estimates of the marginal distributions and a polynomial adjustment whose coefficients are determined by moment matching, will be extended to multivariate settings. Once evaluated at the inverse distribution functions of the marginals, such a polynomial turns out to be a copula density. This approach arguably gives rise to the most flexible type of copulae one could devise. This methodology shall be applied to colossal data sets arising from various fields of scientific investigation such as environmetrics, financial modeling, econometrics and genomic studies.
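The bivariate construction described above can be written out as follows. This is only a sketch of one plausible reading of the abstract, not the applicant's published formulation: the polynomial degree d, the coefficient symbol ξ, and the moment notation are all assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the abstract): d is the polynomial degree,
% \hat f_X and \hat f_Y are the marginal density estimates with distribution
% functions \hat F_X and \hat F_Y, and (x_l, y_l), l = 1, ..., n, are the data.
\begin{align*}
\hat f(x, y) &= \hat f_X(x)\,\hat f_Y(y)\,p(x, y),
  \qquad p(x, y) = \sum_{i=0}^{d}\sum_{j=0}^{d} \xi_{ij}\, x^{i} y^{j}, \\
\sum_{i=0}^{d}\sum_{j=0}^{d} \xi_{ij}\, m^{X}_{a+i}\, m^{Y}_{b+j}
  &= \frac{1}{n}\sum_{l=1}^{n} x_l^{a}\, y_l^{b},
  \qquad a, b = 0, \dots, d, \\
m^{X}_{k} &= \int x^{k}\,\hat f_X(x)\,dx,
  \qquad m^{Y}_{k} = \int y^{k}\,\hat f_Y(y)\,dy, \\
c(u, v) &= p\bigl(\hat F_X^{-1}(u),\, \hat F_Y^{-1}(v)\bigr),
  \qquad 0 < u, v < 1 .
\end{align*}
```

Under these assumptions the second line is a linear system of (d+1)^2 equations in the (d+1)^2 unknowns ξ_ij, obtained by integrating x^a y^b against the estimate in the first line and equating the result to the joint sample moment. The last line is the polynomial evaluated at the inverse marginal distribution functions, which is the quantity the abstract identifies as a copula density.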
Being merely based on a finite number of joint sample moments, such techniques should prove more suitable than, for instance, kernel density estimates for modeling series of observations that can be construed as "big data", as they readily produce density estimates in a functional form that lends itself to algebraic manipulations. Given their computational simplicity, moment-based data mining methods ought to efficiently assist researchers in detecting anomalies, patterns and dependencies in large and complex data sets. I also intend to develop software documentation and source code to facilitate the implementation of the aforementioned distributional methodologies. Additionally, monographs on the evaluation of the distribution of various types of quadratic forms and on moment-based density estimation and approximation techniques are planned.
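To illustrate why a density estimate built from a finite number of sample moments is cheap to compute and comes out in closed functional form, here is a minimal univariate sketch. It assumes a uniform base density on the rescaled sample range and a polynomial adjustment of fixed degree; the function name `moment_density` and both of these choices are hypothetical and are not taken from the proposal.

```python
import numpy as np

def moment_density(sample, degree=4):
    """Polynomial density estimate whose first `degree` moments match the
    sample moments. Assumes a uniform base density on the rescaled sample
    range (an illustrative choice, not the applicant's method)."""
    x = np.asarray(sample, dtype=float)
    lo, hi = x.min(), x.max()
    u = (x - lo) / (hi - lo)                     # rescale data to [0, 1]
    k = np.arange(degree + 1)
    mu = np.array([np.mean(u ** h) for h in k])  # sample moments, mu[0] == 1
    # Moments of the uniform base density on [0, 1]: m_{h+j} = 1 / (h + j + 1)
    M = 1.0 / (k[:, None] + k[None, :] + 1.0)
    xi = np.linalg.solve(M, mu)                  # moment-matching coefficients

    def pdf(t):
        s = (np.asarray(t, dtype=float) - lo) / (hi - lo)
        inside = (s >= 0) & (s <= 1)
        # polyval expects the highest-degree coefficient first
        return np.where(inside, np.polyval(xi[::-1], s) / (hi - lo), 0.0)

    return pdf, xi

rng = np.random.default_rng(0)
data = rng.beta(2.0, 5.0, size=5000)
pdf, xi = moment_density(data, degree=4)
```

The whole data set enters only through `degree + 1` sample moments, so the cost of fitting does not grow beyond a single pass over the observations. With a uniform base density the matrix `M` is a Hilbert matrix, which becomes badly conditioned as the degree increases, so this sketch is only sensible for low degrees. Nothing in it forces the polynomial to be nonnegative, so the estimate can dip below zero in the tails.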
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Provost, Serge
Other grants by Provost, Serge
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2022
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2021
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2020
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2019
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2018
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2017
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2015
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2014
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
- Grant number: 8666-2009
- Fiscal year: 2013
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
- Grant number: 8666-2009
- Fiscal year: 2012
- Amount: $8,000
- Program type: Discovery Grants Program - Individual
Similar grants from the National Natural Science Foundation of China
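Intelligent geometric modeling methods oriented toward functional analysis
- Grant number: 92270205
- Approval year: 2022
- Amount: 3,000,000 yuan
- Program type: Major Research Plan
Research on multi-feature adaptive shape analysis methods for 3D models based on weakly supervised deep learning
- Grant number: 61872321
- Approval year: 2018
- Amount: 550,000 yuan
- Program type: General Program
Research on isogeometric finite block theory, methods and techniques for complex product models
- Grant number: 51705158
- Approval year: 2017
- Amount: 250,000 yuan
- Program type: Young Scientists Fund
Geometric feature analysis of complex texture images based on rational fractal theory and its applications
- Grant number: 61672018
- Approval year: 2016
- Amount: 500,000 yuan
- Program type: General Program
Research on the theory and applications of T-spline solid modeling for CAD/CAE
- Grant number: 61572056
- Approval year: 2015
- Amount: 650,000 yuan
- Program type: General Program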
Similar overseas grants
Probabilistic deep learning models and integrated biological experiments for analyzing dynamic and heterogeneous microbiomes
- Grant number: 10622713
- Fiscal year: 2023
- Amount: $8,000
Computational Methods for Analyzing Immunoglobulin Allelic Diversity in B cells
- Grant number: 10751541
- Fiscal year: 2023
- Amount: $8,000
A Scalable Platform for Exploring and Analyzing Whole Brain Tissue Cleared Images
- Grant number: 10463036
- Fiscal year: 2019
- Amount: $8,000
A Scalable Platform for Exploring and Analyzing Whole Brain Tissue Cleared Images
- Grant number: 10370398
- Fiscal year: 2019
- Amount: $8,000
A Scalable Platform for Exploring and Analyzing Whole Brain Tissue Cleared Images
- Grant number: 10582669
- Fiscal year: 2019
- Amount: $8,000