Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
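Basic information
- Grant number: RGPIN-2019-06323
- Principal investigator:
- Amount: $11,700
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Start and end dates: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract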
Multivariate data originating from, for instance, biostatistical, meteorological, engineering or astronomical studies are becoming more challenging to mine in light of their increasing complexity and size. This research proposal advocates efficient methodologies that are principally based on joint sample moments and are independent of the sample size, as they are ideally suited for analyzing Big Data. Such techniques also mitigate the curse of dimensionality. The distributional representations resulting from generalizations of widely utilized models are expressed in functional forms that allow for interpretability, lend themselves to algebraic manipulations and give rise to highly flexible copulae, which describe the dependence between variables of interest. Being remarkably versatile, such models should find applications in reliability theory and quality assurance testing.
The results will be adapted to the context of regression with a view to discarding uninformative variables and eliciting relevant patterns and relationships between the significant ones. In addition, both novel and established multivariate methodologies, such as hierarchical clustering analysis, and data visualization techniques, such as scatterplot matrices, will be brought to bear in neuroimaging, for assessing the dissimilarities between vectors of responses associated with certain stimuli, and in environmetrics, for detecting trends in the face of climatic changes. They should also enhance the understanding of the underlying processes and, for instance, lead to advances in predictive analytics in connection with the occurrence of catastrophic events such as floods and earthquakes. The source code and software documentation developed to implement the planned distributional advances will be made available.
Various approaches will be devised to extract pertinent distributional information from relatively small subsets of large-scale data sets. When used in conjunction with innovative data reduction and variable selection techniques, the modeling methodologies advocated herein will make it possible to process more rapidly the massive spatio-temporal and higher-dimensional data sets that frequently arrive in streams, as in high-throughput cancer screening and DNA sequencing, the burgeoning blockchain technologies, metadata analyses, and the fast-expanding field of artificial intelligence, which is at the core of autonomous and interactive systems such as self-driving vehicles.
By addressing both volume and velocity in connection with the analysis of massive and complex streaming data, the proposed generalized models and innovative moment-based methodologies herald a paradigmatic shift in the processing of large-scale multivariate observations.
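As a rough illustration of why methods built on joint sample moments can be independent of the sample size, the sketch below accumulates first- and second-order joint moments over a stream in chunks, so that memory use depends only on the dimension. It is not the methodology of the proposal, which is not specified in this record; the class name `StreamingJointMoments` and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


class StreamingJointMoments:
    """Hypothetical helper: accumulate joint raw moments up to order 2."""

    def __init__(self, dim):
        self.n = 0                         # number of observations seen so far
        self.sum1 = np.zeros(dim)          # running sum of x
        self.sum2 = np.zeros((dim, dim))   # running sum of outer products x x^T

    def update(self, chunk):
        # Accept one observation (1-D) or a block of observations (2-D, rows).
        chunk = np.atleast_2d(np.asarray(chunk, dtype=float))
        self.n += chunk.shape[0]
        self.sum1 += chunk.sum(axis=0)
        self.sum2 += chunk.T @ chunk

    def mean(self):
        return self.sum1 / self.n

    def covariance(self):
        # Unbiased sample covariance recovered from the raw moments.
        m = self.mean()
        return (self.sum2 - self.n * np.outer(m, m)) / (self.n - 1)


# Simulated stream: 10,000 three-dimensional observations in 50 chunks.
rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 3))

acc = StreamingJointMoments(dim=3)
for chunk in np.array_split(data, 50):
    acc.update(chunk)

# The streamed estimate agrees with the batch computation on the full data.
assert np.allclose(acc.covariance(), np.cov(data, rowvar=False))
```

The accumulator stores only a vector and a matrix whose sizes are fixed by the dimension, however many observations arrive; higher-order joint moments could be tracked in the same way at the cost of larger arrays.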
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants held by Provost, Serge
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2022
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2021
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
- Grant number: RGPIN-2019-06323
- Fiscal year: 2019
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2018
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2017
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2016
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2015
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
- Grant number: RGPIN-2014-05193
- Fiscal year: 2014
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
- Grant number: 8666-2009
- Fiscal year: 2013
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
- Grant number: 8666-2009
- Fiscal year: 2012
- Funding amount: $11,700
- Project category: Discovery Grants Program - Individual
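Similar grants from the National Natural Science Foundation of China (NSFC)
Research on image data fitting problems integrating geometric modeling and machine learning
- Grant number:
- Year approved: 2022
- Funding amount: 540,000 yuan
- Project category: General Program
Operational decision-making of manufacturing enterprises under capacity sharing: the perspective of information sharing and data quality
- Grant number: 72271252
- Year approved: 2022
- Funding amount: 440,000 yuan
- Project category: General Program
Theory and applications of self-supporting surface construction based on spatial extension
- Grant number: 61802228
- Year approved: 2018
- Funding amount: 270,000 yuan
- Project category: Young Scientists Fund
Theory and methods of constructive approximation driven by data geometry and divided-difference features
- Grant number: 11771453
- Year approved: 2017
- Funding amount: 480,000 yuan
- Project category: General Program
Interpolation algorithms for high-quality machining of complex surfaces
- Grant number: 61602074
- Year approved: 2016
- Funding amount: 220,000 yuan
- Project category: Young Scientists Fund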
Similar overseas grants
Time series clustering to identify and translate time-varying multipollutant exposures for health studies
- Grant number: 10749341
- Fiscal year: 2024
- Funding amount: $11,700
- Project category:
Network Canvas 2.0: Enhancing network data capture for drug use and HIV research
- Grant number: 10715902
- Fiscal year: 2023
- Funding amount: $11,700
- Project category:
Predictive modeling of mammalian cell fate transitions over time and space with single-cell genomics
- Grant number: 10572855
- Fiscal year: 2023
- Funding amount: $11,700
- Project category:
Understanding Risk Heterogeneity Following Child Maltreatment: An Integrative Data Analysis Approach.
- Grant number: 10721233
- Fiscal year: 2023
- Funding amount: $11,700
- Project category:
Risk stratifying indeterminate pulmonary nodules with jointly learned features from longitudinal radiologic and clinical big data
- Grant number: 10678264
- Fiscal year: 2023
- Funding amount: $11,700
- Project category: