Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values

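Basic information

  • Grant number:
    RGPIN-2018-03819
  • Principal investigator:
  • Amount:
    $32,800
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Project period:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed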

Project abstract

With the advancement of modern data-acquisition technology, data with diverse features are becoming more accessible than ever before. The increasing complexity of data structures and the large dimension of data have created an urgent need for novel and flexible modeling and analysis tools. While many complex features may be present in different applications, this research focuses on two prevailing issues in modern data: the quality and the dimensionality of data. I plan to explore important problems in the following areas.

(1) High dimensional data with measurement error and missing values

In the era of Big Data, large scale data are often available in which the dimension of the variables is much larger than the number of subjects in the study. This presents a great challenge to traditional statistical methods, which normally require the sample size to be larger than the dimension of the variables. In addition, we face challenges related to data quality: measurement imprecision and missing observations. This research aims to investigate problems concerning high dimensionality, measurement error, and missing observations. The plan is to examine how measurement error and missing values interact in the analysis of high dimensional data. The objective is to develop valid inference methods for data in which all of these features are present. Applications of the developed methods to survival data, image data and longitudinal data are planned.

(2) Causal inference with complex featured data

Causal inference, as opposed to the study of association, is often the focus of empirical research. While many methods are available for various settings, they are vulnerable to poor quality data. Most existing methods require that the data be "perfect" in the sense that neither missing observations nor measurement error is present, but these assumptions are often violated in practice. Measurement error and missing observations have been a long-standing concern in many fields, including epidemiological, nutrition and environmental studies. However, research on causal inference with these features is limited, and the area remains largely unexplored. I plan to explore this area and develop new methods to address the complex effects of measurement error and/or missing observations on causal inference. Furthermore, I intend to investigate these problems for large scale data in which the dimension of potential confounders is high.

My primary goals are to develop original and innovative methodology that advances foundational work and to facilitate applications. This research is anticipated to provide valuable insights into making the best use of available large scale data and to broaden the scope of existing strategies and research. It is expected to have a significant impact on the statistical community as well as on other fields, including public health, medical studies and data science.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Yi, Grace

The Effect of Intimate Partner Violence and Probable Traumatic Brain Injury on Mental Health Outcomes for Black Women
  • DOI:
    10.1080/10926771.2019.1587657
  • Publication date:
    2019-01-01
  • Journal:
  • Impact factor:
    1.8
  • Authors:
    Cimino, Andrea N.;Yi, Grace;Stockman, Jamila K.
  • Corresponding author:
    Stockman, Jamila K.
Assessing trauma and related distress in refugee youth and their caregivers: should we be concerned about iatrogenic effects?
  • DOI:
    10.1007/s00787-020-01635-z
  • Publication date:
    2021-09
  • Journal:
  • Impact factor:
    6.4
  • Authors:
    Greene, M. Claire;Kane, Jeremy C.;Bolton, Paul;Murray, Laura K.;Wainberg, Milton L.;Yi, Grace;Sim, Amanda;Puffer, Eve;Ismael, Abdulkadir;Hall, Brian J.
  • Corresponding author:
    Hall, Brian J.

Other grants held by Yi, Grace

Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2022
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2022
  • Amount:
    $32,800
  • Program:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2021
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2021
  • Amount:
    $32,800
  • Program:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2020
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2020
  • Amount:
    $32,800
  • Program:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2020
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2019
  • Amount:
    $32,800
  • Program:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2019
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual
Statistical Methods on Challenging Issues of Biosciences
  • Grant number:
    239733-2013
  • Fiscal year:
    2017
  • Amount:
    $32,800
  • Program:
    Discovery Grants Program - Individual

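Similar National Natural Science Foundation of China (NSFC) grants

Research on fault monitoring and diagnosis methods for complex equipment based on multivariate statistical analysis
  • Grant number:
    62273354
  • Approval year:
    2022
  • Amount:
    CNY 540,000
  • Program:
    General Program
Analysis and statistical inference based on high-dimensional medical imaging and complex dependent survival data
  • Grant number:
    12271060
  • Approval year:
    2022
  • Amount:
    CNY 450,000
  • Program:
    General Program
Research on fault monitoring and diagnosis methods for complex equipment based on multivariate statistical analysis
  • Grant number:
  • Approval year:
    2022
  • Amount:
    CNY 540,000
  • Program:
    General Program
Analysis and statistical inference based on high-dimensional medical imaging and complex dependent survival data
  • Grant number:
    12226416
  • Approval year:
    2022
  • Amount:
    CNY 450,000
  • Program:
    General Program
Research on predicting toxic gas dispersion in complex urban environments based on statistical analysis theory
  • Grant number:
    U21A20524
  • Approval year:
    2021
  • Amount:
    CNY 2,600,000
  • Program: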

Similar overseas grants

Time series clustering to identify and translate time-varying multipollutant exposures for health studies
  • Grant number:
    10749341
  • Fiscal year:
    2024
  • Amount:
    $32,800
  • Program:
REU Site: University of North Carolina at Greensboro - Complex Data Analysis using Statistical and Machine Learning Tools
  • Grant number:
    2244160
  • Fiscal year:
    2023
  • Amount:
    $32,800
  • Program:
    Standard Grant
Bayesian genetic association analysis of all rare diseases in the Kids First cohort
  • Grant number:
    10643463
  • Fiscal year:
    2023
  • Amount:
    $32,800
  • Program:
Neuroimaging Dimensions at the Extremes of the Schizophrenia Spectrum
  • Grant number:
    10753887
  • Fiscal year:
    2023
  • Amount:
    $32,800
  • Program:
New approaches for leveraging single-cell data to identify disease-critical genes and gene sets
  • Grant number:
    10768004
  • Fiscal year:
    2023
  • Amount:
    $32,800
  • Program: