Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
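Basic information
- Grant number: RGPIN-2018-03819
- Principal investigator:
- Amount: $27,100
- Host institution:
- Host institution country: Canada
- Program category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Duration: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract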
With the advancement of modern data-acquisition technology, data with diverse features are becoming more accessible than ever before. The increasing complexity of data structures and the large dimension of data have created an urgent need for novel and flexible modeling and analysis tools. While many complex features may be present in different applications, this research focuses on two prevailing issues in modern data: the quality and the dimensionality of data. I plan to explore important problems in the following areas.
(1) High dimensional data with measurement error and missing values
In the era of Big Data, large scale data are often available where the dimension of the variables is much larger than the number of subjects in the study. This presents a great challenge to traditional statistical methods which normally require the sample size to be bigger than the dimension of the variables. In addition, we face challenges related to data quality - measurement imprecision and missing observations. This research aims to investigate problems concerning high dimensionality, measurement error, and missing observations. The plan is to examine how measurement error and missing values may interplay in the analysis of high dimensional data. The objectives are to develop valid inference methods to handle data with all these features involved. Applications of the developed methods to survival data, image data and longitudinal data are planned.
(2) Causal inference with complex featured data
As opposed to association studies, causal inference is often the focus of empirical research. While many research methods are available for various settings, they are vulnerable to poor-quality data. Most existing methods require that the data be “perfect” in the sense that neither missing observations nor measurement error is present, but these assumptions are often violated in practice. Measurement error and missing observations have been a long-standing concern in many fields, including epidemiological, nutritional and environmental studies. However, research on causal inference with these features is rather limited, and much remains unexplored. I plan to explore this exciting area and develop new methods to address the complex effects of measurement error and/or missing observations on causal inference. Furthermore, I intend to investigate these problems in the presence of large-scale data where the dimension of potential confounders is high.
My primary goals are to develop original and innovative methodology that advances foundational work and to facilitate applications. This research is anticipated to provide valuable insights into making the best use of available large-scale data and to broaden the scope of existing strategies and research. It is expected to have a significant impact on the statistical community as well as on other fields, including public health, medical studies and data science.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yi, Grace
Assessing trauma and related distress in refugee youth and their caregivers: should we be concerned about iatrogenic effects?
- DOI: 10.1007/s00787-020-01635-z
- Published: 2021-09
- Journal:
- Impact factor: 6.4
- Authors: Greene, M. Claire; Kane, Jeremy C.; Bolton, Paul; Murray, Laura K.; Wainberg, Milton L.; Yi, Grace; Sim, Amanda; Puffer, Eve; Ismael, Abdulkadir; Hall, Brian J.
- Corresponding author: Hall, Brian J.
The Effect of Intimate Partner Violence and Probable Traumatic Brain Injury on Mental Health Outcomes for Black Women
- DOI: 10.1080/10926771.2019.1587657
- Published: 2019-01-01
- Journal:
- Impact factor: 1.8
- Authors: Cimino, Andrea N.; Yi, Grace; Stockman, Jamila K.
- Corresponding author: Stockman, Jamila K.
Other grants by Yi, Grace
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
- Grant number: RGPIN-2018-03819
- Fiscal year: 2022
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
- Grant number: RGPIN-2018-03819
- Fiscal year: 2021
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
- Grant number: RGPIN-2018-03819
- Fiscal year: 2020
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
- Grant number: RGPIN-2018-03819
- Fiscal year: 2019
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
- Grant number: RGPIN-2018-03819
- Fiscal year: 2018
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
Statistical Methods on Challenging Issues of Biosciences
- Grant number: 239733-2013
- Fiscal year: 2017
- Amount: $27,100
- Program category: Discovery Grants Program - Individual
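Similar grants from the National Natural Science Foundation of China
Research on fault monitoring and diagnosis methods for complex equipment based on multivariate statistical analysis
- Grant number: 62273354
- Year approved: 2022
- Amount: CNY 540,000
- Program category: General Program
Analysis and statistical inference based on high-dimensional medical imaging and complex dependent survival data
- Grant number: 12271060
- Year approved: 2022
- Amount: CNY 450,000
- Program category: General Program
Research on fault monitoring and diagnosis methods for complex equipment based on multivariate statistical analysis
- Grant number:
- Year approved: 2022
- Amount: CNY 540,000
- Program category: General Program
Analysis and statistical inference based on high-dimensional medical imaging and complex dependent survival data
- Grant number: 12226416
- Year approved: 2022
- Amount: CNY 450,000
- Program category: General Program
Research on predicting toxic gas dispersion in complex urban environments based on statistical analysis theory
- Grant number: U21A20524
- Year approved: 2021
- Amount: CNY 2,600,000
- Program category: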
Similar overseas grants
Time series clustering to identify and translate time-varying multipollutant exposures for health studies
- Grant number: 10749341
- Fiscal year: 2024
- Amount: $27,100
- Program category:
REU Site: University of North Carolina at Greensboro - Complex Data Analysis using Statistical and Machine Learning Tools
- Grant number: 2244160
- Fiscal year: 2023
- Amount: $27,100
- Program category: Standard Grant
Novel Computational Methods for Microbiome Data Analysis in Longitudinal Study
- Grant number: 10660234
- Fiscal year: 2023
- Amount: $27,100
- Program category:
Integrative Data Science Approach to Advance Care Coordination of ADRD by Primary Care Providers
- Grant number: 10722568
- Fiscal year: 2023
- Amount: $27,100
- Program category:
Whole genome sequence interpretation for lipids to discover new genes and mechanisms for coronary artery disease
- Grant number: 10722515
- Fiscal year: 2023
- Amount: $27,100
- Program category: