Statistical Methods for Data Integration and Applications to Genome-wide Association Studies
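Basic information
- Award number: 10889298
- Principal investigator:
- Amount: $290,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords: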
Project abstract
Large-scale epidemiologic studies, including biobanks and genome-wide association studies
(GWAS), are now rapidly leading to the identification of novel risk factors for complex diseases.
There is now increasing opportunity to develop comprehensive models for disease risk
incorporating genetic markers, other biomarkers, lifestyle factors and sociodemographic
indicators. There are, however, major challenges, as information on all of the potential risk
factors is often not available in a single adequately large study. Instead, information may be
available from different studies, each of which may include some subsets of the desired
variables. Further, because of logistical and privacy concerns with individual level data, only
summary-level information, i.e., estimates of model parameters, may be available from some
studies. We propose to develop a series of novel statistical methods that will allow data
integration across disparate datasets to tackle modern problems faced in genetics and more
broadly, observational epidemiologic studies. In Aim 1, we will develop a general framework for
building logistic regression models using detailed covariate data from a main study, while
incorporating summary-statistics information from an external study. We will develop a series of
applications of this framework to GWAS where we will use covariate data, including high-
throughput biomarkers, from biobanks and perform combined analysis with external summary-
statistics data for powerful exploration of effect modification and mediation of genetic
associations by covariates. In Aim 2, we will extend the framework of Aim 1 to
models with high-dimensional covariates and regularized parameter estimates. We
will develop applications of the proposed method to fine-mapping and polygenic risk score
analysis conditional on covariates. In Aim 3, we will further develop multiple novel applications
of the data integration framework to account for different accuracy/depth of disease outcome
data across different studies. We will illustrate the application of the proposed methods to risk
modeling of multiple cancers (breast, melanoma and lung), two cardiometabolic traits (type 2
diabetes and coronary artery disease) and a psychiatric disorder (major depressive disorder)
using individual-level data from the UK Biobank study and the Breast Cancer Association
Consortium, together with external GWAS summary statistics. We will develop and freely
distribute user-friendly software.
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available
Data last updated: 2024-06-01
Other grants by Nilanjan Chatterj...
Multifactoral breast cancer risk prediction accounting for ethnic and tumor diversity
- Award number: 10609504
- Fiscal year: 2020
- Amount: $290,000
- Project category:
Multifactoral breast cancer risk prediction accounting for ethnic and tumor diversity
- Award number: 10416066
- Fiscal year: 2020
- Amount: $290,000
- Project category:
Multifactoral breast cancer risk prediction accounting for ethnic and tumor diversity
- Award number: 10263893
- Fiscal year: 2020
- Amount: $290,000
- Project category:
Robust Methods for Polygenic Analysis to Inform Disease Etiology and Enhance Risk Prediction
- Award number: 9920753
- Fiscal year: 2019
- Amount: $290,000
- Project category:
Robust Methods for Polygenic Analysis to Inform Disease Etiology and Enhance Risk Prediction
- Award number: 10359748
- Fiscal year: 2019
- Amount: $290,000
- Project category:
Robust Methods for Polygenic Analysis to Inform Disease Etiology and Enhance Risk Prediction
- Award number: 10112944
- Fiscal year: 2019
- Amount: $290,000
- Project category:
Robust Methods for Polygenic Analysis to Inform Disease Etiology and Enhance Risk Prediction
- Award number: 10579942
- Fiscal year: 2019
- Amount: $290,000
- Project category:
Similar grants from the National Natural Science Foundation of China (NSFC)
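Convex relaxation and high- and low-order accelerated algorithms for distributed nonconvex nonsmooth optimization problems
- Award number: 12371308
- Approval year: 2023
- Amount: CNY 435,000
- Project category: General Program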
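Design and hardware implementation of ensemble learning algorithms under resource constraints
- Award number: 62372198
- Approval year: 2023
- Amount: CNY 500,000
- Project category: General Program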
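Fast algorithms for electromagnetic fields based on physics-informed neural networks
- Award number: 52377005
- Approval year: 2023
- Amount: CNY 520,000
- Project category: General Program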
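SPH model and efficient algorithms for deformation and flow of saturated sand considering pile-soil-water coupling effects
- Award number: 12302257
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund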
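Ensemble classification algorithms for high-dimensional imbalanced data
- Award number: 62306119
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund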
Similar overseas grants
Shape-based personalized AT(N) imaging markers of Alzheimer's disease
- Award number: 10667903
- Fiscal year: 2023
- Amount: $290,000
- Project category:
NeuroMAP Phase II - Data Management and Statistics Core
- Award number: 10711138
- Fiscal year: 2023
- Amount: $290,000
- Project category:
An Integrated Biomarker Approach to Personalized, Adaptive Deep Brain Stimulation in Parkinson Disease
- Award number: 10571952
- Fiscal year: 2023
- Amount: $290,000
- Project category:
The microbiome associated with oral Leukoplakia: A multi-omics mechanistic study
- Award number: 10870268
- Fiscal year: 2023
- Amount: $290,000
- Project category:
Developing a novel EEG-based index for evaluating amyloid and tau burden in Alzheimer's Disease
- Award number: 10602059
- Fiscal year: 2023
- Amount: $290,000
- Project category: