VARIABLE SELECTION IN GENETIC EPIDEMIOLOGICAL STUDIES OF CARDIOVASCULAR DISEASES
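Basic information
- Grant number: 7663792
- Principal investigator:
- Amount: $228,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2008
- Funding country: United States
- Project period: 2008-08-01 to 2011-07-30
- Project status: Completed
- Source: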
- Keywords: Accounting; Affect; Algorithms; American; Area; Blood Vessels; Candidate Disease Gene; Cardiovascular Diseases; Cardiovascular system; Chromosomes; Chronic Disease; Code; Collection; Complex; Consensus; Data; Data Analyses; Data Set; Databases; Developed Countries; Developing Countries; Development; Dimensions; Disease; Disease Outcome; Environment; Epidemiologic Studies; Etiology; Evaluation; Evaluation Studies; Exclusion; Factor Analysis; Functional disorder; Gene Expression; Genes; Genetic; Genomics; Goals; Haplotypes; Heart failure; Hypertension; Hypertrophy; Individual; Investigation; Knowledge; Learning; Left; Left Ventricular Hypertrophy; Linkage Disequilibrium; Machine Learning; Measures; Methods; Modeling; Molecular; Morbidity - disease rate; National Heart, Lung, and Blood Institute; Noise; Pathway interactions; Performance; Procedures; Process; Public Domains; Research; Research Design; Sea; Sequence Analysis; Single Nucleotide Polymorphism; Solutions; Source; Statistical Computing; Structure; Techniques; Testing; Variant; Ventricular; Weight; base; computer based statistical methods; computer program; computerized tools; disease phenotype; experience; gene interaction; genome wide association study; heuristics; hypertensive heart disease; improved; mortality; premature; prevent; programs; public health relevance; scale up; simulation; tool; trait; web site
Project abstract
DESCRIPTION (provided by applicant): Cardiovascular diseases (CVD) affect millions of people in the US and across the world. There is strong evidence of a genetic component in CVD and related traits. An emerging consensus is that genes, environment, and, perhaps more importantly, their interactions are responsible for this complex disease. As a result, many genetic epidemiological (GE) studies of CVD use a study design that tests hundreds of thousands of genetic predictors (e.g., single nucleotide polymorphism (SNP) markers) and hundreds of (related) disease phenotypes and environmental covariates. This has brought tremendous analytical challenges, particularly the high dimensionality of the data and the obscure interactions among the many variables. Searching for CVD genes has therefore become a task of selecting important variables from a vast number of SNPs and other predictor variables. Our real data analyses in several ongoing large-scale CVD-related studies motivated us to consider new methodological solutions to the variable selection problem. This application builds on these positive preliminary findings.
Our main idea is to develop a strategy for selecting important predictors of CVD by integrating multiple sources of information via the method of statistical learning (i.e., optimizing the selection by repeated learning from examples). In this strategy, we will first develop a method for selecting significant SNPs in moderate-dimensional data (e.g., low thousands of SNPs, as in candidate-gene studies) by an integrated classifier. The method will build upon existing techniques that assess the information in SNPs from haplotype similarity, imputed functional potential, and gene-gene interactions. We will then scale up the new method to the high-dimensional setting of genome-wide association studies (e.g., at least hundreds of thousands of SNPs) by dimension reduction that utilizes the local linkage-disequilibrium (LD) structure in SNPs, and by combining latent factor analysis of correlated CVD traits with pathway-based analysis to account for gene-environment (GxE) interactions. A fast-search algorithm will also be developed based on an existing search heuristic that was successfully applied to high-dimensional data in gene expression and genomic sequence analysis. The new methods and algorithms will be coded into R programs and distributed as a tool set for an association analysis pipeline. The new methods will be evaluated by intensive simulation studies and by applying them to existing datasets in ongoing studies of CVD and related diseases. Results from the evaluation studies, together with the ancillary databases generated by the study, such as imputed functional scores of potential or known CVD SNPs, will be distributed on a dedicated project website. By doing so, we believe that the tools resulting from the proposed research will make a significant contribution to many ongoing genetic epidemiological studies of CVD and related traits.
PUBLIC HEALTH RELEVANCE: This project is aimed at the timely development of computational tools for emerging large-scale genome-wide association studies of cardiovascular diseases (CVD), which affect millions of people in the US and across the world. The new methods deal with the analytical challenges brought forth by the high dimensionality of the data and the obscure interactions among the many variables in these studies, and the tools will be applied to ongoing studies of CVD and related diseases. The results, together with the computer programs and ancillary databases, will make a significant contribution to many ongoing and new genetic epidemiological studies of CVD and related diseases.
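The abstract names dimension reduction based on local LD structure followed by selection of important SNPs, but gives no algorithmic detail. As a rough illustration only, and not the applicants' method, the Python sketch below uses two generic stand-ins: greedy pruning of SNPs by pairwise squared correlation within a window, then marginal screening of the surviving SNPs against a quantitative trait. The function names `ld_prune` and `screen_snps`, the 0.8 threshold, the window size, and the use of Pearson correlation as an LD measure are all assumptions made for this example.

```python
import numpy as np


def ld_prune(genotypes, r2_threshold=0.8, window=50):
    """Greedy LD pruning (illustrative stand-in, not the proposal's method).

    genotypes: (n_samples, n_snps) array of allele counts (0, 1, 2).
    A SNP is kept only if its squared Pearson correlation with every
    already-kept SNP within `window` columns is below `r2_threshold`.
    Monomorphic (constant) SNPs carry no information and are dropped.
    """
    kept = []
    for j in range(genotypes.shape[1]):
        col = genotypes[:, j]
        if np.std(col) == 0:
            continue  # constant column: correlation is undefined
        redundant = False
        for k in reversed(kept):
            if j - k > window:
                break  # kept indices ascend, so the rest are farther away
            r = np.corrcoef(col, genotypes[:, k])[0, 1]
            if r * r >= r2_threshold:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return kept


def screen_snps(genotypes, trait, candidates, top_m=10):
    """Rank candidate SNPs by absolute marginal correlation with the trait
    and return the indices of the `top_m` strongest."""
    scores = [abs(np.corrcoef(genotypes[:, j], trait)[0, 1]) for j in candidates]
    order = np.argsort(scores)[::-1][:top_m]
    return [candidates[i] for i in order]


if __name__ == "__main__":
    # Simulated data: SNP 1 is an exact copy of SNP 0 (perfect LD),
    # and the trait depends on SNP 3 only.
    rng = np.random.default_rng(0)
    base = rng.integers(0, 3, size=(200, 4)).astype(float)
    G = np.column_stack([base[:, 0], base[:, 0], base[:, 1], base[:, 2], base[:, 3]])
    y = 2.0 * G[:, 3] + rng.normal(0, 0.1, size=200)
    kept = ld_prune(G)
    print("kept after pruning:", kept)
    print("top SNP:", screen_snps(G, y, kept, top_m=1))
```

Pruning before screening shrinks the search space without discarding independent signals, which is the general motivation for LD-based reduction. A real pipeline would also need the interaction terms, latent trait factors, and pathway information that the abstract describes, none of which this sketch attempts.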
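Project outcomes
Number of journal articles (0)
Number of monographs (0)
Number of research awards (0)
Number of conference papers (0)
Number of patents (0)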
Other grants by C CHARLES GU
A Summer Institute for Biostatistics Research in Disease and Genetic Epidemiology
- Grant number: 7918081
- Fiscal year: 2009
- Funding amount: $228,000
- Project category:
VARIABLE SELECTION IN GENETIC EPIDEMIOLOGICAL STUDIES OF CARDIOVASCULAR DISEASES
- Grant number: 7845764
- Fiscal year: 2009
- Funding amount: $228,000
- Project category:
A Summer Institute for Biostatistics Research in Disease and Genetic Epidemiology
- Grant number: 8116053
- Fiscal year: 2009
- Funding amount: $228,000
- Project category:
A Summer Institute for Biostatistics Research in Disease and Genetic Epidemiology
- Grant number: 7758417
- Fiscal year: 2009
- Funding amount: $228,000
- Project category:
VARIABLE SELECTION IN GENETIC EPIDEMIOLOGICAL STUDIES OF CARDIOVASCULAR DISEASES
- Grant number: 7895861
- Fiscal year: 2008
- Funding amount: $228,000
- Project category:
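Similar National Natural Science Foundation of China (NSFC) grants
Evaluation methods, influence mechanisms, and optimization strategies for the microclimate of traditional Jiangnan villages, based on advanced algorithms and behavioral analysis
- Grant number: 52378011
- Approval year: 2023
- Funding amount: 500,000 CNY
- Project category: General Program
Key influencing factors and efficient algorithms for opinion dynamics on social networks
- Grant number: 62372112
- Approval year: 2023
- Funding amount: 500,000 CNY
- Project category: General Program
Conceptual structure, scale development, and multilevel influence mechanisms of employee algorithm-avoidance behavior: an integrated big (small) data research methods perspective
- Grant number: 72372021
- Approval year: 2023
- Funding amount: 400,000 CNY
- Project category: General Program
Effects of algorithmic human resource management on employees' algorithm-coping behavior and job performance: a study of pathways through employee cognition and emotion
- Grant number: 72372070
- Approval year: 2023
- Funding amount: 400,000 CNY
- Project category: General Program
Influencing factors and mechanisms of the algorithmic divide
- Grant number: 72304017
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund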
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Funding amount: $228,000
- Project category: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Funding amount: $228,000
- Project category: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Funding amount: $228,000
- Project category: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Funding amount: $228,000
- Project category: Fellowship
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Funding amount: $228,000
- Project category: Research Grant