Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
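Basic Information
- Grant number: 10676866
- Principal investigator:
- Amount: $908,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-08-05 to 2029-07-31
- Project status: Ongoing
- Source: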
- Keywords: ATAC-seq; Acceleration; Advanced Malignant Neoplasm; Award; Biological Markers; Biology; Breast Cancer Genetics; Cancer Prognosis; Cells; Characteristics; Chromosome Mapping; Clinical; Clinical Data; Clinical Research; Communities; Computer software; Data; Data Commons; Data Science; Ethnic Origin; Etiology; Event; Genes; Genetic; Genetic Medicine; Genetic Research; Genome; Goals; Hematopoiesis; Heritability; Immunotherapy; Joints; Length; Leukocytes; Malignant Neoplasms; Malignant neoplasm of lung; Mediation; Medicine; Mendelian randomization; Methods; Pathway interactions; Performance; Phenotype; Population; Population Study; Prevention strategy; RNA; Reduce health disparities; Research Personnel; Risk; Silicon; Statistical Methods; United States National Institutes of Health; Variant; anticancer research; biobank; cancer epidemiology; cancer genetics; cancer health disparity; cancer prevention; cancer subtypes; causal variant; cloud based; data resource; empowerment; epidemiologic data; exome sequencing; genetic analysis; genetic epidemiology; genetic variant; genome sequencing; genomic data; improved; machine learning method; mitochondrial dysfunction; multi-ethnic; multiple omics; non-genetic; phenome; phenotypic data; population based; precision cancer prevention; precision medicine; profiles in patients; rare variant; response; statistical and machine learning; telomere; tool; treatment strategy; tumor; whole genome
Project Summary
With massive genome, exposome and phenome data rapidly becoming available in population and clinical studies,
data science has become critically important and provides unprecedented opportunities for new
discoveries in cancer. This competing renewal application of an NCI Outstanding Investigator Award (R35)
aims at developing and applying scalable, interpretable and transferable statistical and machine learning (ML)
methods for integrative analysis of massive germline whole genome sequencing (WGS) and somatic whole
exome sequencing (WES) data, epidemiological and clinical data, in large-scale multi-ethnic biobanks,
population and clinical studies of cancer, with experimental cell specific multi-omic functional data, such as
single cell RNA/ATAC-seq data. Our ultimate goal is to use advanced data science methods and different
types of population, clinical, and experimental data to accelerate progress in advancing from cancer gene
mapping to mechanisms to cancer prevention and medicine, discover new effective trans-ethnic precision
cancer prevention and treatment strategies, and reduce health disparities in cancer genetic research. This
application aims to meet the pressing quantitative needs for the analysis of massive data in cancer research.
Specifically, (A) for genetic cancer epidemiology, we will develop scalable, interpretable and transferable
statistical and ML methods for (1) rare variant analysis by integrating population-based WGS and experimental
single cell functional data; (2) advancing from associated variants with unknown causality and biology to causal
variants, genes and pathways using causal mediation analysis and Mendelian Randomization by integrating
genetic, cell-specific omic, biomarker and phenotype data; (3) estimating transferable trans-ethnic polygenic
risk scores (PRSs) and heritability using common and rare variants by integrating WGS data with experimental
and in silico cell-specific functional annotations and non-genetic data, for actionable prevention strategies; (4)
federated and transferable trans-ethnic single phenotype and phenome-wide genetic analysis in large WGS
studies and biobanks. (B) For cancer genetic medicine, we will develop scalable and interpretable statistical
and machine learning methods for (1) joint analysis of germline WGS and tumor somatic WES data to identify
genetic variants that predispose to cancer subtypes; (2) integrative analysis of tumor somatic WES data and
clinicopathological characteristics to identify patient profiles for improved efficacy of immunotherapies; (3)
analysis of the effects of clonal hematopoiesis, mitochondrial dysfunction and leukocyte telomere length, as called
from germline WGS data, on tumor somatic events, cancer prognosis and responses to immunotherapies. We
will apply the proposed methods in lung cancer and breast cancer genetic epidemiological and clinical studies
and biobanks. We will develop open-access cluster- and cloud-based software for these methods, together with data
resources, and make them available at NIH Data Commons to the cancer research community.
Project Outcomes
Journal articles (156)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Unmeasured confounding and hazard scales: sensitivity analysis for total, direct, and indirect effects.
- DOI: 10.1007/s10654-013-9770-6
- Published: 2013-02
- Journal:
- Impact factor: 13.6
- Authors: VanderWeele, Tyler J.
- Corresponding author: VanderWeele, Tyler J.
A trans-omics assessment of gene-gene interaction in early-stage NSCLC.
- DOI: 10.1002/1878-0261.13345
- Published: 2023-01
- Journal:
- Impact factor: 6.6
- Authors: Chen, Jiajin; Song, Yunjie; Li, Yi; Wei, Yongyue; Shen, Sipeng; Zhao, Yang; You, Dongfang; Su, Li; Bjaanaes, Maria Moksnes; Karlsson, Anna; Planck, Maria; Staaf, Johan; Helland, Aslaug; Esteller, Manel; Shen, Hongbing; Christiani, David C. C.; Zhang, Ruyang; Chen, Feng
- Corresponding author: Chen, Feng
Infective endocarditis and cancer in the elderly.
- DOI: 10.1007/s10654-015-0111-9
- Published: 2016-01
- Journal:
- Impact factor: 13.6
- Authors: García-Albéniz X; Hsu J; Lipsitch M; Logan RW; Hernández-Díaz S; Hernán MA
- Corresponding author: Hernán MA
Associations of genetic risk, BMI trajectories, and the risk of non-small cell lung cancer: a population-based cohort study.
- DOI: 10.1186/s12916-022-02400-6
- Published: 2022-06-06
- Journal:
- Impact factor: 9.3
- Authors: You, Dongfang; Wang, Danhua; Wu, Yaqian; Chen, Xin; Shao, Fang; Wei, Yongyue; Zhang, Ruyang; Lange, Theis; Ma, Hongxia; Xu, Hongyang; Hu, Zhibin; Christiani, David C.; Shen, Hongbing; Chen, Feng; Zhao, Yang
- Corresponding author: Zhao, Yang
Detecting rare variant effects using extreme phenotype sampling in sequencing association studies.
- DOI: 10.1002/gepi.21699
- Published: 2013-02
- Journal:
- Impact factor: 2.1
- Authors: Barnett, Ian J.; Lee, Seunggeun; Lin, Xihong
- Corresponding author: Lin, Xihong
Other grants by XIHONG LIN
Statistical Methods for Integrative Analysis of Large-Scale Multi-Ethnic Whole Genome Sequencing Studies and Biobanks of Common Diseases
- Grant number: 10622567
- Fiscal year: 2022
- Funding amount: $908,800
- Project category:
Powering whole genome sequence-based genetic discovery for common human diseases - Extended 2021-2022
- Grant number: 10355760
- Fiscal year: 2021
- Funding amount: $908,800
- Project category:
Powering whole genome sequence-based genetic discovery for common human diseases
- Grant number: 10085285
- Fiscal year: 2020
- Funding amount: $908,800
- Project category:
Powering whole genome sequence-based genetic discovery for common human diseases
- Grant number: 10168752
- Fiscal year: 2020
- Funding amount: $908,800
- Project category:
Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
- Grant number: 9120850
- Fiscal year: 2015
- Funding amount: $908,800
- Project category:
Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
- Grant number: 9321418
- Fiscal year: 2015
- Funding amount: $908,800
- Project category:
Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
- Grant number: 9980301
- Fiscal year: 2015
- Funding amount: $908,800
- Project category:
Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
- Grant number: 9752258
- Fiscal year: 2015
- Funding amount: $908,800
- Project category:
Statistical Methods for Analysis of Massive Genetic and Genomic Data in Cancer Research
- Grant number: 8955524
- Fiscal year: 2015
- Funding amount: $908,800
- Project category:
Similar National Natural Science Foundation of China (NSFC) grants
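Research on a graphene fiber-optic accelerometer based on cavity optomechanical effects
- Grant number: 62305039
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund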
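Research on a high-precision microcavity optomechanical accelerometer based on self-sustained coherent amplification
- Grant number: 52305621
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund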
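Seismic design method and resilience assessment of high-rise steel frame structures with displacement- and acceleration-controlled self-centering braces
- Grant number: 52308484
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund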
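Multi-field coupling characteristics and cage fracture failure mechanism of planetary gear set needle roller bearings under high centrifugal acceleration
- Grant number: 52305047
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund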
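Research on vector vibration acceleration sensing based on eccentric fiber cladding gratings
- Grant number: 62305269
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund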
Similar overseas grants
Surface exosome integrin profiling to predict organotropic metastasis of breast cancer
- Grant number: 10654221
- Fiscal year: 2023
- Funding amount: $908,800
- Project category:
Washington University (WU) ROBIN Center: MicroEnvironment and Tumor Effects Of Radiotherapy (METEOR)
- Grant number: 10715019
- Fiscal year: 2023
- Funding amount: $908,800
- Project category:
Cause and Effect Relationships Between Glycation and the Ancestry Specific Tumor Stroma
- Grant number: 10586185
- Fiscal year: 2023
- Funding amount: $908,800
- Project category:
Accelerating genomic analysis for time critical clinical applications
- Grant number: 10593480
- Fiscal year: 2023
- Funding amount: $908,800
- Project category:
Investigating the roles of oncogenic extrachromosomal circular DNAs in cancer
- Grant number: 10718423
- Fiscal year: 2023
- Funding amount: $908,800
- Project category: