Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures

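Basic information

  • Grant number:
    RGPIN-2020-05133
  • Principal investigator:
  • Amount:
    $13,100
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Project period:
    2020-01-01 to 2021-12-31
  • Project status:
    Completed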

Project abstract

The challenge of precision medicine is to tailor treatments or recommendations appropriately to each individual. Large amounts of resources are being used to generate genetic data in the hope that these data will enable tailored decision making. Since genotyping costs have now dropped below those of several routine clinical tests, at least seven large health care systems have invested in genome-wide genotyping of a large proportion of their populations, for whom electronic health record data are available. These data are being used to develop polygenic risk scores (PRS), which can predict complex diseases on the basis of genetic data and thus have the potential to improve clinical care via precision medicine, which would be of great interest to the Canadian population. Analytic tools that increase prediction accuracy are needed to maximize the productivity of these investments. In the context of clinical decision making, there is also a need to understand which variables are driving these predictions. Indeed, substantive experts are reluctant to use so-called black-box algorithms from the machine learning literature because they lack interpretability and transparency. It is difficult to know how such an algorithm makes its decisions, which can have serious ethical consequences. On the other hand, while many of the models developed in the statistical literature are interpretable and provide measures of uncertainty around their parameter estimates, they do not scale to the massive amounts of data being generated today. It is becoming increasingly important for statisticians not only to develop theoretically justified methods, but also to consider practical issues such as computational algorithms, data format and software. Considering each component in tandem is a step towards more appropriate methods being used in practice.
To this end, this proposal is organized around three Themes: 1) to develop the theory and computational algorithms for new high-dimensional linear mixed models for variable selection and prediction in correlated or grouped data; 2) to propose interaction models between a key exposure and a high-dimensional dataset (e.g. gene-environment interactions); and 3) to develop prediction tools from high-dimensional data for survival time endpoints. Our methods will be implemented in user-friendly software, with careful consideration of data format and storage, in order to promote wider uptake of more complex models by data analysts. The results from this project will help to establish me as a new researcher with expertise in variable selection and prediction models for high-dimensional data with complex structures and make me competitive nationally and internationally.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Bhatnagar, Sahir

Results of a RCT on a Transition Support Program for Adults with ASD: Effects on Self-Determination and Quality of Life
  • DOI:
    10.1002/aur.2027
  • Publication date:
    2018-12-01
  • Journal:
  • Impact factor:
    4.7
  • Authors:
    Nadig, Aparna;Flanagan, Tara;Bhatnagar, Sahir
  • Corresponding author:
    Bhatnagar, Sahir

Other grants held by Bhatnagar, Sahir

Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    RGPIN-2020-05133
  • Fiscal year:
    2022
  • Amount:
    $13,100
  • Program:
    Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    RGPIN-2020-05133
  • Fiscal year:
    2021
  • Amount:
    $13,100
  • Program:
    Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    DGECR-2020-00344
  • Fiscal year:
    2020
  • Amount:
    $13,100
  • Program:
    Discovery Launch Supplement

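Similar grants from the National Natural Science Foundation of China (NSFC)

Construction of cyclic-peptide-based cryptand-like macrocyclic supramolecular systems and study of their selective molecular binding
  • Grant number:
    22371148
  • Year approved:
    2023
  • Amount:
    500,000 CNY
  • Program:
    General Program
Design and synthesis of two-dimensional conductive COFs for TFT sensors with high selectivity for sodium/potassium ions
  • Grant number:
    52303322
  • Year approved:
    2023
  • Amount:
    300,000 CNY
  • Program:
    Young Scientists Fund
Neurovascular coupling mechanisms and parameter selection principles of noninvasive ultrasound stimulation for promoting recovery from ischemic stroke
  • Grant number:
    82302879
  • Year approved:
    2023
  • Amount:
    300,000 CNY
  • Program:
    Young Scientists Fund
Mechanisms of corporate environmental strategy choice and their effects on performance from the perspective of regulatory focus theory
  • Grant number:
    72362031
  • Year approved:
    2023
  • Amount:
    270,000 CNY
  • Program:
    Fund for Less Developed Regions
Revealing the role of vector host-selection preference in the spread of citrus Huanglongbing using mathematical models
  • Grant number:
    12361097
  • Year approved:
    2023
  • Amount:
    270,000 CNY
  • Program:
    Fund for Less Developed Regions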

Similar overseas grants

Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    RGPIN-2020-05133
  • Fiscal year:
    2022
  • Amount:
    $13,100
  • Program:
    Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    RGPIN-2020-05133
  • Fiscal year:
    2021
  • Amount:
    $13,100
  • Program:
    Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
  • Grant number:
    DGECR-2020-00344
  • Fiscal year:
    2020
  • Amount:
    $13,100
  • Program:
    Discovery Launch Supplement
Variable Selection in Multivariate Time Series Models and Its Applications to the Prediction of Productions of Pacific Fisheries
  • Grant number:
    410683-2011
  • Fiscal year:
    2013
  • Amount:
    $13,100
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Variable Selection in Multivariate Time Series Models and Its Applications to the Prediction of Productions of Pacific Fisheries
  • Grant number:
    410683-2011
  • Fiscal year:
    2012
  • Amount:
    $13,100
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral