Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
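Basic information
- Grant number: RGPIN-2020-05133
- Principal investigator:
- Amount: $13,100
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Period: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract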
The challenge of precision medicine is to appropriately fit treatments or recommendations to each individual. Large amounts of resources are being used to generate genetic data in the hope that they will enable tailored decision making. Since genotyping costs have now dropped below those of several routine clinical tests, at least seven large health care systems have invested in genome-wide genotyping of a large proportion of their population, for whom electronic health record data are available. These data are being used to develop polygenic risk scores (PRS), which can predict complex diseases on the basis of genetic data and thus have the potential to improve clinical care via precision medicine, which would be of great interest to the Canadian population. Analytic tools that increase prediction accuracy are needed to maximize the productivity of these investments. In the context of clinical decision making, there is also a need to understand which variables are driving these predictions. Indeed, substantive experts are reluctant to use so-called black-box algorithms from the machine learning literature because they lack interpretability and transparency. It is difficult to know how such an algorithm makes its decisions, which can have serious ethical consequences. On the other hand, while many of the models developed in the statistical literature are interpretable and provide measures of uncertainty around their parameter estimates, they are not scalable to the massive amounts of data being generated today.
It is becoming increasingly important for statisticians not only to develop theoretically justified methods, but also to consider practical issues such as computational algorithms, data format and software. Considering each component in tandem is a step towards more appropriate methods being used in practice. To this end, this proposal centers on three themes: 1) to develop the theory and computational algorithms for new high-dimensional linear mixed models for variable selection and prediction in correlated or grouped data; 2) to propose interaction models between a key exposure and a high-dimensional dataset (e.g. gene-environment interactions); and 3) to develop prediction tools from high-dimensional data for survival time endpoints.
Our methods will be implemented in user-friendly software, with careful consideration of data format and storage, in order to promote wider uptake of more complex models by data analysts. The results from this project will help to establish me as a new researcher with expertise in variable selection and prediction models for high-dimensional data with complex structures, and will make me competitive nationally and internationally.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Bhatnagar, Sahir
Results of a RCT on a Transition Support Program for Adults with ASD: Effects on Self-Determination and Quality of Life
- DOI: 10.1002/aur.2027
- Published: 2018-12-01
- Journal:
- Impact factor: 4.7
- Authors: Nadig, Aparna; Flanagan, Tara; Bhatnagar, Sahir
- Corresponding author: Bhatnagar, Sahir
Other grants held by Bhatnagar, Sahir
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: RGPIN-2020-05133
- Fiscal year: 2022
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: RGPIN-2020-05133
- Fiscal year: 2021
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: DGECR-2020-00344
- Fiscal year: 2020
- Funding amount: $13,100
- Program: Discovery Launch Supplement
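Similar grants from the National Natural Science Foundation of China
Multi-site tunable selective molecular editing based on visible-light-catalyzed hydrogen atom transfer (HAT)
- Grant number: 22378334
- Approval year: 2023
- Funding amount: 500,000 CNY
- Program: General Program
Radical-mediated selective modification of geniposide and its anti-pulmonary-fibrosis mechanism and structure-activity relationships
- Grant number: 22367018
- Approval year: 2023
- Funding amount: 320,000 CNY
- Program: Regional Science Fund Program
Development and application of catalysts for the selective polymerization of non-conjugated dienes
- Grant number: 22371157
- Approval year: 2023
- Funding amount: 500,000 CNY
- Program: General Program
Regulation mechanism of the selective separation and recovery of copper from scrap steel by electrochemistry coupled with ammonia leaching
- Grant number: 52304426
- Approval year: 2023
- Funding amount: 300,000 CNY
- Program: Young Scientists Fund
Design and synthesis of selective BRD4 inhibitors and study of their inhibitory activity against polycystic kidney disease
- Grant number: 22377103
- Approval year: 2023
- Funding amount: 500,000 CNY
- Program: General Program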
Similar overseas grants
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: RGPIN-2020-05133
- Fiscal year: 2022
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: RGPIN-2020-05133
- Fiscal year: 2021
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Variable Selection and Prediction for High-Dimensional Genetic Data with Complex Structures
- Grant number: DGECR-2020-00344
- Fiscal year: 2020
- Funding amount: $13,100
- Program: Discovery Launch Supplement
Variable Selection in Multivariate Time Series Models and Its Applications to the Prediction of Productions of Pacific Fisheries
- Grant number: 410683-2011
- Fiscal year: 2013
- Funding amount: $13,100
- Program: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Variable Selection in Multivariate Time Series Models and Its Applications to the Prediction of Productions of Pacific Fisheries
- Grant number: 410683-2011
- Fiscal year: 2012
- Funding amount: $13,100
- Program: Alexander Graham Bell Canada Graduate Scholarships - Doctoral