Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research

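Basic information

  • Grant number:
    RGPIN-2017-06586
  • Principal investigator:
  • Amount:
    $10,200
  • Host institution:
  • Host institution country:
    Canada
  • Program category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Start and end dates:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed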

Project abstract

Modern machine learning methods, such as boosting, support vector machines, or neural networks, have had a great impact on statistical research and application, mostly in terms of improved predictive and prognostic accuracy. Their enhanced ability to model complex interactions and non-linear effects could also be used to explain the underlying physical or physiological phenomena and to generate specific scientific hypotheses for further study. In applications that are not strictly predictive, however, the use of many modern methods is hampered by their black-box nature and by the lack of inferential tools that would allow researchers to obtain statistical confidence measures on inferred relationships. The simplest statistical inference, which is universal in classical models, pertains to statements about individual covariates. For example, is the covariate "Gender" an important factor in a model of disease progression? In classical models this is answered by calculating statistical inference quantities (p-values, confidence intervals) for a parameter (or a small set of parameters) connected with "Gender" in the model. In contrast, machine learning methods take a non-parametric approach in which a covariate's influence on the outcome is not controlled by a small set of parameters. Hence the classical approach is not applicable, and the importance of any particular covariate in the model of the outcome is not easily tested. While many model-specific or approximate measures have been proposed, in particular the Variable Importance Metric in a Random Forest model, there is no universal, statistically coherent approach in the literature. We propose to develop, validate, apply and disseminate, in the form of freely available software packages, a set of tools for classical inference that will allow researchers to test the importance and influence of covariates of interest in non-parametric machine learning models of the outcome.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Kustra, Rafal

The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework.
  • DOI:
  • Publication date:
    2002-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Strother, Stephen C;Anderson, Jon;Hansen, Lars Kai;Kjems, Ulrik;Kustra, Rafal;Sidtis, John;Frutiger, Sally;Muley, Suraj;LaConte, Stephen;Rottenberg, David
  • Corresponding author:
    Rottenberg, David
Predictive modeling in case-control single-nucleotide polymorphism studies in the presence of population stratification: a case study using Genetic Analysis Workshop 16 Problem 1 dataset.
  • DOI:
  • Publication date:
    2009-12-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Arshadi, Niloofar;Chang, Billy;Kustra, Rafal
  • Corresponding author:
    Kustra, Rafal
Predictors of all‐cause mortality among patients hospitalized with influenza, respiratory syncytial virus, or SARS‐CoV‐2
  • DOI:
    10.1111/irv.13004
  • Publication date:
    2022-11
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Hamilton, Mackenzie A.;Liu, Ying;Calzavara, Andrew;Sundaram, Maria E.;Djebli, Mohamed;Darvin, Dariya;Baral, Stefan;Kustra, Rafal;Kwong, Jeffrey C.;Mishra, Sharmistha
  • Corresponding author:
    Mishra, Sharmistha
5-hmC in the brain is abundant in synaptic genes and shows differences at the exon-intron boundary
  • DOI:
    10.1038/nsmb.2372
  • Publication date:
    2012-10
  • Journal:
  • Impact factor:
    16.8
  • Authors:
    Khare, Tarang;Pai, Shraddha;Koncevicius, Karolis;Pal, Mrinal;Kriukiene, Edita;Liutkeviciute, Zita;Irimia, Manuel;Jia, Peixin;Ptak, Carolyn;Xia, Menghang;Tice, Raymond;Tochigi, Mamoru;Morera, Solange;Nazarians, Anaies;Belsham, Denise;Wong, Albert H. C.;Blencowe, Benjamin J.;Wang, Sun Chong;Kapranov, Philipp;Kustra, Rafal;Labrie, Viviane;Klimasauskas, Saulius;Petronis, Arturas
  • Corresponding author:
    Petronis, Arturas
A disproportionate epidemic: COVID-19 cases and deaths among essential workers in Toronto, Canada.
  • DOI:
  • Publication date:
    2021-11
  • Journal:
  • Impact factor:
    5.6
  • Authors:
    Rao, Amrita;Ma, Huiting;Moloney, Gary;Kwong, Jeffrey C;Jüni, Peter;Sander, Beate;Kustra, Rafal;Baral, Stefan D;Mishra, Sharmistha
  • Corresponding author:
    Mishra, Sharmistha

Other grants held by Kustra, Rafal

Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2010
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2009
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2008
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2007
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2006
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual

Similar overseas grants

Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research
  • Grant number:
    RGPIN-2017-06586
  • Fiscal year:
    2021
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research
  • Grant number:
    RGPIN-2017-06586
  • Fiscal year:
    2020
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual
Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research
  • Grant number:
    RGPIN-2017-06586
  • Fiscal year:
    2018
  • Funding amount:
    $10,200
  • Program category:
    Discovery Grants Program - Individual