Bayesian Variable Selection in Generalized Linear Models with Missing Variables
Basic information
- Grant number: 8471550
- Principal investigator:
- Amount: $230,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2011
- Funding country: United States
- Project period: 2011-08-11 to 2014-08-30
- Project status: Completed
- Source:
- Keywords: Accounting; Address; Algorithms; Archives; Autistic Disorder; Bayesian Method; Binomial Distribution; Biomedical Research; Child Abuse; Clinical Trials; Complex; Computer software; Data; Data Analyses; Data Set; Dependence; Development; Dropout; Effectiveness; Equation; Evaluation; Generic Drugs; Genes; Genetic Transcription; Immunobiology; Individual; Libraries; Linear Models; Linear Regressions; Markov Chains; Medical Research; Methodology; Methods; Modeling; Observational Study; Outcome; Pathway interactions; Performance; Phenotype; Poisson Distribution; Problem behavior; Procedures; Process; Research; Research Personnel; Resort; Scheme; Side; Social Problems; Societies; Solutions; Statistical Data Interpretation; Structure; Testing; Uncertainty; base; behavior measurement; clinically relevant; cytokine; empowered; flexibility; improved; smoking cessation; software development; tool
Project abstract
DESCRIPTION (provided by applicant): In conducting medical research, especially with behavioral and social problems, a challenge for statistical data analysis comes from the problems introduced by missing values. Missing values may be caused by subjective (e.g., nonresponse and dropout) and technical reasons (e.g., censoring over/below quantization level). Generalized linear models (GLMs) are popularly applied in biomedical data analysis where a fundamental task is to interpret or predict an outcome variable by a subset of potentially explanatory variables. Given an incomplete data set, practitioners frequently resort to the strategy of case-deletion where individuals are excluded from consideration if they miss any of the variables targeted for analysis. This is the default option used in many software packages. Yet, case-deletion may not only sacrifice useful information, but also give rise to biased estimates because it requires strong assumptions on the missingness mechanisms. A more satisfactory solution for missing data problems involves multiple imputation, where several imputations are created for the same set of missing values. The variance between imputations reflects the uncertainty due to missingness. Across multiply imputed data sets, however, traditional variable selection methods (based on significance tests or various criteria) often result in models with different selected predictors, thus presenting a problem of combining the models to make final inferences. In this R01 proposal with a 3-year research plan, we aim to develop two alternative strategies of variable selection for GLMs with missing values by drawing on a Bayesian framework. One approach, which we call "impute, then select" (ITS) involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets. 
The second strategy, "simultaneously impute and select" (SIAS), is to conduct Bayesian variable selection and missing data imputation simultaneously within one Markov chain Monte Carlo (MCMC) process. ITS and SIAS offer two generic frameworks within which various Bayesian variable selection algorithms and missing data imputation algorithms can be implemented. Both strategies will be developed, evaluated, and implemented in an R library for normal regression, binomial regression, and other GLMs with categorical and/or continuous explanatory variables. Practical data sets from several studies on substance abuse and childhood autism will be used to assess the effectiveness and flexibility of the proposed strategies. Development of these procedures and contribution of the software to statisticians and researchers in medical research would significantly improve the quality of evaluation of important and clinically relevant data.
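As a rough illustration of the "impute, then select" (ITS) idea described in the abstract, the Python sketch below imputes an incomplete covariate several times, runs a simple Bayesian-flavored variable selection on each completed data set, and pools the results. It is not the applicants' implementation: the abstract does not specify the imputation model, the selection algorithm, or the pooling rule. The sketch assumes a normal linear model, a single incomplete covariate, stochastic regression imputation (which ignores parameter uncertainty, so it is a simplification of proper multiple imputation), posterior model probabilities approximated by BIC over all submodels, and pooling by averaging posterior inclusion probabilities. All function names and the simulated data are hypothetical.

```python
import itertools
import numpy as np


def impute_once(X, y, col, rng):
    """Stochastic regression imputation of one incomplete column.

    Assumes every other column of X is fully observed. The outcome y is
    included as a predictor so imputed values keep their relation to y.
    """
    miss = np.isnan(X[:, col])
    others = np.delete(X, col, axis=1)
    Z = np.column_stack([np.ones(len(y)), others, y])
    coef, *_ = np.linalg.lstsq(Z[~miss], X[~miss, col], rcond=None)
    resid = X[~miss, col] - Z[~miss] @ coef
    sigma = resid.std(ddof=Z.shape[1])
    X_imp = X.copy()
    # Prediction plus random noise, so imputations differ between draws.
    X_imp[miss, col] = Z[miss] @ coef + rng.normal(0.0, sigma, miss.sum())
    return X_imp


def inclusion_probs(X, y):
    """Posterior inclusion probabilities from BIC-weighted submodels."""
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            D = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(D, y, rcond=None)
            rss = np.sum((y - D @ beta) ** 2)
            bics.append(n * np.log(rss / n) + D.shape[1] * np.log(n))
            models.append(subset)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # approximate model posteriors
    weights /= weights.sum()
    pip = np.zeros(p)
    for subset, w in zip(models, weights):
        for j in subset:
            pip[j] += w
    return pip


def impute_then_select(X, y, col, n_imputations=10, seed=0):
    """ITS sketch: impute several times, select on each, average the results."""
    rng = np.random.default_rng(seed)
    pips = [inclusion_probs(impute_once(X, y, col, rng), y)
            for _ in range(n_imputations)]
    return np.mean(pips, axis=0)


# Simulated example: covariates 0 and 2 truly affect y; covariate 2 has
# about 20% of its values missing completely at random.
rng = np.random.default_rng(1)
n, p = 200, 5
X_full = rng.normal(size=(n, p))
y = 1.5 * X_full[:, 0] - 2.0 * X_full[:, 2] + rng.normal(size=n)
X_obs = X_full.copy()
X_obs[rng.random(n) < 0.2, 2] = np.nan

pooled_pip = impute_then_select(X_obs, y, col=2)
print(np.round(pooled_pip, 3))
```

Averaging inclusion probabilities is only one way to sidestep the problem the abstract raises, namely that selection on each imputed data set may pick different predictors. A SIAS-style approach would instead update the missing values and the inclusion indicators within a single MCMC run, which this sketch does not attempt.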
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design.
- DOI:
- Published: 2013
- Journal:
- Impact factor: 3.7
- Authors: Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong
- Corresponding author: Xue, Fuzhong
An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.
- DOI:
- Published: 2013
- Journal:
- Impact factor: 3.7
- Authors: Peng, Bin; Zhu, Dianwen; Ander, Bradley P; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R; Yang, Xiaowei
- Corresponding author: Yang, Xiaowei
Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.
- DOI:
- Published: 2014
- Journal:
- Impact factor: 2.9
- Authors: Zhang, Xiaoshuai; Xue, Fuzhong; Liu, Hong; Zhu, Dianwen; Peng, Bin; Wiemels, Joseph L; Yang, Xiaowei
- Corresponding author: Yang, Xiaowei
Other grants by XIAOWEI YANG
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8317303
- Fiscal year: 2011
- Funding amount: $230,000
- Project category:
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8194802
- Fiscal year: 2011
- Funding amount: $230,000
- Project category:
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8543193
- Fiscal year: 2011
- Funding amount: $230,000
- Project category:
iPhone-based Real-time Data Solution for Drug Abuse and Other Medical Research
- Grant number: 7672825
- Fiscal year: 2009
- Funding amount: $230,000
- Project category:
Transition Model for Incomplete Longitudinal Binary Data
- Grant number: 6676189
- Fiscal year: 2003
- Funding amount: $230,000
- Project category:
DEVELOPMENT OF AN AUTOMATED NEURAL SPIKE DISCRIMINATOR
- Grant number: 3504570
- Fiscal year: 1991
- Funding amount: $230,000
- Project category:
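Similar National Natural Science Foundation of China (NSFC) grants
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: CNY 220,000
- Project category: Young Scientists Fund
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund
Research on memory-safety defense techniques for objects targeted by memory attacks
- Grant number: 61802432
- Approval year: 2018
- Funding amount: CNY 250,000
- Project category: Young Scientists Fund
Research on optimized address mapping table design and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: CNY 230,000
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: CNY 640,000
- Project category: General Program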
Similar overseas grants
Bayesian approaches to identify persons with osteoarthritis in electronic health records and administrative health data in the absence of a perfect reference standard
- Grant number: 10665905
- Fiscal year: 2023
- Funding amount: $230,000
- Project category:
Predicting firearm suicide in military veterans outside the VA health system using linked civilian electronic health record data
- Grant number: 10655968
- Fiscal year: 2023
- Funding amount: $230,000
- Project category:
Health and Financial Costs of Unequal Care: Colorectal Cancer as a Case Study
- Grant number: 10656807
- Fiscal year: 2023
- Funding amount: $230,000
- Project category:
A mobile health framework for left ventricular end diastolic pressure diagnostics and monitoring.
- Grant number: 10601929
- Fiscal year: 2023
- Funding amount: $230,000
- Project category:
Fair risk profiles and predictive models for outcomes of obstructive sleep apnea through electronic medical record data
- Grant number: 10678108
- Fiscal year: 2023
- Funding amount: $230,000
- Project category: