Bayesian Variable Selection in Generalized Linear Models with Missing Variables
Basic information
- Grant number: 8543193
- Principal investigator:
- Amount: $95,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2011
- Funding country: United States
- Project period: 2011-08-11 to 2014-04-30
- Project status: Completed
- Source:
- Keywords: Accounting; Address; Algorithms; Archives; Autistic Disorder; Bayesian Method; Binomial Distribution; Biomedical Research; Child Abuse; Clinical Trials; Complex; Computer software; Data; Data Analyses; Data Set; Dependence; Development; Dropout; Effectiveness; Equation; Evaluation; Generic Drugs; Genes; Genetic Transcription; Immunobiology; Individual; Libraries; Linear Models; Linear Regressions; Markov Chains; Medical Research; Methodology; Methods; Modeling; Observational Study; Outcome; Pathway interactions; Performance; Phenotype; Poisson Distribution; Problem behavior; Procedures; Process; Research; Research Personnel; Resort; Scheme; Side; Social Problems; Societies; Solutions; Statistical Data Interpretation; Structure; Testing; Uncertainty; base; behavior measurement; clinically relevant; cytokine; empowered; flexibility; improved; smoking cessation; software development; tool
Project abstract
DESCRIPTION (provided by applicant): In medical research, especially research on behavioral and social problems, missing values pose a challenge for statistical data analysis. Missing values may arise for subjective reasons (e.g., nonresponse and dropout) or technical reasons (e.g., censoring above or below a quantization level). Generalized linear models (GLMs) are widely applied in biomedical data analysis, where a fundamental task is to interpret or predict an outcome variable from a subset of potentially explanatory variables. Given an incomplete data set, practitioners frequently resort to case-deletion, in which individuals are excluded from consideration if they are missing any of the variables targeted for analysis. This is the default option in many software packages. Yet case-deletion may not only sacrifice useful information but also give rise to biased estimates, because it requires strong assumptions about the missingness mechanism. A more satisfactory solution for missing data problems is multiple imputation, where several imputations are created for the same set of missing values. The variance between imputations reflects the uncertainty due to missingness. Across multiply imputed data sets, however, traditional variable selection methods (based on significance tests or various criteria) often yield models with different selected predictors, which raises the problem of how to combine the models to make final inferences.
In this R01 proposal with a 3-year research plan, we aim to develop two alternative strategies of variable selection for GLMs with missing values by drawing on a Bayesian framework. The first strategy, which we call "impute, then select" (ITS), involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets. The second strategy, "simultaneously impute and select" (SIAS), conducts Bayesian variable selection and missing data imputation simultaneously within one Markov Chain Monte Carlo (MCMC) process. ITS and SIAS offer two generic frameworks within which various Bayesian variable selection algorithms and missing data imputation algorithms can be implemented. Both strategies will be developed, evaluated, and implemented in an R library for normal regression, binomial regression, and other GLMs with categorical and/or continuous explanatory variables. Practical data sets from several studies on substance abuse and childhood autism will be used to assess the effectiveness and flexibility of the proposed strategies. Development of these procedures and contribution of the software to statisticians and researchers in medical research would significantly improve the quality of evaluation of important and clinically relevant data.
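The "impute, then select" idea described in the abstract can be illustrated with a small sketch. This is not the applicants' algorithm or their R library; it is a simplified Python illustration under several assumptions: a Gaussian linear model stands in for a general GLM, only one predictor column has missing values, imputation is plain stochastic regression (regression parameters are estimated once rather than drawn, so it is not a fully proper multiple imputation), posterior model probabilities are approximated with BIC under a uniform prior over models, and inclusion probabilities are simply averaged across imputations. The function names `impute_column` and `inclusion_probs` are hypothetical.

```python
import itertools
import numpy as np


def impute_column(X, y, j, rng):
    """Fill NaNs in column j by stochastic regression on the other columns and y.

    Assumption: all other columns are fully observed.
    """
    miss = np.isnan(X[:, j])
    others = np.delete(np.arange(X.shape[1]), j)
    Z = np.column_stack([np.ones(len(y)), X[:, others], y])
    coef, *_ = np.linalg.lstsq(Z[~miss], X[~miss, j], rcond=None)
    resid = X[~miss, j] - Z[~miss] @ coef
    sigma = resid.std(ddof=Z.shape[1])
    X_imp = X.copy()
    # Predicted value plus random noise, so imputations differ between draws.
    X_imp[miss, j] = Z[miss] @ coef + rng.normal(0.0, sigma, miss.sum())
    return X_imp


def inclusion_probs(X, y):
    """Approximate posterior inclusion probabilities by enumerating all subsets.

    Model weights are proportional to exp(-BIC / 2), i.e. a BIC approximation
    with a uniform prior over models (an assumption of this sketch).
    """
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            D = np.column_stack([np.ones(n), X[:, list(subset)]])
            coef, *_ = np.linalg.lstsq(D, y, rcond=None)
            rss = np.sum((y - D @ coef) ** 2)
            bics.append(n * np.log(rss / n) + D.shape[1] * np.log(n))
            models.append(subset)
    bics = np.array(bics)
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()
    pip = np.zeros(p)
    for subset, wt in zip(models, w):
        pip[list(subset)] += wt
    return pip


# Synthetic data: predictors 0 and 2 matter, predictors 1 and 3 do not.
rng = np.random.default_rng(0)
n, p = 300, 4
X_full = rng.normal(size=(n, p))
beta = np.array([2.0, 0.0, -1.5, 0.0])
y = X_full @ beta + rng.normal(size=n)

# Make about 20% of column 0 missing completely at random.
X_obs = X_full.copy()
X_obs[rng.random(n) < 0.2, 0] = np.nan

# Impute M times, select on each imputed data set, then average.
M = 5
pips = np.array(
    [inclusion_probs(impute_column(X_obs, y, 0, rng), y) for _ in range(M)]
)
avg_pip = pips.mean(axis=0)
print(np.round(avg_pip, 3))
```

Averaging inclusion probabilities sidesteps the problem the abstract mentions, where each imputed data set yields a different selected model. A simultaneous strategy such as SIAS would instead update the missing values and the model indicators within the same MCMC loop, which this sketch does not attempt.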
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by XIAOWEI YANG
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8317303
- Fiscal year: 2011
- Funding amount: $95,400
- Project category:
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8471550
- Fiscal year: 2011
- Funding amount: $95,400
- Project category:
Bayesian Variable Selection in Generalized Linear Models with Missing Variables
- Grant number: 8194802
- Fiscal year: 2011
- Funding amount: $95,400
- Project category:
iPhone-based Real-time Data Solution for Drug Abuse and Other Medical Research
- Grant number: 7672825
- Fiscal year: 2009
- Funding amount: $95,400
- Project category:
Transition Model for Incomplete Longitudinal Binary Data
- Grant number: 6676189
- Fiscal year: 2003
- Funding amount: $95,400
- Project category:
DEVELOPMENT OF AN AUTOMATED NEURAL SPIKE DISCRIMINATOR
- Grant number: 3504570
- Fiscal year: 1991
- Funding amount: $95,400
- Project category:
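Similar grants from the National Natural Science Foundation of China
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: 220,000 CNY
- Project category: Young Scientists Fund
Research on optimized design of address mapping tables and memory access optimization for high-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: 230,000 CNY
- Project category: Young Scientists Fund
Research on memory safety defense techniques for objects targeted by memory attacks
- Grant number: 61802432
- Approval year: 2018
- Funding amount: 250,000 CNY
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: 640,000 CNY
- Project category: General Program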
Similar overseas grants
Bayesian Statistical Learning for Robust and Generalizable Causal Inferences in Alzheimer Disease and Related Disorders Research
- Grant number: 10590913
- Fiscal year: 2023
- Funding amount: $95,400
- Project category:
Deep Learning Based Natural Language Processing Markers of Anxiety and Depression
- Grant number: 10723819
- Fiscal year: 2023
- Funding amount: $95,400
- Project category:
Predicting firearm suicide in military veterans outside the VA health system using linked civilian electronic health record data
- Grant number: 10655968
- Fiscal year: 2023
- Funding amount: $95,400
- Project category:
Fair risk profiles and predictive models for outcomes of obstructive sleep apnea through electronic medical record data
- Grant number: 10678108
- Fiscal year: 2023
- Funding amount: $95,400
- Project category:
Mining minority enriched AllofUs data for innovative ethnic specific risk prediction modeling
- Grant number: 10798514
- Fiscal year: 2023
- Funding amount: $95,400
- Project category: