Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
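Basic information
- Grant number: RGPIN-2018-04558
- Principal investigator:
- Amount: $26,200
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract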
Discrete data in the form of counts or proportions arise in many fields of study, such as epidemiology, biostatistics, medical and public health sciences, environmental studies and social sciences. These data often exhibit over-dispersion (the variance is larger than a simple model, such as the binomial or the Poisson model, predicts) and zero-inflation (there are more zero counts than a simple model predicts).
Regression analysis of discrete data can be further complicated by missing values in the response variable and/or in the explanatory variables (covariates). If the missingness does not depend on the data, observed or unobserved, the data are called missing completely at random (MCAR). If the missing-data mechanism depends only on observed data, the data are missing at random (MAR). MAR is also known as ignorable missingness: in this case the missing-data mechanism can be ignored. If the missing-data mechanism depends on both observed and unobserved data, that is, failure to observe a value depends on the value that would have been observed, the data are called missing not at random (MNAR), in which case the missingness is nonignorable.
Longitudinal data (count, binary, continuous or survival) are frequently encountered in the same subject-matter areas. Longitudinal studies are characterized by observing the same variables repeatedly over a period of time. Usually the subjects are assumed to be independent, while the observations collected on the same subject are correlated. Further, model selection (selecting the regression variables that contribute most) in large (big) data sets with many explanatory variables is important, because in practice results from a simple model are much easier to interpret.
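The definitions of over-dispersion and zero-inflation given above can be checked numerically. The following is a minimal sketch, not part of the funded work: it uses only the Python standard library, and the function names, the simulation parameters and the zero-inflated negative binomial generator are illustrative assumptions. It compares the variance-to-mean ratio and the share of zeros in simulated counts with what a Poisson model with the same mean would predict.

```python
import math
import random
from statistics import fmean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson counts."""
    return variance(counts) / fmean(counts)


def zero_excess(counts):
    """Observed share of zeros minus the share predicted by a Poisson model
    with the same mean, which is exp(-mean)."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - math.exp(-fmean(counts))


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method
    (adequate for the small rates used here)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def zinb_draw(rng, mean, shape, zero_prob):
    """Hypothetical zero-inflated negative binomial generator: a structural
    zero with probability zero_prob, otherwise a Poisson count whose rate is
    gamma-distributed with the given mean (a gamma-Poisson mixture)."""
    if rng.random() < zero_prob:
        return 0
    return poisson_draw(rng, rng.gammavariate(shape, mean / shape))


rng = random.Random(2022)  # fixed seed so the illustration is repeatable
poisson_counts = [poisson_draw(rng, 3.0) for _ in range(5000)]
zinb_counts = [zinb_draw(rng, 3.0, 2.0, 0.3) for _ in range(5000)]

for label, counts in [("Poisson", poisson_counts),
                      ("zero-inflated NB", zinb_counts)]:
    print(f"{label}: dispersion index = {dispersion_index(counts):.2f}, "
          f"zero excess = {zero_excess(counts):+.3f}")
```

A dispersion index well above 1 suggests over-dispersion, and a clearly positive zero excess suggests zero-inflation. Both are informal diagnostics and do not replace a formal test or a fitted model.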
In this research I will develop estimation procedures for discrete data regression models (involving over-dispersion, zero-inflation, missing responses and measurement errors in covariates), model selection procedures, and small-sample bias corrections of parameter estimates, in longitudinal settings or otherwise.
In many applied fields it is sometimes necessary to compare the effectiveness of one procedure with that of another (two drugs, two teaching methods, two fertilizers, etc.). For example, under two biologically different conditions we are often interested in identifying differentially expressed genes. The assumption of equal variances in the two groups is often violated for many genes, and a large number of genes must be filtered or ranked. In these cases exact tests are unavailable. In this research I plan to develop approximate procedures and compare them with existing procedures under different assumptions regarding the data distribution (normal, negative binomial, beta-binomial, Weibull, gamma, etc.).
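For normally distributed data, comparing two means without assuming equal variances is the classical Behrens-Fisher problem. One widely used existing approximate procedure is Welch's t-test with Welch-Satterthwaite degrees of freedom. The sketch below illustrates that standard approximation only; it is not the procedure proposed in this project, and the function name and the sample values are hypothetical.

```python
import math
from statistics import fmean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for comparing
    two means without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (fmean(sample_a) - fmean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical measurements for two groups with visibly different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The statistic is referred to a Student t distribution with `df` degrees of freedom. The p-value is left out here because the Python standard library has no t distribution; a statistics package would normally supply it.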
Project outputs
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Paul, Sudhir
Constitutive Production of Catalytic Antibodies to a Staphylococcus aureus Virulence Factor and Effect of Infection
- DOI: 10.1074/jbc.m111.330043
- Published: 2012-03-23
- Journal:
- Impact factor: 4.8
- Authors: Brown, Eric L.; Nishiyama, Yasuhiro; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Catalytic immunoglobulin gene delivery in a mouse model of Alzheimer's disease: prophylactic and therapeutic applications.
- DOI: 10.1007/s12035-014-8691-z
- Published: 2015-02
- Journal:
- Impact factor: 5.1
- Authors: Kou, Jinghong; Yang, Junling; Lim, Jeong-Eun; Pattanayak, Abhinandan; Song, Min; Planque, Stephanie; Paul, Sudhir; Fukuchi, Ken-ichiro
- Corresponding author: Fukuchi, Ken-ichiro
A covalent HIV vaccine: is there hope for the future?
- DOI: 10.2217/17460794.4.1.7
- Published: 2009-01-01
- Journal:
- Impact factor: 3.1
- Authors: Paul, Sudhir; Planque, Stephanie A.; Hanson, Carl V.
- Corresponding author: Hanson, Carl V.
The generalized linear model and extensions: a review and some biological and environmental applications
- DOI: 10.1002/env.849
- Published: 2007-06-01
- Journal:
- Impact factor: 1.7
- Authors: Paul, Sudhir; Saha, Krishna K.
- Corresponding author: Saha, Krishna K.
Catalytic antibodies to amyloid β peptide in defense against Alzheimer disease
- DOI: 10.1016/j.autrev.2008.03.004
- Published: 2008-05-01
- Journal:
- Impact factor: 13.6
- Authors: Taguchi, Hiroaki; Planque, Stephanie; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Other grants held by Paul, Sudhir
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2021
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2020
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2019
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2018
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2017
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2016
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2015
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2014
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2013
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
Generalized linear models with zero-inflation and/or over-dispersion with covariate measurement errors, methods for longitudinal and clustered data and finite mixture models
- Grant number: 8593-2008
- Fiscal year: 2012
- Funding amount: $26,200
- Project category: Discovery Grants Program - Individual
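Similar grants from the National Natural Science Foundation of China (NSFC)
Multiply robust estimation for longitudinal data under missing at random
- Grant number: 12361057
- Approval year: 2023
- Funding amount: CNY 270,000
- Project category: Regional Science Fund Project
Mixture growth curve models for heterogeneous longitudinal data: joint modeling of mean and covariance
- Grant number: 12301362
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund Project
Construction and evaluation of a precision-management decision model for VTE in critically ill patients driven by longitudinal dynamic data fusion
- Grant number: 72304221
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund Project
Building a neural network prediction model for subclinical atherosclerosis based on longitudinal nonlinear data
- Grant number: 82304244
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund Project
Mediation analysis methods for longitudinal microbiome data based on recursive resampling
- Grant number: 82373681
- Approval year: 2023
- Funding amount: CNY 490,000
- Project category: General Program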
Similar overseas grants
Peripheral Artery Disease: Long-term Survival & Outcomes Study (PEARLS)
- Grant number: 10734991
- Fiscal year: 2023
- Funding amount: $26,200
- Project category:
Multi-Dimensional Religiosity and Pregnancy-Related Behaviors during the Transition to Adulthood
- Grant number: 10649080
- Fiscal year: 2023
- Funding amount: $26,200
- Project category:
Analysis of Alzheimer's disease studies that feature truncated or interval-censored covariates
- Grant number: 10725225
- Fiscal year: 2023
- Funding amount: $26,200
- Project category:
Early Developmental Determinants and Pathways in Down syndrome
- Grant number: 10882081
- Fiscal year: 2023
- Funding amount: $26,200
- Project category:
Developing TranStat: A user-friendly R package for the analysis of infectious disease transmission and control among close contacts
- Grant number: 10703508
- Fiscal year: 2022
- Funding amount: $26,200
- Project category: