Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
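Basic information
- Grant number: RGPIN-2018-04558
- Principal investigator:
- Amount: $13,100
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2019
- Funding country: Canada
- Period: 2019-01-01 to 2020-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract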
Discrete data in the form of counts or proportions arise in many fields of study, such as epidemiology, biostatistics, medical and public health sciences, environmental studies and social sciences. These data often exhibit over-dispersion (the variance is larger than a simple model, such as the binomial or the Poisson model, predicts) and zero-inflation (more zero counts than a simple model predicts).
Regression analysis of discrete data can be further complicated by missing values in the response variable and/or in the explanatory variables (covariates). If the missingness depends on neither the observed nor the unobserved data, the data are called missing completely at random (MCAR). If the missing data mechanism depends only on observed data, the data are missing at random (MAR). MAR is also known as ignorable missingness; that is, in this case the missing data mechanism can be ignored. If the missing data mechanism depends on both observed and unobserved data, that is, failure to observe a value depends on the value that would have been observed, the data are called missing not at random (MNAR), in which case the missingness is nonignorable.
Longitudinal data (count/binary/continuous/survival) are frequently encountered in many subject-matter areas such as epidemiology, biostatistics, medical and public health sciences, environmental studies and social sciences. Longitudinal studies are characterized by observing the same variables repeatedly over a period of time. Usually the subjects are assumed to be independent, while the observations collected on the same subject are correlated.
Further, model selection procedures (selecting the regression variables that contribute most) are important in large (big) data sets with many explanatory variables, because in practice results from a simpler model are much easier to interpret.
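As an illustration of the over-dispersion and zero-inflation described above, the following is a minimal sketch, not part of the funded work, that compares simulated counts against a plain Poisson model. The function names, the zero-inflated Poisson simulator and the parameter values (`lam = 3.0`, `pi_zero = 0.3`) are assumptions chosen for the example.

```python
import math
import random
import statistics


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 when a Poisson model fits."""
    return statistics.variance(counts) / statistics.fmean(counts)


def excess_zero_fraction(counts):
    """Observed fraction of zeros minus exp(-mean), the zero
    probability of a Poisson model with the same mean."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - math.exp(-statistics.fmean(counts))


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method
    (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def zero_inflated_poisson(rng, lam, pi_zero, n):
    """Each value is a structural zero with probability pi_zero,
    otherwise an ordinary Poisson(lam) draw."""
    return [0 if rng.random() < pi_zero else poisson_draw(rng, lam)
            for _ in range(n)]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
plain = [poisson_draw(rng, 3.0) for _ in range(5000)]
inflated = zero_inflated_poisson(rng, 3.0, 0.3, 5000)

for name, data in (("Poisson", plain), ("zero-inflated", inflated)):
    print(name,
          "dispersion index:", round(dispersion_index(data), 2),
          "excess zeros:", round(excess_zero_fraction(data), 3))
```

For the plain Poisson sample both diagnostics should sit near their null values (1 and 0), while the zero-inflated sample should show a dispersion index well above 1 and a clearly positive excess of zeros.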
In this research I will develop estimation procedures for discrete data regression models (involving over-dispersion, zero-inflation, missing responses, and measurement errors in covariates), model selection procedures, and small-sample bias corrections of parameter estimates, in longitudinal and other settings.
In many applied fields it is sometimes necessary to compare the effectiveness of one procedure with another (two drugs, two teaching methods, two fertilizers, etc.). For example, under two biologically different conditions we are often interested in identifying differentially expressed genes. The assumption of equal variances in the two groups is often violated for many genes when a large number of them must be filtered or ranked. In these cases exact tests are unavailable. In this research I plan to develop approximate procedures and compare them with existing procedures under different assumptions regarding the data distribution (normal, negative binomial, beta-binomial, Weibull, gamma, etc.).
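For context on the two-group comparison with unequal variances (the Behrens-Fisher problem), one widely used existing approximate procedure for normal data is Welch's t-test with Welch-Satterthwaite degrees of freedom. The sketch below shows that standard approximation only; it is not the procedure this project proposes, and the function name `welch_t` and the sample values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two
    samples whose variances are not assumed to be equal."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    mean_diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    t_stat = mean_diff / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical expression values for one gene under two conditions.
condition_a = [1.0, 2.0, 3.0, 4.0, 5.0]
condition_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_stat, df = welch_t(condition_a, condition_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The statistic is referred to a t distribution with the (generally non-integer) degrees of freedom returned here; when the two groups have equal sizes and equal sample variances, the approximation reduces to the pooled value 2(n - 1).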
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Paul, Sudhir
Constitutive Production of Catalytic Antibodies to a Staphylococcus aureus Virulence Factor and Effect of Infection
- DOI: 10.1074/jbc.m111.330043
- Published: 2012-03-23
- Journal:
- Impact factor: 4.8
- Authors: Brown, Eric L.; Nishiyama, Yasuhiro; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Catalytic immunoglobulin gene delivery in a mouse model of Alzheimer's disease: prophylactic and therapeutic applications.
- DOI: 10.1007/s12035-014-8691-z
- Published: 2015-02
- Journal:
- Impact factor: 5.1
- Authors: Kou, Jinghong; Yang, Junling; Lim, Jeong-Eun; Pattanayak, Abhinandan; Song, Min; Planque, Stephanie; Paul, Sudhir; Fukuchi, Ken-ichiro
- Corresponding author: Fukuchi, Ken-ichiro
A covalent HIV vaccine: is there hope for the future?
- DOI: 10.2217/17460794.4.1.7
- Published: 2009-01-01
- Journal:
- Impact factor: 3.1
- Authors: Paul, Sudhir; Planque, Stephanie A.; Hanson, Carl V.
- Corresponding author: Hanson, Carl V.
The generalized linear model and extensions: a review and some biological and environmental applications
- DOI: 10.1002/env.849
- Published: 2007-06-01
- Journal:
- Impact factor: 1.7
- Authors: Paul, Sudhir; Saha, Krishna K.
- Corresponding author: Saha, Krishna K.
Estimation for zero-inflated beta-binomial regression model with missing response data
- DOI: 10.1002/sim.7845
- Published: 2018-11-20
- Journal:
- Impact factor: 2
- Authors: Luo, Rong; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Other grants by Paul, Sudhir
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2022
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2021
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2020
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2018
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2017
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2016
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2015
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2014
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2013
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
Generalized linear models with zero-inflation and/or over-dispersion with covariate measurement errors, methods for longitudinal and clustered data and finite mixture models
- Grant number: 8593-2008
- Fiscal year: 2012
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual
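Similar grants from the National Natural Science Foundation of China (NSFC)
Multiply robust estimation for longitudinal data under missing at random
- Grant number: 12361057
- Year approved: 2023
- Funding amount: 270,000 CNY
- Program: Regional Science Fund Project
Mixture growth curve models for heterogeneous longitudinal data: joint modeling of mean and covariance
- Grant number: 12301362
- Year approved: 2023
- Funding amount: 300,000 CNY
- Program: Young Scientists Fund Project
Construction and evaluation of a precision management decision model for VTE in critically ill patients, driven by longitudinal dynamic data fusion
- Grant number: 72304221
- Year approved: 2023
- Funding amount: 300,000 CNY
- Program: Young Scientists Fund Project
Building a neural network prediction model for subclinical atherosclerosis from longitudinal nonlinear data
- Grant number: 82304244
- Year approved: 2023
- Funding amount: 300,000 CNY
- Program: Young Scientists Fund Project
Mediation analysis methods for longitudinal microbiome data based on recursive resampling
- Grant number: 82373681
- Year approved: 2023
- Funding amount: 490,000 CNY
- Program: General Program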
Similar overseas grants
Peripheral Artery Disease: Long-term Survival & Outcomes Study (PEARLS)
- Grant number: 10734991
- Fiscal year: 2023
- Funding amount: $13,100
- Program:
Multi-Dimensional Religiosity and Pregnancy-Related Behaviors during the Transition to Adulthood
- Grant number: 10649080
- Fiscal year: 2023
- Funding amount: $13,100
- Program:
Analysis of Alzheimer's disease studies that feature truncated or interval-censored covariates
- Grant number: 10725225
- Fiscal year: 2023
- Funding amount: $13,100
- Program:
Early Developmental Determinants and Pathways in Down syndrome
- Grant number: 10882081
- Fiscal year: 2023
- Funding amount: $13,100
- Program:
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2022
- Funding amount: $13,100
- Program: Discovery Grants Program - Individual