Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
Basic Information
- Grant number: 10707494
- Principal investigator:
- Amount: $297,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-21 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords: Address; Bayesian Modeling; Biological; Cell Differentiation process; Cells; Comparative Study; Data; Dependence; Development; Etiology; Feedback; Gene Expression Regulation; Genes; Intervention; Joints; Knowledge; Measures; Methods; Modeling; Molecular; National Institute of General Medical Sciences; Nature; Neighborhoods; Pathologic; Predisposition; Prevention; Procedures; Regulator Genes; Research; Sample Size; Sampling; Technology; Testing; Time; Tissues; Translating; Uncertainty; Work; causal variant; cell type; differential expression; disease diagnosis; experimental group; gene regulatory network; improved; network models; single-cell RNA sequencing; tool; transcriptome sequencing; treatment group
Project Abstract
Project Description
DMS/NIGMS 2: Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
A Significance
A.1 Importance of the Problem to Be Addressed
Single-cell RNA-sequencing (scRNA-seq) technologies have facilitated biological discoveries that were
impossible with bulk RNA-seq, such as the discovery of new gene regulatory activities and cell types at
the single-cell level. However, to translate the fundamental biological knowledge advanced by scRNA-seq
into improved disease diagnosis, treatment, and prevention, new methods are required to comparatively
study the molecular differences between normal and pathological cells/tissues, and between control and
case/treatment groups. Although identification of differentially expressed genes across two sample groups
has been extensively studied, the vast majority of existing methods for identifying gene regulatory
networks (GRNs) and cell types have so far focused on scRNA-seq data generated under a single
experimental condition. In principle, these methods can be applied to one experimental condition at a time,
after which post hoc comparisons can be made to find the differences caused by experimental
interventions. However, compared to joint modeling approaches, this two-step procedure is considered less
efficient and more susceptible to false discoveries because uncertainty is not properly propagated from the
first step to the second. Moreover, most scRNA-seq network models are correlative in nature and do not
infer causal gene regulatory relationships. There is, therefore, a critical need to develop new models for
identifying the effects of experimental interventions on causal gene regulation and cell composition by jointly
modeling scRNA-seq data across experimental groups. In the absence of such tools, mechanistically
understanding gene regulation and cell differentiation, and fully realizing the translational value of scRNA-seq
studies, will likely remain difficult.
A.2 Rigor of Prior Research
Aim 1. Many existing scRNA-seq network approaches adapt standard association measures, e.g. Pearson
correlation [1] and mutual information [2], to zero-inflated scRNA-seq data. A common limitation
of these methods is that they only quantify marginal dependencies, which are susceptible to spurious indirect
associations [3]. Graphical models, which deal with conditional associations, are powerful alternatives to
the marginal association measures. Numerous methods have been proposed for general purposes [4, 5],
including developments for non-Gaussian data [6–9]. Specifically for scRNA-seq data, two undirected
graphical models, including Co-I Cai's work [10, 11], were recently proposed based on neighborhood
selection; however, they do not infer causal gene regulation. To identify causal relationships, several alternative
methods [12, 13] were developed. However, these methods either ignore the count nature of scRNA-seq
data, require a known pseudotime (which is rarely known in real scRNA-seq data), or do not theoretically
investigate causal identifiability for cross-sectional observations. For differential networks, many approaches
[14–18], including the PI's prior work [19], have been developed for bulk RNA-seq data and have shown great
advantages of joint analyses over independent analyses. However, there exist far fewer differential
network methods for scRNA-seq data, e.g., PT [20] and scdNet [21]. The common limitation of PT and
scdNet is that they only consider marginal dependence (and are hence susceptible to false discoveries) and do not
discover causality. Our preliminary results (§C.1) demonstrate that the proposed Bayesian
network model is capable of identifying causal gene regulatory relationships in cross-sectional scRNA-seq
data and often outperforms the state-of-the-art alternative methods.
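The contrast between marginal and conditional dependence drawn above can be illustrated with a minimal sketch. It is not the proposed Bayesian network model: it assumes a hypothetical three-gene chain (gene_a regulates gene_b, which regulates gene_c) with Gaussian noise rather than zero-inflated counts, and the gene names and the coefficient 0.8 are invented for illustration. Under these assumptions, the marginal correlation between gene_a and gene_c is clearly nonzero even though there is no direct edge, while the partial correlation given gene_b is close to zero.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of each pair of columns given all other columns,
    computed from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


rng = np.random.default_rng(0)
n_cells = 5000

# Hypothetical chain gene_a -> gene_b -> gene_c; there is no direct a -> c edge.
# Gaussian noise is an illustrative simplification, not a model of scRNA-seq counts.
gene_a = rng.normal(size=n_cells)
gene_b = 0.8 * gene_a + rng.normal(size=n_cells)
gene_c = 0.8 * gene_b + rng.normal(size=n_cells)
data = np.column_stack([gene_a, gene_b, gene_c])

marginal = np.corrcoef(data, rowvar=False)
partial = partial_correlations(data)

print("marginal corr(a, c):     ", round(float(marginal[0, 2]), 3))
print("partial corr(a, c | b):  ", round(float(partial[0, 2]), 3))
print("partial corr(a, b | c):  ", round(float(partial[0, 1]), 3))
```

A marginal measure would report an a-c association here; a conditional measure removes it. Neither quantity orients the edges, which is why conditional association alone still does not identify causal direction.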
Aim 2. Very few methods are available to construct cell-specific networks because it is difficult to estimate
networks with, in essence, a sample size of one. Recently, a hypothesis testing approach [22] was developed
to estimate cell-specific networks. The method makes approximate network inference for each cell based
on its neighbors. However, it only considers symmetric (undirected) marginal dependence, and therefore
cannot infer causal regulatory relationships and is susceptible to spurious associations. The PI's prior work
[23] addressed the "sample-size-one" problem in bulk RNA-seq data by assuming that the causal networks are
smooth functions of additional covariates. However, the method is not applicable without covariates and
does not allow feedback loops, a common motif in GRNs. Existing work [24, 25] including the PI's [19] has
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Yang Ni
Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
- Grant number: 10592720
- Fiscal year: 2022
- Amount: $297,900
- Project category:
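Similar National Natural Science Foundation of China (NSFC) grants
Dynamic causal modeling of EEG data and its Bayesian statistical inference
- Grant number: 12201158
- Approval year: 2022
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Velocity model building by full-waveform inversion based on variational inference in a Bayesian framework, and methods for uncertainty evaluation
- Grant number:
- Approval year: 2022
- Amount: CNY 560,000
- Project category: General Program
Safety risk reduction strategies for metro construction projects based on Bayesian network modeling
- Grant number:
- Approval year: 2022
- Amount: CNY 450,000
- Project category: General Program
Change-point detection based on nonparametric Bayesian methods: modeling and inference
- Grant number: 12201422
- Approval year: 2022
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Nonparametric Bayesian hybrid system modeling and estimation for extended targets with coupled maneuvering and shape evolution
- Grant number: 62271193
- Approval year: 2022
- Amount: CNY 540,000
- Project category: General Program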
Similar overseas grants
Characterizing the genetic etiology of delayed puberty with integrative genomic techniques
- Grant number: 10663605
- Fiscal year: 2023
- Amount: $297,900
- Project category:
A mega-analysis framework for delineating autism neurosubtypes
- Grant number: 10681965
- Fiscal year: 2023
- Amount: $297,900
- Project category:
Integrative analysis of whole genomes and transcriptomes from multiple cell types in rare disease patients
- Grant number: 10587683
- Fiscal year: 2023
- Amount: $297,900
- Project category:
Development of a multi-RNA signature in blood towards a rapid diagnostic test to robustly distinguish patients with acute myocardial infarction
- Grant number: 10603548
- Fiscal year: 2023
- Amount: $297,900
- Project category: