Single-cell high-throughput chromatin conformation capture methodologies (scHi-C) enable profiling of long-range genomic interactions. However, data from these technologies are prone to technical noise and biases that hinder downstream analysis. We develop a normalization approach, BandNorm, and a deep generative modeling framework, scVI-3D, to account for scHi-C-specific biases. In benchmarking experiments, BandNorm achieves leading performance in cell-type separation, identification of interacting loci, and recovery of cell-type relationships while remaining time- and memory-efficient, whereas scVI-3D shows advantages for rare cell types and in high-sparsity scenarios. Applying BandNorm together with gene-associating domain analysis identifies sub-cell types that are validated by scRNA-seq.
The online version contains supplementary material available at 10.1186/s13059-022-02774-z.