Deep tensor genomic imputation
Basic information
- Grant number: 10557916
- Principal investigator:
- Amount: $383,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-02-01 to 2025-01-31
- Project status: Ongoing (not yet concluded)
- Source:
- Keywords: 3-Dimensional; Architecture; Avocado; Awareness; Binding; Biochemical; Biological Assay; Cell Line; Cells; Cellular Assay; Chromatin; Chromatin Interaction Analysis by Paired-End Tag Sequencing; Collection; Complex; Computer software; Couples; DNA; DNA Methylation; DNA Sequence; DNA sequencing; Data; Data Set; Disease; Epigenetic Process; Evaluation; Future; Gene Expression; Genetic Transcription; Genetic Variation; Genome; Genomics; Genotype-Tissue Expression Project; Goals; Health; Hi-C; High-Throughput Nucleotide Sequencing; Human; Individual; Internet; Investigation; Joints; Learning; Machine Learning; Measurement; Measures; Methodology; Methods; Methylation; Modeling; Molecular; Pattern; Positioning Attribute; Process; Property; Regulatory Element; Resolution; Resources; Sampling; Scientist; System; Techniques; Technology; Tissues; Training; United States National Institutes of Health; Untranslated RNA; Validation; Variant; Work; biological systems; cell type; cost; data standards; deep neural network; experimental study; genetic manipulation; genome-wide; genomic data; genomic locus; histone modification; improved; in silico; invention; large datasets; model organism; next generation; open source; predictive modeling; syntax; transcription factor; web portal
Project Summary/Abstract
High-throughput sequencing assays allow scientists to measure biochemical properties like transcription factor
binding, histone modifications, and gene expression in nearly any cell line or primary tissue (“biosample”).
Unfortunately, measuring all possible biochemical properties in every biosample is infeasible, both because of
limited sample availability and because the cost would be prohibitive. We have previously developed a
state-of-the-art imputation method, called Avocado, that can fill in the holes in such data sets. Avocado couples tensor
factorization with a deep neural network. The method is scalable to large data sets and provides more accurate
imputations than competing methods such as ChromImpute or PREDICTD. We have already applied Avocado
systematically to the NIH ENCODE data set and made the imputations publicly available via the ENCODE web
portal.
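As a rough illustration of what coupling tensor factorization with a neural network can look like, the following Python sketch gives each biosample, assay, and genomic position its own latent vector, concatenates the three vectors for a requested cell of the tensor, and passes the result through a small feed-forward network to produce one imputed value. The class name TinyImputer, the layer sizes, and the random initialization are assumptions made for illustration; training is omitted, and this is not the published Avocado architecture.

```python
import math
import random


def make_factors(n, dim, rng):
    """Return n vectors of length dim filled with small random values."""
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n)]


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class TinyImputer:
    """Hypothetical, untrained model: latent factors plus a small network."""

    def __init__(self, n_biosamples, n_assays, n_positions,
                 dim=4, hidden=8, seed=0):
        rng = random.Random(seed)
        # One latent vector per index along each axis of the 3D tensor.
        self.biosample = make_factors(n_biosamples, dim, rng)
        self.assay = make_factors(n_assays, dim, rng)
        self.position = make_factors(n_positions, dim, rng)
        # Network weights: (3 * dim) inputs -> hidden units -> 1 output.
        self.w1 = make_factors(hidden, 3 * dim, rng)
        self.b1 = [0.0] * hidden
        self.w2 = make_factors(1, hidden, rng)
        self.b2 = [0.0]

    def predict(self, b, a, p):
        """Impute the signal for biosample b, assay a, genomic position p."""
        x = self.biosample[b] + self.assay[a] + self.position[p]  # concatenation
        h = relu(dense(x, self.w1, self.b1))
        return dense(h, self.w2, self.b2)[0]


model = TinyImputer(n_biosamples=3, n_assays=2, n_positions=10)
value = model.predict(b=1, a=0, p=7)
print(value)
```

In a trained model of this shape, the latent vectors and the network weights would be fitted jointly to the observed experiments, so that any unobserved (biosample, assay, position) combination can be queried in the same way.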
Here, we propose to extend Avocado in four important ways. First, we will extend Avocado to handle single-cell
data sets, thereby effectively turning each single-cell experiment into an in silico co-assay that measures multiple
properties of each cell in parallel. Second, we will extend Avocado to work with data such as Hi-C, which measures
three-dimensional properties of DNA. The extension involves converting Avocado's 3D tensor (biosample × assay ×
genomic position) to a 4D tensor with two genomic position axes. This extension will apply to a wide variety
of data types, including various types of Hi-C data, SPRITE, GAM, ChIA-PET and PLAC-seq. Third, we will
enhance Avocado to use variant-aware genomic sequence to enable high-resolution imputation of regulatory
profiles. Finally, we will leverage the imputed data to infer cis-regulatory sequence annotations and the molecular
impact of regulatory non-coding variants in one of the most comprehensive collections of cellular contexts.
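The move from a 3D to a 4D tensor described above can be pictured as adding a second genomic position index to each query. The Python sketch below builds the network input for both cases from latent vectors. The function names, the shared position table for both axes, and the sorting of the position pair (so that a symmetric contact map yields the same input for (p, q) and (q, p)) are assumptions for illustration, not details taken from the proposal.

```python
import random


def make_factors(n, dim, rng):
    """Return n vectors of length dim filled with small random values."""
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM = 4
biosample = make_factors(3, DIM, rng)   # 3 hypothetical biosamples
assay = make_factors(2, DIM, rng)       # 2 hypothetical assays
position = make_factors(10, DIM, rng)   # 10 hypothetical genomic bins


def features_3d(b, a, p):
    """Input for one cell of the (biosample x assay x position) tensor."""
    return biosample[b] + assay[a] + position[p]


def features_4d(b, a, p, q):
    """Input for one cell of the 4D tensor with two position axes.

    Assumption: both axes share one position table, and the pair is
    sorted so that (p, q) and (q, p) map to the same input.
    """
    lo, hi = sorted((p, q))
    return biosample[b] + assay[a] + position[lo] + position[hi]


x3 = features_3d(1, 0, 7)
x4 = features_4d(1, 0, 7, 2)
print(len(x3), len(x4))
```

The only structural change is the extra position vector in the input; the biosample and assay factors are reused unchanged, which is what would let 1D assays and pairwise contact data share a single set of latent representations.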
All of the software produced by this project will be open source, and all of the imputed data and latent
factorizations will be made publicly available via the web portals associated with the NIH 4D Nucleome and
ENCODE Consortia, providing a valuable public resource for users of these data sets.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by William Stafford Noble
Prospects & Overviews: Multiple dimensions of epigenetic gene regulation in the malaria parasite Plasmodium falciparum
- DOI:
- Published: 2024-09-14
- Journal:
- Impact factor: 0
- Authors: F. Ay; E. Bunnik; N. Varoquaux; Jean; William Stafford Noble; K. Roch
- Corresponding author: K. Roch
Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression (Supplemental Material)
- DOI:
- Published: 2024-09-14
- Journal:
- Impact factor: 0
- Authors: F. Ay; E. Bunnik; N. Varoquaux; Sebastiaan Bol; J. Prudhomme; Jean; William Stafford Noble; K. Roch
- Corresponding author: K. Roch
Classification of clear-cell sarcoma as a subtype of melanoma by genomic profiling.
- DOI: 10.1200/jco.2003.10.108
- Published: 2003-05-01
- Journal:
- Impact factor: 0
- Authors: N. Segal; P. Pavlidis; William Stafford Noble; C. Antonescu; A. Viale; U. Wesley; K. Busam; H. Gallardo; D. Desantis; M. Brennan; C. Cordon; J. Wolchok; A. Houghton
- Corresponding author: A. Houghton
Exploring Gene Expression Data with Class Scores
- DOI: 10.1142/9789812799623_0044
- Published: 2001-12-01
- Journal:
- Impact factor: 0
- Authors: P. Pavlidis; Darrin P. Lewis; William Stafford Noble
- Corresponding author: William Stafford Noble
A flexible workflow for building spectral libraries from narrow window data independent acquisition mass spectrometry data
- DOI: 10.1101/2021.11.22.469568
- Published: 2021-11-22
- Journal:
- Impact factor: 0
- Authors: Lilian R. Heil; William E. Fondrie; Christopher D. McGann; Alexander J. Federation; William Stafford Noble; M. MacCoss; U. Keich
- Corresponding author: U. Keich
Other grants of William Stafford Noble
Optimization and joint modeling for peptide detection by tandem mass spectrometry
- Grant number: 9214942
- Fiscal year: 2017
- Funding amount: $383,800
- Project category:
Project 2: UW-CNOF Data Analysis and Modeling
- Grant number: 9021413
- Fiscal year: 2015
- Funding amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9916567
- Fiscal year: 2015
- Funding amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9353379
- Fiscal year: 2015
- Funding amount: $383,800
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9983850
- Fiscal year: 2015
- Funding amount: $383,800
- Project category:
Machine learning methods to impute and annotate epigenomic maps
- Grant number: 8925082
- Fiscal year: 2014
- Funding amount: $383,800
- Project category:
Machine learning methods to impute and annotate epigenomic maps
- Grant number: 8814095
- Fiscal year: 2014
- Funding amount: $383,800
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8840551
- Fiscal year: 2013
- Funding amount: $383,800
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8642168
- Fiscal year: 2013
- Funding amount: $383,800
- Project category:
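Similar NSFC (National Natural Science Foundation of China) grants
Research on the spatiotemporal elements and expression system of "Sharing Architecture"
- Grant number:
- Approval year: 2019
- Funding amount: 630,000 CNY
- Project category: General Program
Research on renewal design strategies for ordinary buildings based on the everyday efficiency of urban space
- Grant number: 51778419
- Approval year: 2017
- Funding amount: 610,000 CNY
- Project category: General Program
Research on holistic architecture for livable environments
- Grant number: 51278108
- Approval year: 2012
- Funding amount: 680,000 CNY
- Project category: General Program
The formation and evolution of planetary systems in dense star clusters
- Grant number: 11043007
- Approval year: 2010
- Funding amount: 100,000 CNY
- Project category: Special Fund Project
Application of novel vanadium oxide nano-assembled structures in smart energy saving
- Grant number: 20801051
- Approval year: 2008
- Funding amount: 180,000 CNY
- Project category: Young Scientists Fund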
Similar overseas grants
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $383,800
- Project category: Standard Grant
Collaborative Research: Merging Human Creativity with Computational Intelligence for the Design of Next Generation Responsive Architecture
- Grant number: 2329759
- Fiscal year: 2024
- Funding amount: $383,800
- Project category: Standard Grant
Collaborative Research: Merging Human Creativity with Computational Intelligence for the Design of Next Generation Responsive Architecture
- Grant number: 2329760
- Fiscal year: 2024
- Funding amount: $383,800
- Project category: Standard Grant
CAREER: Creating Tough, Sustainable Materials Using Fracture Size-Effects and Architecture
- Grant number: 2339197
- Fiscal year: 2024
- Funding amount: $383,800
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $383,800
- Project category: Continuing Grant