Machine learning methods to impute and annotate epigenomic maps
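Basic information
- Grant number: 8925082
- Principal investigator:
- Amount: $282,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2014
- Funding country: United States
- Project period: 2014-09-10 to 2017-08-31
- Project status: Completed
- Source: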
- Keywords: Accounting; Adult; Architecture; Big Data; Biochemical; Biological Assay; Biological Process; Cell Nucleus; Cells; ChIP-seq; Chromatin; Collection; Computer software; Computing Methodologies; DNA; DNase I hypersensitive sites sequencing; Data; Data Analyses; Data Set; Dependency; Disease; Elements; Evolution; Exhibits; Future; Generations; Genome; Genomics; Goals; Graph; Health; Human; Human Genome; In Vitro; Investigation; Joints; Label; Link; Machine Learning; Maps; Measures; Methodology; Methods; Modeling; Pattern; Phenotype; Positron-Emission Tomography; Process; Recording of previous events; Resources; Role; Scheme; System; Techniques; Time; Tissues; United States National Institutes of Health; Virtual Library; Work; assay development; base; cell type; computer based statistical methods; epigenome; epigenomics; fetal; functional genomics; genome annotation; histone modification; human disease; improved; insight; markov model; novel; pressure; programs; research study; three dimensional structure; transcriptome sequencing; virtual
Project abstract
DESCRIPTION (provided by applicant): The NIH Roadmap Epigenomics Program has produced reference epigenomic maps derived from a variety of human primary cells and tissues, including pluripotent cell types and in vitro differentiated forms, highly purified primary cells, and a range of fetal and adult tissues. The goal of the proposed project is to develop, validate, and apply unsupervised machine learning methods to the joint analysis of these epigenomic maps along with (1) data generated by the NIH ENCODE Consortium, (2) a variety of publicly available data sets that characterize the three-dimensional structure of DNA in the nucleus, and (3) information about evolutionary conservation, represented by cross-species DNA alignments. The first aim of the project will use data imputation methods to carry out virtual functional genomics experiments. The proposed method is based on techniques developed in the context of recommender systems, but is extended to model dependencies along the genomic axis. By simultaneously analyzing the pattern of biochemical activity across a range of cell types and assay types, the proposed imputation method will accurately predict the results of an assay that has not yet been carried out, such as ChIP-seq for a particular histone modification in a particular cell type. We will systematically apply this method to Roadmap Epigenomics and ENCODE data, filling in missing experiments in the matrix of cell types and assay types. The remaining three specific aims extend and apply our existing system for semi-automated genome annotation, Segway, which integrates a wide variety of functional genomics data into a human-interpretable labeling of genomic elements. These analyses will be performed on real data as well as the virtual experiments from Aim 1.
We propose a novel, graph-based regularization scheme and show how, using this approach, we can use Segway to perform integrated analysis of data across cell types and integrate 3D genome architecture information from assays such as Hi-C. We also propose a post-processing method to exploit patterns of evolutionary conservation to identify functionally important labels in the resulting annotations. The primary deliverables will include novel software for imputation and annotation, as well as publicly available sets of virtual experiments and genome annotations.
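The recommender-system idea behind Aim 1 can be illustrated with a toy example. The sketch below is not the applicant's method: it is plain low-rank matrix factorization fitted by alternating least squares on a single cell type × assay matrix, and it ignores the dependencies along the genomic axis that the abstract says the proposed method adds. The function name `impute_low_rank`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def impute_low_rank(signal, observed, rank=2, ridge=1e-3, n_iters=50, seed=0):
    """Fill unobserved entries of a (cell type x assay) matrix.

    Hypothetical helper: fits signal ~= cell_f @ assay_f.T on the observed
    entries only, by alternating ridge regressions (ALS), then uses the
    fitted product to fill in the entries where `observed` is False.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_assays = signal.shape
    # Positive random start; an assumption that suits nonnegative signal.
    cell_f = rng.random((n_cells, rank)) + 0.5
    assay_f = rng.random((n_assays, rank)) + 0.5
    penalty = ridge * np.eye(rank)

    for _ in range(n_iters):
        # Update each cell-type factor from the assays observed in that cell type.
        for i in range(n_cells):
            cols = observed[i]
            a = assay_f[cols]
            cell_f[i] = np.linalg.solve(a.T @ a + penalty, a.T @ signal[i, cols])
        # Update each assay factor from the cell types in which it was observed.
        for j in range(n_assays):
            rows = observed[:, j]
            c = cell_f[rows]
            assay_f[j] = np.linalg.solve(c.T @ c + penalty, c.T @ signal[rows, j])

    estimate = cell_f @ assay_f.T
    # Keep measured values; use the model only for the missing experiments.
    return np.where(observed, signal, estimate)


# Toy data (made up): 6 cell types x 5 assays with exact rank-1 structure.
cell_effect = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
assay_effect = np.array([1.0, 0.5, 2.0, 1.5, 3.0])
truth = np.outer(cell_effect, assay_effect)

# Pretend three experiments were never run.
observed = np.ones_like(truth, dtype=bool)
observed[[0, 2, 5], [0, 3, 4]] = False
signal = np.where(observed, truth, np.nan)  # NaN marks a missing experiment

imputed = impute_low_rank(signal, observed, rank=1)
print(np.round(imputed[~observed], 2))
```

Each row of the matrix plays the role of a "user" and each column the role of an "item", so a missing experiment is predicted from how that cell type behaves in other assays and how that assay behaves in other cell types. Real epigenomic tracks would add a third, genomic-position dimension, which this sketch leaves out.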
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by William Stafford Noble
Gut Microbial Protein Expression in Response to Dietary Patterns in a Controlled Feeding Study: A Metaproteomic Approach
- DOI: 10.3390/microorganisms8030379
- Published: 2020-03-01
- Journal:
- Impact factor: 4.5
- Authors: Sheng Pan; M. Hullar; L. Lai; Hong Peng; Damon H. May; William Stafford Noble; D. Raftery; Sandi L Navarro; M. Neuhouser; P. Lampe; J. Lampe; Ru Chen
- Corresponding author: Ru Chen
Classification of clear-cell sarcoma as a subtype of melanoma by genomic profiling.
- DOI: 10.1200/jco.2003.10.108
- Published: 2003-05-01
- Journal:
- Impact factor: 0
- Authors: N. Segal; P. Pavlidis; William Stafford Noble; C. Antonescu; A. Viale; U. Wesley; K. Busam; H. Gallardo; D. Desantis; M. Brennan; C. Cordon; J. Wolchok; A. Houghton
- Corresponding author: A. Houghton
Exploring Gene Expression Data with Class Scores
- DOI: 10.1142/9789812799623_0044
- Published: 2001-12-01
- Journal:
- Impact factor: 0
- Authors: P. Pavlidis; Darrin P. Lewis; William Stafford Noble
- Corresponding author: William Stafford Noble
Multiple dimensions of epigenetic gene regulation in the malaria parasite Plasmodium falciparum
- DOI:
- Published: 2024-09-14
- Journal:
- Impact factor: 0
- Authors: F. Ay; E. Bunnik; N. Varoquaux; Jean; William Stafford Noble; K. Roch
- Corresponding author: K. Roch
Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression (Supplemental Material)
- DOI:
- Published: 2024-09-14
- Journal:
- Impact factor: 0
- Authors: F. Ay; E. Bunnik; N. Varoquaux; Sebastiaan Bol; J. Prudhomme; Jean; William Stafford Noble; K. Roch
- Corresponding author: K. Roch
Other grants held by William Stafford Noble
Optimization and joint modeling for peptide detection by tandem mass spectrometry
- Grant number: 9214942
- Fiscal year: 2017
- Funding amount: $282,900
- Project category:
Project 2: UW-CNOF Data Analysis and Modeling
- Grant number: 9021413
- Fiscal year: 2015
- Funding amount: $282,900
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9916567
- Fiscal year: 2015
- Funding amount: $282,900
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9353379
- Fiscal year: 2015
- Funding amount: $282,900
- Project category:
University of Washington Center for Nuclear Organization and Function
- Grant number: 9983850
- Fiscal year: 2015
- Funding amount: $282,900
- Project category:
Machine learning methods to impute and annotate epigenomic maps
- Grant number: 8814095
- Fiscal year: 2014
- Funding amount: $282,900
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8840551
- Fiscal year: 2013
- Funding amount: $282,900
- Project category:
BIGDATA: DA: Interpreting massive genomic data sets via summarization
- Grant number: 8642168
- Fiscal year: 2013
- Funding amount: $282,900
- Project category:
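Similar grants from the National Natural Science Foundation of China (NSFC)
Deep learning-assisted design of surgical plans for adult spinal deformity based on dynamic information
- Grant number: 82372499
- Approval year: 2023
- Funding amount: CNY 490,000
- Project category: General Program
Role of SMC4/FoxO3a-mediated proliferation of CD38+HLA-DR+CD8+ T cells in the pathogenesis of MAS in adult-onset Still's disease
- Grant number: 82302025
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Mechanism by which monocyte-derived S100A8/A9 amplifies the neutrophil inflammatory response to regulate the onset and progression of adult-onset Still's disease
- Grant number: 82373465
- Approval year: 2023
- Funding amount: CNY 490,000
- Project category: General Program
Role and mechanism of the SERPINF1/SRSF6/B7-H3 signaling pathway in immune escape in adult B-ALL
- Grant number: 82300208
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
MRI combined with multi-omics features to quantify the immune microenvironment of high-grade adult-type diffuse glioma and predict the risk of postoperative recurrence
- Grant number: 82302160
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund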
Similar overseas grants
Integrating Genomic Risk Assessment for Chronic Disease Management in a Diverse Population
- Grant number: 10852376
- Fiscal year: 2023
- Funding amount: $282,900
- Project category:
Computational Imaging of Renal Structures for Diagnosing Diabetic Nephropathy
- Grant number: 10665182
- Fiscal year: 2022
- Funding amount: $282,900
- Project category:
Outcomes for Children with Asthma on Medicaid: Elucidating Key Determinants at the Policy, Plan, Neighborhood, and Person Levels to Address Disparities.
- Grant number: 10609527
- Fiscal year: 2022
- Funding amount: $282,900
- Project category:
Outcomes for Children with Asthma on Medicaid: Elucidating Key Determinants at the Policy, Plan, Neighborhood, and Person Levels to Address Disparities.
- Grant number: 10429507
- Fiscal year: 2022
- Funding amount: $282,900
- Project category:
Genetics and epigenetics of pediatric germ cell tumors
- Grant number: 10364222
- Fiscal year: 2022
- Funding amount: $282,900
- Project category:
$ 28.29万 - 项目类别: