III: Small: Integrating and Interpreting Heterogeneous Genomic Data Through Deep Learning
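Basic Information
- Award number: 1715017
- Principal investigator:
- Award amount: $470,800
- Institution:
- Institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-01 to 2021-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract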
Comprehensive identification of all functional elements encoded in genomes is a fundamental need in both basic and applied biological research. Although the coding regions of genomes are well understood, the noncoding regions, which represent over 98% of mammalian genomes, are far less studied, yet they hold the key to understanding gene regulation, evolution, and the genetic basis of complex phenotypes. The goal of this project is to develop computational methods that infer the function of noncoding sequences by combining the wealth of publicly available genomic data with state-of-the-art machine learning algorithms. These algorithms can greatly expand the utility of existing genomic data, improve the accuracy of annotating the pathogenicity of noncoding variants, and offer a new way of studying the grammar of gene regulation encoded in noncoding sequence. The project will additionally create opportunities for interaction between biologists and computer scientists, and offer interdisciplinary training for both undergraduate and graduate students, especially those from traditionally underrepresented groups.

The goal of this project is to develop a new computational framework based on deep learning to understand noncoding sequences. Over the past few years, researchers have generated thousands of genome-scale datasets on chromatin accessibility, histone modifications, DNA methylation, protein binding, and other properties, spanning a broad range of tissue and cell types. This project will integrate these heterogeneous datasets to derive a comprehensive characterization of noncoding sequence through machine learning algorithms based on convolutional and recurrent neural networks and deep generative models. The PI will develop deep learning algorithms to map the relationship between noncoding sequences and the diverse genomic measurements, learn chromatin states and discover novel functional elements from these measurements, and predict the effects of noncoding genetic variants. Training a flexible and scalable learning model on large amounts of data provides a way of characterizing noncoding sequences in an unbiased and robust fashion, and offers a better chance of extracting the complex regulatory rules encoded within noncoding sequences than conventional methods. This project will provide the genomics community with a versatile, modular, open-source toolbox of software packages, with the goal of greatly improving the accuracy of current genome analyses.
Project Outcomes
Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learning Sample-Specific Policies for Sequential Image Augmentation
- DOI: 10.1145/3474085.3475602
- Publication date: 2021-10
- Journal:
- Impact factor: 0
- Authors: Pu Li; Xiaobai Liu; Xiaohui Xie
- Corresponding authors: Pu Li; Xiaobai Liu; Xiaohui Xie
SynergyNet: A Fusion Framework for Multiple Sclerosis Brain MRI Segmentation with Local Refinement
- DOI: 10.1109/isbi45749.2020.9098610
- Publication date: 2020-04
- Journal:
- Impact factor: 0
- Authors: Y. S. Vang; Yingxin Cao; P. Chang; D. Chow; A. Brandt; F. Paul; M. Scheel; Xiaohui Xie
- Corresponding authors: Y. S. Vang; Yingxin Cao; P. Chang; D. Chow; A. Brandt; F. Paul; M. Scheel; Xiaohui Xie
Undistillable: Making A Nasty Teacher That CANNOT teach students
- DOI:
- Publication date: 2021-05
- Journal:
- Impact factor: 0
- Authors: Haoyu Ma; Tianlong Chen; Ting-Kuei Hu; Chenyu You; Xiaohui Xie; Zhangyang Wang
- Corresponding authors: Haoyu Ma; Tianlong Chen; Ting-Kuei Hu; Chenyu You; Xiaohui Xie; Zhangyang Wang
FactorNet: A deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data
- DOI: 10.1016/j.ymeth.2019.03.020
- Publication date: 2019-08-15
- Journal:
- Impact factor: 4.8
- Authors: Quang, Daniel; Xie, Xiaohui
- Corresponding author: Xie, Xiaohui
SAILER: scalable and accurate invariant representation learning for single-cell ATAC-seq processing and integration.
- DOI: 10.1093/bioinformatics/btab303
- Publication date: 2021-07-12
- Journal:
- Impact factor: 0
- Authors: Cao Y; Fu L; Wu J; Peng Q; Nie Q; Zhang J; Xie X
- Corresponding author: Xie X
Other publications by Xiaohui Xie
Exact solution of the nonlinear dynamics of recurrent neural mechanisms for direction selectivity
- DOI: 10.1016/s0925-2312(02)00394-6
- Publication date: 2002
- Journal:
- Impact factor: 6
- Authors: M. Giese; Xiaohui Xie
- Corresponding author: Xiaohui Xie
Evaluation of the Diagnostic Accuracy of the CareStart™ Glucose-6-Phosphate Dehydrogenase Deficiency Rapid Diagnostic Test among Chinese Newborns.
- DOI: 10.1093/tropej/fmaa003
- Publication date: 2020
- Journal:
- Impact factor: 2
- Authors: Feng; Sufen Zhang; Binhuan Chen; Yu; Chengjie Ma; Siyuan Yang; Yun; D. Huang; Xiaohui Xie; Qi; Linghang Wang
- Corresponding author: Linghang Wang
A pediatric case of generalized lichen aureus
- DOI: 10.1111/dth.13265
- Publication date: 2020
- Journal:
- Impact factor: 3.6
- Authors: Xiaohui Xie; Jin; Yi Xu; M. Qi
- Corresponding author: M. Qi
A novel nonsense PKD1L1 variant cause heterotaxy syndrome with congenital asplenia in a Han Chinese patient
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 3.5
- Authors: H. Gu; Zhuang; Xiaohui Xie; Yi; Z. Tan
- Corresponding author: Z. Tan
Representation Recovering for Self-Supervised Pre-training on Medical Images
- DOI: 10.1109/wacv56688.2023.00271
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Xiangyi Yan; Junayed Naushad; Shanlin Sun; Kun Han; Hao Tang; Deying Kong; Haoyu Ma; Chenyu You; Xiaohui Xie
- Corresponding author: Xiaohui Xie
Other grants awarded to Xiaohui Xie
CAREER: Computational Tools for Interpreting Genomes
- Award number: 0846218
- Fiscal year: 2009
- Award amount: $470,800
- Project type: Standard Grant
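Similar grants from the National Natural Science Foundation of China
Employee algorithm-avoidance behavior: conceptual structure, scale development, and multilevel influence mechanisms, from the perspective of integrating big (small) data research methods
- Award number: 72372021
- Year approved: 2023
- Award amount: CNY 400,000
- Project type: General Program
Mechanism by which the small integral membrane protein SMIM24 regulates glycolysis, invasion, and metastasis in gastric cancer through PON2-mediated plasma membrane translocation of GLUT1
- Award number:
- Year approved: 2022
- Award amount: CNY 510,000
- Project type: General Program
RNA-small molecule modeling that integrates deep learning and molecular docking
- Award number:
- Year approved: 2022
- Award amount: CNY 300,000
- Project type: Young Scientists Fund
Role and mechanism of integrin ITGB3-promoted PD-L1 expression in the treatment of spinal metastases from non-small cell lung cancer
- Award number:
- Year approved: 2021
- Award amount: CNY 550,000
- Project type: General Program
Function and application of PNPT1 and its small-molecule inhibitors in the integrated stress response of non-small cell lung cancer
- Award number:
- Year approved: 2021
- Award amount: CNY 580,000
- Project type: General Program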
Similar overseas grants
III: Small: Integrating and Learning on Spatial Data via Multi-Agent Simulation
- Award number: 2311954
- Fiscal year: 2023
- Award amount: $470,800
- Project type: Standard Grant
Integrating Radiomics into S0819 and Lung-MAP, Biomarker Driven Clinical Trials for Lung Cancer
- Award number: 10177883
- Fiscal year: 2018
- Award amount: $470,800
- Project type:
Integrating Radiomics into S0819 and Lung-MAP, Biomarker Driven Clinical Trials for Lung Cancer
- Award number: 10417115
- Fiscal year: 2018
- Award amount: $470,800
- Project type:
III: Small: Integrating Casual Discovery and Feature Selection with Streaming Features
- Award number: 1613950
- Fiscal year: 2016
- Award amount: $470,800
- Project type: Standard Grant
III: Small: Integrating Casual Discovery and Feature Selection with Streaming Features
- Award number: 1652107
- Fiscal year: 2016
- Award amount: $470,800
- Project type: Standard Grant