Cross-platform structural variant discovery with deep learning
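Basic information
- Grant number: 10453237
- Principal investigator:
- Amount: $593,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-01 to 2027-06-30
- Project status: Ongoing
- Source: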
- Keywords: Algorithms; Alzheimer's Disease; Architecture; Autoimmune Diseases; Benchmarking; Cardiovascular Diseases; Clinical; Communities; Complex; Computer Vision Systems; Computer software; Consensus; Coupled; Data; Data Reporting; Data Set; Detection; Development; Diagnosis; Dimensions; Disease; Engineering; Ensure; Evaluation; Formulation; Generations; Genetic; Genetic Diseases; Genetic Variation; Genome; Genotype; Goals; Hand; Hand functions; Hi-C; Human Genetics; Human Genome; Hybrids; Image; Learning; Link; Machine Learning; Malignant Neoplasms; Manuals; Medicine; Methodology; Methods; Mind; Modeling; Pattern; Performance; Play; Property; Research; Resolution; Role; Sampling; Science; Sequence Alignment; Signal Transduction; Source; Statistical Models; Structural Models; Structure; Techniques; Technology; Training; Variant; Work; autism spectrum disorder; blind; cancer genome; convolutional neural network; deep learning; deep learning model; deep neural network; design; diverse data; engineering design; experimental study; flexibility; genome sequencing; genomic data; heuristics; improved; method development; nervous system disorder; neural network; precision medicine; prototype; sequencing platform; simulation; tumor; variant detection; whole genome
Project abstract
Structural variants (SV) are a major driver of the genetic diversity and disease in the human genome and their
discovery is imperative to advances in precision medicine and our understanding of human genetics. Due to
revolutionary breakthroughs in whole-genome sequencing technologies, we now have access to genomic data at an
unprecedented scale and resolution. However, despite tremendous effort and progress in SV calling methodology,
general SV discovery still remains unsolved. Existing techniques use hand-engineered features and heuristics to
model SV classes, relying heavily on developer expertise, which cannot scale to the vast diversity of SV types and
sequencing platforms nor fully harness all the information available in raw sequencing data. As a result, these
methods are usually tightly coupled to the properties of a particular sequencing technology and operate optimally
only on certain SV types and sizes, rendering us blind to many other classes of SVs and their role in disease. Deep
neural networks have the ability to learn complex abstractions automatically from the data and hence offer a
promising avenue for general SV discovery. Deep learning has recently transformed the field of machine learning
and led to remarkable advances in science and medicine. In this proposal we aim to leverage the potential of deep
learning for the problem of SV detection. We lay out how to efficiently formulate SV detection as a deep learning
task, and propose the development of a comprehensive framework to call and genotype SVs of different size and
type, including complex and subclonal SVs, given data from a range of sequencing platforms. In particular, we
demonstrate that state-of-the-art results can be obtained using our approach for short, linked, and long read
datasets. In order to ensure that our models generalize across different datasets, an important goal of our proposal
is also to assemble diverse and representative training data and perform extensive evaluation using publicly-
available multi-platform datasets to accurately assess model performance. Our software will be built with
extensibility and scalability in mind, and will be released, along with pretrained models and callsets, freely to the
community.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Victoria Popic
Cross-platform structural variant discovery with deep learning
- Grant number: 10686879
- Fiscal year: 2022
- Funding amount: $593,900
- Project category:
Similar overseas grants
Fluency from Flesh to Filament: Collation, Representation, and Analysis of Multi-Scale Neuroimaging data to Characterize and Diagnose Alzheimer's Disease
- Grant number: 10462257
- Fiscal year: 2023
- Funding amount: $593,900
- Project category:
A software tool to facilitate variable-level equivalency and harmonization in research data: Leveraging the NIH Common Data Elements Repository to link concepts and measures in an open format
- Grant number: 10821517
- Fiscal year: 2023
- Funding amount: $593,900
- Project category:
Clinical Decision Support System for Early Detection of Cognitive Decline Using Electronic Health Records and Deep Learning
- Grant number: 10603902
- Fiscal year: 2023
- Funding amount: $593,900
- Project category:
Brain Digital Slide Archive: An Open Source Platform for data sharing and analysis of digital neuropathology
- Grant number: 10735564
- Fiscal year: 2023
- Funding amount: $593,900
- Project category:
SCH: Dementia Early Detection for Under-represented Populations via Fair Multimodal Self-Supervised Learning
- Grant number: 10816864
- Fiscal year: 2023
- Funding amount: $593,900
- Project category: