Integrative genomic and epigenomic analysis of cancer using long read sequencing
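Basic information
- Grant number: 10187808
- Principal investigator:
- Amount: $383,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-05-01 to 2024-04-30
- Project status: Completed
- Source: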
- Keywords: Accounting; Address; Algorithmic Analysis; Algorithms; Alleles; Automobile Driving; Basic Science; Bioinformatics; Biological Sciences; Cancer Biology; Cancerous; Cataloging; Characteristics; Clinical; Communities; Complex; Computing Methodologies; Copy Number Polymorphism; Cytosine; DNA Sequence Rearrangement; DNA Transposable Elements; Data; Data Set; Detection; Development; Diagnostic; Disease; Ensure; Epigenetic Process; Gene Expression Profiling; Generations; Genes; Genetic; Genetic Variation; Genetic study; Genome; Genomics; Genotype; Goals; Graph; Growth; Health; Individual; Joints; Karyotype; Machine Learning; Malignant Neoplasms; Mediating; Methods; Methylation; Minisatellite Repeats; Modeling; Monitor; Mosaicism; Mutation; Nature; Normal tissue morphology; Oncogenes; Outcome; Pathogenicity; Patients; Phase; Population; Prognostic Marker; Protein Isoforms; Recurrence; Repetitive Sequence; Research; Research Personnel; Resolution; Resources; Role; Sample Size; Sampling; Signal Transduction; Somatic Mutation; Statistical Methods; Structure; System; Tandem Repeat Sequences; Technology; Tissues; Tumor Suppressor Proteins; Variant; Work; base; cancer genome; cancer genomics; cancer initiation; cancer risk; cancer therapy; cancer type; cohort; disorder risk; driver mutation; epigenetic profiling; epigenetic variation; epigenome; epigenomics; experience; fusion gene; genetic pedigree; genetic variant; genome analysis; genome sequencing; improved; indexing; insight; instrument; methylome; nanopore; novel; novel strategies; open source; power analysis; premalignant; risk variant; sequencing platform; single molecule; transcriptome; transcriptomics; tumor; tumor heterogeneity; tumor progression
Project abstract
PROJECT SUMMARY
The last twenty years have seen extensive growth in the sequencing of cancer genomes, leading to a
dramatically increased understanding of the role of genetic and epigenetic mutations in cancer. This has largely
been enabled by developments in high-throughput “second-generation” sequencing technology and analysis
that characterize cancer genomes using short reads. Recently, a new generation of high-throughput long-read
sequencing instruments, primarily from Pacific Biosciences and Oxford Nanopore, has become available and
is poised to displace short-read sequencing for many applications. We and others have used these
technologies to discover tens of thousands of variants per cancer genome that are not detectable using
short reads, including structural variants and differentially methylated regions in known oncogenes and cancer
risk genes. These technologies carry the potential to address many open questions in cancer biology; however,
the analysis of long-read sequencing data is computationally demanding and needs specialized algorithms that
are either too inefficient to use at scale or do not yet exist. In this proposal, we will address several gaps in the
application of long-read technology for basic research and clinical use in cancer genomics. First, we will
develop improved methods for finding structural variants and complex repeat expansions from long reads,
both of which are major diagnostic and prognostic indicators of disease, yet are not accurately identified using
existing methods. Leveraging the improved phasing capabilities of long reads, this work will include the
detection of mosaic variants, revealing tumor heterogeneity and variants in precancerous tissues. Next, we will
apply machine learning and systems-level advances to accelerate and improve the comparison of variants
across large patient cohorts. Critically, this will compensate for the error-prone nature of single-molecule
long-read sequencing to make these comparisons more accurate when comparing tumor-normal samples or
pedigrees of related patients so that recurrent driving mutations can be accurately identified. Finally, we will
develop integrative methods for the joint analysis of genome, transcriptome, and epigenetic profiling of cancer
genomes. These advances will improve the identification of fusion genes, and allow for entirely new forms of
epigenetic analysis, such as the allele-specific analysis of methylation across transposable elements and other
repetitive elements. Synthesizing the many thousands of novel variants we will detect using our methods, we
will then develop algorithms that will identify and evaluate recurrent genetic or epigenetic variations as
putative driving mutations. All methods will be released open-source and will empower us, our ITCR
collaborators, and the cancer genomics community at large to study genetic and epigenetic variants with
near-perfect accuracy and thereby unlock many new associations to treatment and disease.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by MICHAEL SCHATZ
EXPANDING THE GENOMIC DATA SCIENCE COMMUNITY NETWORK FOR NHGRI.
- Grant number: 10944109
- Fiscal year: 2023
- Funding amount: $383,500
- Project category:
Optimized workflows for structural variant analysis of the Kids First genomes using short and long reads
- Grant number: 10432507
- Fiscal year: 2022
- Funding amount: $383,500
- Project category:
Optimized workflows for structural variant analysis of the Kids First genomes using short and long reads
- Grant number: 10602532
- Fiscal year: 2022
- Funding amount: $383,500
- Project category:
Integrative genomic and epigenomic analysis of cancer using long read sequencing
- Grant number: 10396074
- Fiscal year: 2021
- Funding amount: $383,500
- Project category:
Integrative genomic and epigenomic analysis of cancer using long read sequencing
- Grant number: 10599150
- Fiscal year: 2021
- Funding amount: $383,500
- Project category:
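Similar grants from the National Natural Science Foundation of China (NSFC)
Research on spatiotemporal-sequence-driven neuromorphic visual object recognition algorithms
- Grant number: 61906126
- Approval year: 2019
- Funding amount: 240,000 yuan
- Project category: Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: 220,000 yuan
- Project category: Young Scientists Fund
Research on address mapping table design optimization and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: 230,000 yuan
- Project category: Young Scientists Fund
Research on memory safety defense techniques targeting the objects of memory attacks
- Grant number: 61802432
- Approval year: 2018
- Funding amount: 250,000 yuan
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: 640,000 yuan
- Project category: General Program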
Similar overseas grants
Integrative genomic and epigenomic analysis of cancer using long read sequencing
- Grant number: 10396074
- Fiscal year: 2021
- Funding amount: $383,500
- Project category:
Machine Learning for Integrative Modeling of the Immune System in Clinical Settings
- Grant number: 10251069
- Fiscal year: 2020
- Funding amount: $383,500
- Project category:
Machine Learning for Integrative Modeling of the Immune System in Clinical Settings
- Grant number: 10028766
- Fiscal year: 2020
- Funding amount: $383,500
- Project category:
Machine Learning for Integrative Modeling of the Immune System in Clinical Settings
- Grant number: 10461194
- Fiscal year: 2020
- Funding amount: $383,500
- Project category:
Machine Learning for Integrative Modeling of the Immune System in Clinical Settings
- Grant number: 10682328
- Fiscal year: 2020
- Funding amount: $383,500
- Project category: