Novel bioinformatics methods for integrative detection of structural variants from long-read sequencing
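Basic information
- Grant number: 10752265
- Principal investigator:
- Amount: $47,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-15 to 2026-09-14
- Project status: Ongoing
- Source: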
- Keywords: Address; Area; Awareness; Base Pairing; Bioinformatics; Biomedical Engineering; Biomedical Research; Collection; Communication; Communities; Complex; Data; Data Science; Detection; Development; Disease; Education; Ethnic Population; Future; Genetic Variation; Genome; Genomic DNA; Genomics; Goals; Graph; Haplotypes; Human; Human Genome; Lead; Length; Maps; Methods; Modeling; Optics; Oral; Performance; Population; Repetitive Sequence; Research Training; Resolution; Resources; Sequence Alignment; Site; Source; Structure; Technology; Time; Variant; Work; Writing; base; candidate identification; career; computerized tools; contig; data integration; disease phenotype; doctoral student; file format; genome sequencing; genome-wide; genomic platform; genomic variation; human pangenome; human reference genome; insertion/deletion mutation; machine learning model; novel; pan-genome; reference genome; restriction enzyme; scaffold; sequencing platform; skills; statistics; technology development; tool; variant detection; whole genome
Project Summary/Abstract
Structural variants (SVs) are the largest source of variation in the human genome and are frequently
associated with disease phenotypes. Thus, the identification and characterization of SVs are essential for
understanding human genome structure and function. The goal of this proposal is to develop a generalized SV
calling pipeline that can leverage information from the latest developments in sequencing technology and
human reference genome representations to discover and resolve SVs at high accuracy. I will first integrate
information across sequencing platforms to increase SV calling accuracy. Multiple sequencing and mapping
platforms are now used to detect SVs from human genome data. My pipeline will increase the accuracy of SV
calling with a data integration model that handles a diverse set of genomic platforms. I will next develop a novel
SV scoring model based on genomic context and coverage. Several factors, such as the generally low
sequence coverage in typical long-read studies, as well as alignment errors due to highly repetitive sequences,
can result in potentially high rates of false-positive SVs when using parameters for high-sensitivity
calling. I will incorporate two sets of important SV features, genomic context and coverage, into a machine-learning
model to compute confidence in SV calls for downstream analysis. Finally, I will add support for graph genome
alignments by implementing support for sequence data aligned to graph genome assemblies in GFA file
format. Unlike single reference genomes, pangenomes are particularly useful for characterizing large-scale
structural differences in genomes between different ethnic groups. Pangenomes would bring us closer to
capturing the full extent of human genomic variation, and thus represent an important resource to leverage for
SV calling. In summary, in this project I will develop a generalized SV calling pipeline capable of integrating
multiple technical platforms for discovering SVs and providing support for future developments in pangenome
graph assemblies. With the research training plan, I will 1) gain expertise in genomics and bioinformatics, 2)
promote diversity in biomedical research through involvement in educational efforts in the community, 3)
develop oral and written communication skills, and 4) prepare for a scientific career focused on the study and
education of human genome variation.
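As an illustration of the GFA support mentioned in the abstract, the following is a minimal sketch and not the project's implementation. It reads segment (`S`) and link (`L`) records from GFA 1 text and reports the simplest insertion-like pattern in a graph: a segment that one path traverses and another path skips. The function names `parse_gfa` and `simple_insertion_bubbles`, the example graph, and the restriction to forward-strand links are all assumptions made for this example; segment sequences are assumed to be present rather than given as `*`.

```python
from collections import defaultdict


def parse_gfa(text):
    """Parse segment (S) and link (L) records from GFA 1 text.

    Returns (segments, links): segments maps name -> sequence, and links maps
    a source segment to a list of (from_orient, to_segment, to_orient).
    Other record types (H, P, W, ...) are ignored in this sketch.
    """
    segments = {}
    links = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if fields[0] == "S":
            # S <name> <sequence> [optional tags]
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links[fields[1]].append((fields[2], fields[3], fields[4]))
    return segments, dict(links)


def simple_insertion_bubbles(segments, links):
    """Find (source, inserted, sink, inserted_length) tuples.

    A tuple is reported when source->sink and source->inserted->sink both
    exist using forward-strand ("+" to "+") links only.
    """
    forward = defaultdict(set)
    for src, edges in links.items():
        for from_orient, dst, to_orient in edges:
            if from_orient == "+" and to_orient == "+":
                forward[src].add(dst)
    found = []
    for src in sorted(forward):
        outs = forward[src]
        for mid in sorted(outs):
            for sink in sorted(forward.get(mid, ())):
                if sink in outs and sink != mid:
                    found.append((src, mid, sink, len(segments[mid])))
    return found


# Hypothetical three-segment graph: s2 is present on one path and skipped on the other.
EXAMPLE_GFA = "\n".join([
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tGG",
    "S\ts3\tTTA",
    "L\ts1\t+\ts2\t+\t0M",
    "L\ts1\t+\ts3\t+\t0M",
    "L\ts2\t+\ts3\t+\t0M",
])

segments, links = parse_gfa(EXAMPLE_GFA)
bubbles = simple_insertion_bubbles(segments, links)
print(bubbles)
```

A real pangenome graph would also need reverse-strand links, nested bubbles, and path or walk records, none of which this sketch handles.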
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Jonathan Perdomo
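Similar NSFC (National Natural Science Foundation of China) grants
Regulatory mechanism of protein farnesylation in the development of boundary regions and axillary meristems in rice
- Grant number: 32300312
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Spatial dependence and breakthrough of regional entrepreneurial activity based on agency-structure configuration effects
- Grant number: 42371173
- Approval year: 2023
- Funding amount: 460,000 CNY
- Project category: General Program
Cross-scale conformational correlation in the phase separation of proteins containing low-sequence-complexity regions
- Grant number: 22303060
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Rapid identification of regions with degraded transport properties in large-area surface concrete based on coordinated in-situ testing and monitoring
- Grant number: 52378218
- Approval year: 2023
- Funding amount: 500,000 CNY
- Project category: General Program
Signal coverage of mobile base stations in complex emergency areas
- Grant number: 72301209
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund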
Similar overseas grants
Implementation of Innovative Treatment for Moral Injury Syndrome: A Hybrid Type 2 Study
- Grant number: 10752930
- Fiscal year: 2024
- Funding amount: $47,700
- Project category:
Implicit racial bias in pediatric emergency medicine: A foundational investigation of physician behaviors
- Grant number: 10722681
- Fiscal year: 2023
- Funding amount: $47,700
- Project category:
Mentored research in the intersection of kidney and cardiovascular disease
- Grant number: 10795588
- Fiscal year: 2023
- Funding amount: $47,700
- Project category:
Developing a Young Adult-Mediated Intervention to Increase Colorectal Cancer Screening among Rural Screening Age-Eligible Adults
- Grant number: 10653464
- Fiscal year: 2023
- Funding amount: $47,700
- Project category: