III: Small: Collaborative Research: A Scalable and Efficient Optical Map Assembler
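Basic Information
- Award number: 1618814
- Principal investigator:
- Amount: $384,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-10-01 to 2021-09-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract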
Optical mapping is a laboratory technique for constructing ordered, high-resolution optical maps from stained molecules of DNA. This type of data has grown in popularity because its commercial production has improved in quality, cost, and throughput. For example, BioNano Genomics released a new generation of optical mapping technology called the Irys System in 2015, which has been used to uncover diploid variation in the human genome. However, the raw optical mapping data, called Rmaps, are not useful on their own and must first be assembled into a genome-wide optical map, a computational process that has very few nonproprietary solutions. The optical map assembly problem, which aims to stitch the Rmap data together into a genome-wide optical map, has some similarities to genome assembly, which aims to build contiguous sequences corresponding to the genome of interest from short sequence reads. Despite this similarity, significant differences have prevented the direct application of genome assemblers to optical map assembly. The main objective of this proposed work is to build a scalable and efficient optical map assembler through the exploration and adaptation of genome assembly algorithms and methods. This research objective will be enhanced by an education plan that aims to broaden participation in computer science by creating research opportunities for female graduate students as well as high school seniors. The plan is to redevelop the existing bioinformatics graduate course so that it (1) can be cross-listed with the Department of Biology and thus increase female enrollment, and (2) will be project-based and thus foster supportive relationships among female students. The team will conduct comprehensive surveys of graduate students in the redeveloped course, as well as in the other graduate courses in computer science, to determine whether the changes are impactful.
We plan to disseminate our findings to other institutions.
Even with very high coverage and various insert sizes, genome assembly and structural variation detection remain fragile computational processes when they rely on short-read data alone, because of repetitive regions in the genome. Optical maps, which are ordered, genome-wide, high-resolution restriction maps that specify the positions at which one or more short nucleotide sequences occur, are one type of data that can supplement short reads in such cases. The raw optical mapping data identified by the image processing is an ordered sequence of fragment lengths. The (unassembled) optical mapping data produced by the system are referred to as Rmaps and are analogous to sequence reads in standard genome sequencing. Optical mapping has gained popularity due to (1) the ability to automate the generation of the data, and (2) its use in several large sequencing projects, including those of the goat, the budgerigar, and Amborella trichopoda. With this popularity comes a growing need for means to analyze optical mapping data. The goal of this proposed work is to build an optical map assembler that is efficient for genomes of various sizes, accepts a set of Rmaps as input, and returns an assembled genome-wide optical map. We have divided this larger goal into the following intermediate research objectives, which we plan to tackle in succession: (1) develop an Rmap error correction method; (2) create a robust alignment algorithm; and (3) create a succinct graph representation of a genome-wide optical map. Our algorithmic contributions are not limited to optical mapping data but can be extended to other types of data, e.g., PacBio and genetic linkage maps. We will actively engage in knowledge transfer through these research objectives and our outreach and education plan.
These efforts will redevelop a bioinformatics course in order to attract a greater number of female students and foster collaborative relationships, and continue an ongoing high school outreach program. In addition to these activities, we will survey the students in the redeveloped course and other graduate classes to see whether the changes were impactful, and mentor a senior high school student researcher during the summers.
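The abstract's description of an Rmap as an ordered sequence of fragment lengths can be illustrated with a minimal Python sketch. It is an illustration only, not the project's method: the function name `in_silico_rmap`, the toy sequence, and the choice to cut at the first base of each motif occurrence are all assumptions made here, and the sketch ignores the sizing errors and missing or spurious cut sites that real Rmaps contain.

```python
from typing import List


def in_silico_rmap(sequence: str, motif: str) -> List[int]:
    """Return the ordered fragment lengths obtained by cutting `sequence`
    at the start of every occurrence of `motif`.

    Hypothetical helper for illustration; cutting at the motif start is a
    simplifying assumption, and no measurement error is modeled.
    """
    if not motif:
        raise ValueError("motif must be non-empty")

    # Collect the start position of every (possibly overlapping) occurrence.
    cut_sites = []
    start = sequence.find(motif)
    while start != -1:
        cut_sites.append(start)
        start = sequence.find(motif, start + 1)

    # Fragment lengths are the gaps between consecutive boundaries;
    # zero-length fragments (a cut at position 0) are dropped.
    boundaries = [0] + cut_sites + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Toy example: a 30-base sequence containing three copies of the motif.
toy_genome = "AAGAATTCCCGGGAATTCTTTTTGAATTCA"
fragments = in_silico_rmap(toy_genome, "GAATTC")
print(fragments)  # [2, 10, 11, 7]
```

The fragment lengths always sum to the length of the input sequence, which is a convenient sanity check when experimenting with other motifs.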
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Christina Boucher
Data Structures for SMEM-Finding in the PBWT
- DOI: 10.1007/978-3-031-43980-3_8
- Year: 2023
- Journal:
- Impact factor: 5.4
- Authors: Paola Bonizzoni; Christina Boucher; D. Cozzi; Travis Gagie; Dominik Köppl; Massimiliano Rossi
- Corresponding author: Massimiliano Rossi
Solving the Minimal Positional Substring Cover Problem in Sublinear Space
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Paola Bonizzoni; Christina Boucher; D. Cozzi; Travis Gagie; Yuri Pirola
- Corresponding author: Yuri Pirola
ONeSAMP 3.0: Effective Population Size via SNP Data for One Population Sample
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Aaron Hong; R. G. Cheek; Kingshuk Mukherjee; Isha Yooseph; Marco Oliva; Mark Heim; W. C. Funk; David Tallmon; Christina Boucher
- Corresponding author: Christina Boucher
Parametric and nonparametric probability distribution estimators of sample maximum
- DOI:
- Year: 2021
- Journal:
- Impact factor: 0
- Authors: Christina Boucher; Travis Gagie; Tomohiro I; Dominik Koeppl; Ben Langmead; Giovanni Manzini; Gonzalo Navarro; Alejandro Pacheco; Massimiliano Rossi; Moriyama Taku
- Corresponding author: Moriyama Taku
Cliffy: robust 16S rRNA classification based on a compressed LCA index
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Omar Ahmed; Christina Boucher; Ben Langmead
- Corresponding author: Ben Langmead
Other grants by Christina Boucher
Collaborative Research: EAGER: Solving the bait learning problem for large-scale DNA enrichment
- Award number: 2118251
- Fiscal year: 2021
- Amount: $384,000
- Project type: Standard Grant
SCH: INT: Enabling real time surveillance of antimicrobial resistance
- Award number: 2013998
- Fiscal year: 2021
- Amount: $384,000
- Project type: Standard Grant
IIBR Informatics: An Efficient Pangenomics Graph Aligner
- Award number: 2029552
- Fiscal year: 2020
- Amount: $384,000
- Project type: Standard Grant
Similar grants from the National Natural Science Foundation of China
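Research and application of key technologies for swarm collaboration of small and micro unmanned systems based on ultra-wideband technology
- Award number:
- Award year: 2020
- Amount: CNY 570,000
- Project type: General Program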
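Research on interference coordination techniques based on cooperative precoding in heterogeneous cloud small-cell networks
- Award number: 61661005
- Award year: 2016
- Amount: CNY 300,000
- Project type: Regional Science Fund Program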
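Research on novel access theory and techniques in dense small-cell base station systems
- Award number: 61301143
- Award year: 2013
- Amount: CNY 240,000
- Project type: Young Scientists Fund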
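Experimental study of ScFVCD3-9R loaded with Bcl-6-targeting small interfering RNA for the treatment of EAMG
- Award number: 81072465
- Award year: 2010
- Amount: CNY 310,000
- Project type: General Program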
Research on sensor networks based on small-world networks
- Award number: 60472059
- Award year: 2004
- Amount: CNY 210,000
- Project type: General Program
Similar overseas grants
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322973
- Fiscal year: 2024
- Amount: $384,000
- Project type: Standard Grant
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322974
- Fiscal year: 2024
- Amount: $384,000
- Project type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336769
- Fiscal year: 2024
- Amount: $384,000
- Project type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336768
- Fiscal year: 2024
- Amount: $384,000
- Project type: Standard Grant
III: Small: Multiple Device Collaborative Learning in Real Heterogeneous and Dynamic Environments
- Award number: 2311990
- Fiscal year: 2023
- Amount: $384,000
- Project type: Standard Grant