CAREER: Algorithmic Aspects of Pan-genomic Data Modeling, Indexing and Querying
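Basic information
- Award number: 2146003
- Principal investigator:
- Amount: $595,300
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-02-01 to 2023-02-28
- Status: Completed
- Source:
- Keywords:
Project abstract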
This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2). This project aims to address the following question: how can the combined information of a pan-genome collection be modeled succinctly, and in a biologically meaningful way, such that genomic analysis on that representation is both easy to compute and accurate? This is currently an active line of bioinformatics research. Pan-genome collections may be represented as high-scoring Multiple Sequence Alignment (MSA) data, indexed text data, or the more popular graph-based representations (pan-genome graphs). One of the fundamental objectives of these models is to support read-mapping queries efficiently. To that end, this research will lead to a new class of string/graph algorithms for the analysis of pan-genomic data. These algorithms are closely tied to critical applications and have the potential to make a lasting impact on the theory and practice of data-driven bioinformatics. The key findings will be disseminated through peer-reviewed conferences, journals, and workshop tutorials. Several Ph.D. and Master's theses will also evolve from this research. The investigator is committed to ensuring the participation of women, students from underrepresented minority groups, and undergraduates in this research. The novel aspects of this project include new graph algorithms parameterized by different graph parameters and techniques for indexing (highly similar, possibly dynamic) texts and graphs. Parameterizing string-to-graph matching and related problems might lead to several efficient, practical solutions for restricted graph classes. Such problems are well motivated by pressing applications, including read mapping and (reference-based) genome assembly. Additionally, some fundamental computational biology problems, such as Multiple Sequence Alignment, will be revisited in the context of pan-genomics for faster approximations and better heuristics.
A deeper investigation of popular heuristics such as co-linear chaining is another direction. The goal here is to provide solid mathematical reasoning for why such methods work well in practice. These research objectives align with the investigator's background in string algorithms and his long-term career goals of transforming foundational research into practice. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Feasibility of flow decomposition with subpath constraints in linear time
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Gibney, Daniel; Thankachan, Sharma V.; Aluru, S.
- Corresponding author: Aluru, S.
Co-linear Chaining with Overlaps and Gap Costs
- DOI: 10.1007/978-3-031-04749-7_15
- Published: 2022-01-01
- Journal:
- Impact factor: 0
- Authors: Jain, Chirag; Gibney, Daniel; Thankachan, Sharma V.
- Corresponding author: Thankachan, Sharma V.
Quantum Time Complexity and Algorithms for Pattern Matching on Labeled Graphs
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Darbari, Parisa; Gibney, Daniel; Thankachan, Sharma V.
- Corresponding author: Thankachan, Sharma V.
The Complexity of Approximate Pattern Matching on de Bruijn Graphs
- DOI: 10.1007/978-3-031-04749-7_16
- Published: 2022-01-01
- Journal:
- Impact factor: 0
- Authors: Gibney, Daniel; Thankachan, Sharma V.; Aluru, Srinivas
- Corresponding author: Aluru, Srinivas
Suffix-Prefix Queries on a Dictionary
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Loukides, Grigorios; Pissis, Solon
- Corresponding author: Pissis, Solon
Other grants by Sharma Thankachan
REU Site: Algorithm Design --- Theory and Engineering
- Award number: 2349179
- Fiscal year: 2024
- Amount: $595,300
- Award type: Standard Grant
AF: Small: Theoretical Aspects of Repetition-Aware Text Compression and Indexing
- Award number: 2315822
- Fiscal year: 2023
- Amount: $595,300
- Award type: Standard Grant
CAREER: Algorithmic Aspects of Pan-genomic Data Modeling, Indexing and Querying
- Award number: 2316691
- Fiscal year: 2023
- Amount: $595,300
- Award type: Continuing Grant
AF: Small: Theoretical Aspects of Repetition-Aware Text Compression and Indexing
- Award number: 2112643
- Fiscal year: 2021
- Amount: $595,300
- Award type: Standard Grant
NSF Student Travel Grant for Workshop on String Algorithms in Bioinformatics (StringBio), 2019
- Award number: 1946289
- Fiscal year: 2019
- Amount: $595,300
- Award type: Standard Grant
NSF Student Travel Grant for 2018 International Workshop on String Algorithms in Bioinformatics (StringBio)
- Award number: 1849136
- Fiscal year: 2018
- Amount: $595,300
- Award type: Standard Grant
AF: Medium: Collaborative Research: Sequential and Parallel Algorithms for Approximate Sequence Matching with Applications to Computational Biology
- Award number: 1703489
- Fiscal year: 2017
- Amount: $595,300
- Award type: Standard Grant
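Similar NSFC (National Natural Science Foundation of China) grants
Research on an integrated multi-component remote sensing retrieval algorithm for shortwave radiation at the surface and the top of the atmosphere
- Award number: 42371342
- Approval year: 2023
- Amount: 520,000 CNY
- Award type: General Program
Integrated optimization model and dual decomposition algorithm for flexible train timetables on high-speed railways
- Award number: 72361020
- Approval year: 2023
- Amount: 270,000 CNY
- Award type: Regional Science Fund Program
Algorithm design and analysis for stochastic density functional theory
- Award number: 12371431
- Approval year: 2023
- Amount: 435,000 CNY
- Award type: General Program
Research on operational risk identification algorithms and proactive intervention methods for large trucks on expressways based on holographic traffic data
- Award number: 52372329
- Approval year: 2023
- Amount: 490,000 CNY
- Award type: General Program
Research on efficient algorithms for solving adversarial team games with imperfect information
- Award number: 62376073
- Approval year: 2023
- Amount: 510,000 CNY
- Award type: General Program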
Similar overseas grants
CAREER: Algorithmic Aspects of Pan-genomic Data Modeling, Indexing and Querying
- Award number: 2316691
- Fiscal year: 2023
- Amount: $595,300
- Award type: Continuing Grant
CAREER: Algorithmic Aspects of Machine Learning
- Award number: 1453261
- Fiscal year: 2015
- Amount: $595,300
- Award type: Continuing Grant
RESEARCH EDUCATION PROGRAM IN ASPECTS OF STATISTICAL GENETICS AND ADDICTION
- Award number: 8723791
- Fiscal year: 2009
- Amount: $595,300
- Award type:
CAREER: Algorithmic Aspects of Ordinal Matching Problems
- Award number: 0845593
- Fiscal year: 2009
- Amount: $595,300
- Award type: Continuing Grant
CAREER: Cryptography on Reconfigurable Hardware: Algorithmic and System Aspects
- Award number: 9733246
- Fiscal year: 1998
- Amount: $595,300
- Award type: Standard Grant