Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics

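Basic Information

  • Grant number:
    RGPIN-2019-06640
  • Principal investigator:
  • Amount:
    $20,400
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Period:
    2022-01-01 to 2023-12-31
  • Status:
    Completed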

Project Abstract

Recent advances in sequencing technologies, especially those from Illumina, 10X Genomics, Pacific Biosciences, and Oxford Nanopore Technologies, are opening up new possibilities and new fields of research. These instruments demonstrate a sustained trend of expanding sequencing throughput, growing read lengths, and improving data quality. In parallel, the cost of using these platforms has reached an inflection point, making them increasingly viable for widespread applications across the life sciences. However, this translation requires enabling bioinformatics approaches. We propose a bioinformatics project to develop novel data structures specialized for large sequencing datasets, and an innovative RNA-seq assembly tool that leverages the properties of the latest sequencing platforms. Accordingly, we have developed a research plan with two aims.

Aim 1. Advanced Data Structures

The value of innovative data types in bioinformatics applications has been demonstrated several times. The most prominent example is the use of FM-indexing for rapid read alignment. Here, we will build on our extensive expertise with Bloom filters and spaced seeds to address memory and run time bottlenecks in bioinformatics applications. In particular, we will develop error-tolerant methods for the sequence classification problem, in which a set of high-throughput sequencing reads is assigned to a set of reference genomes and/or genomic loci. Results of this aim will also support the research activities in the second aim of our proposal.

Aim 2. RNA-seq Assembly

RNA-seq experiments, often in combination with genome sequencing, have proven useful in studying the biology of model and non-model species. Transcriptome analysis based on de novo assembly has demonstrated utility for discovery in many projects, but its routine application in translational studies may be computationally costly and hence requires a rethinking of the problem. Using the advanced data types we will develop in Aim 1, we will leverage the new information modalities in recent sequencing technologies, such as single cell RNA sequencing (scRNA-seq).

Our lab has an established track record of developing, disseminating, and maintaining popular bioinformatics tools built on advanced computational methods. We will implement and release our tools and algorithms through our lab's software portal at https://github.com/bcgsc, providing the research community broad and timely access to these enabling technologies, and offering active support. We will also continue to collaborate widely across life sciences domains to apply our analytical methods, and support basic and applied research projects.

The aims of this research plan respond to the identified needs of our collaborators and end users. Last but not least, we expect this project to serve as a platform to train a number of graduate students and interns/co-op students.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Birol, Inanc

Assembly and annotation of the black spruce genome provide insights on spruce phylogeny and evolution of stress response.
  • DOI:
    10.1093/g3journal/jkad247
  • Published:
    2023-12-29
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    Lo, Theodora;Coombe, Lauren;Gagalova, Kristina K.;Marr, Alex;Warren, Rene L.;Kirk, Heather;Pandoh, Pawan;Zhao, Yongjun;Moore, Richard A.;Mungall, Andrew J.;Ritland, Carol;Pavy, Nathalie;Jones, Steven J. M.;Bohlmann, Joerg;Bousquet, Jean;Birol, Inanc;Thomson, Ashley
  • Corresponding author:
    Thomson, Ashley
Linear time complexity de novo long read genome assembly with GoldRush.
  • DOI:
    10.1038/s41467-023-38716-x
  • Published:
    2023-05-22
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Wong, Johnathan;Coombe, Lauren;Nikolic, Vladimir;Zhang, Emily;Nip, Ka Ming;Sidhu, Puneet;Warren, Rene L.;Birol, Inanc
  • Corresponding author:
    Birol, Inanc
Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma.
  • DOI:
    10.1038/nature10351
  • Published:
    2011-07-27
  • Journal:
  • Impact factor:
    64.8
  • Authors:
    Morin, Ryan D.;Mendez-Lago, Maria;Mungall, Andrew J.;Goya, Rodrigo;Mungall, Karen L.;Corbett, Richard D.;Johnson, Nathalie A.;Severson, Tesa M.;Chiu, Readman;Field, Matthew;Jackman, Shaun;Krzywinski, Martin;Scott, David W.;Trinh, Diane L.;Tamura-Wells, Jessica;Li, Sa;Firme, Marlo R.;Rogic, Sanja;Griffith, Malachi;Chan, Susanna;Yakovenko, Oleksandr;Meyer, Irmtraud M.;Zhao, Eric Y.;Smailus, Duane;Moksa, Michelle;Chittaranjan, Suganthi;Rimsza, Lisa;Brooks-Wilson, Angela;Spinelli, John J.;Ben-Neriah, Susana;Meissner, Barbara;Woolcock, Bruce;Boyle, Merrill;McDonald, Helen;Tam, Angela;Zhao, Yongjun;Delaney, Allen;Zeng, Thomas;Tse, Kane;Butterfield, Yaron;Birol, Inanc;Holt, Rob;Schein, Jacqueline;Horsman, Douglas E.;Moore, Richard;Jones, Steven J. M.;Connors, Joseph M.;Hirst, Martin;Gascoyne, Randy D.;Marra, Marco A.
  • Corresponding author:
    Marra, Marco A.
Antimicrobial peptides from Rana [Lithobates] catesbeiana: Gene structure and bioinformatic identification of novel forms from tadpoles
  • DOI:
    10.1038/s41598-018-38442-1
  • Published:
    2019-02-06
  • Journal:
  • Impact factor:
    4.6
  • Authors:
    Helbing, Caren C.;Hammond, S. Austin;Birol, Inanc
  • Corresponding author:
    Birol, Inanc
Theoretical Analysis of the Minimum Sum of Squared Similarities Sampling for Nystrom-Based Spectral Clustering

Other grants held by Birol, Inanc

Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2021
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2020
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2019
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2018
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2017
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2016
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2015
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2014
  • Amount:
    $20,400
  • Program:
    Discovery Grants Program - Individual

Similar National Natural Science Foundation of China (NSFC) grants

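Effects of nonsynonymous mutations in plastid genomes on protein structure, studied using cross-omics data
  • Grant number:
    32300539
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Heterogeneity and regulatory networks of genomic structural variation in malignant glioma, based on multidimensional nanopore data
  • Grant number:
    32300522
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
High-precision, fast joint inversion of gravity and magnetic data with dual structural constraints on unstructured meshes
  • Grant number:
    42304150
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Hybrid knowledge- and data-driven uncertainty analysis and optimization methods for lattice structures with defects
  • Grant number:
    12302149
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Program:
    Young Scientists Fund
Performance assessment and service-life prediction methods for FRP-reinforced concrete structures, driven by long-term monitoring data
  • Grant number:
    52378123
  • Year approved:
    2023
  • Amount:
    CNY 520,000
  • Program:
    General Program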

Similar overseas grants

Microscopy and Image Analysis Core
  • Grant number:
    10557025
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program:
Next-Generation Algorithms in Statistical Genetics Based on Modern Machine Learning
  • Grant number:
    10714930
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program:
Technologies for High-Throughput Mapping of Antigen Specificity to B-Cell-Receptor Sequence
  • Grant number:
    10734412
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program:
Continuous development of nTracer2 and its deployment at NIH image repositories
  • Grant number:
    10726178
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program:
Development of multi-color 3D super-localization LiveFISH and LiveFISH PAINT to investigate the chromatin dynamics at any genomic scale
  • Grant number:
    10725002
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program: