Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics

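Basic Information

  • Grant number:
    RGPIN-2019-06640
  • Principal investigator:
  • Amount:
    $20,400
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Start and end dates:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed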

Project Abstract

The latest advances in sequencing technologies, especially those from Illumina, 10X Genomics, Pacific Biosciences, and Oxford Nanopore Technologies, are opening up new possibilities and new fields of research. These instruments demonstrate a sustained trend of expanding sequencing throughput, growing read lengths, and improving data quality. In parallel, the cost of using these platforms has reached an inflection point, making them increasingly viable for widespread applications across the life sciences. However, this translation requires enabling bioinformatics approaches. We propose a bioinformatics project to develop novel data structures specialized for large sequencing datasets, and an innovative RNA-seq assembly tool that leverages the properties of the latest sequencing platforms. Accordingly, we have developed a research plan with two aims.

Aim 1. Advanced Data Structures
The value of innovative data types in bioinformatics applications has been demonstrated several times. The most prominent example is the use of FM-indexing for rapid read alignment. Here, we will build on our extensive expertise with Bloom filters and spaced seeds to address memory and run time bottlenecks in bioinformatics applications. In particular, we will develop error-tolerant methods for the sequence classification problem, in which a set of high-throughput sequencing reads is assigned to a set of reference genomes and/or genomic loci. Results of this aim will also support the research activities in the second aim of our proposal.

Aim 2. RNA-seq Assembly
RNA-seq experiments, often in combination with genome sequencing, have proven useful in studying the biology of model and non-model species. Transcriptome analysis based on de novo assembly has demonstrated utility for discovery in many projects, but its routine application in translational studies may be computationally costly, and hence requires a rethinking of the problem. Using the advanced data types we will develop in Aim 1, we will leverage the new information modalities in recent sequencing technologies, such as single-cell RNA sequencing (scRNA-seq).

Our lab has an established track record of developing, disseminating, and maintaining popular bioinformatics tools built on advanced computational methods. We will implement and release our tools and algorithms through our lab's software portal at https://github.com/bcgsc, providing the research community with broad and timely access to these enabling technologies, and offering active support. We will also continue to collaborate widely across life sciences domains to apply our analytical methods, and support basic and applied research projects.

The aims of this research plan respond to the identified needs of our collaborators and end users. Last but not least, we expect this project to serve as a platform to train a number of graduate students and interns/co-op students.
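
As a rough illustration of how the Bloom filters and spaced seeds named in Aim 1 can combine for error-tolerant read classification, the following Python sketch may help. It is not the project's implementation: the seed pattern, filter size, hash construction, toy sequences, and all class and function names are hypothetical choices made for this example only.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not optimized)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def spaced_windows(seq, seed):
    """Yield each window of seq, keeping only positions where seed has '1'."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(seq) - len(seed) + 1):
        yield "".join(seq[start + i] for i in care)


def build_filter(reference, seed, num_bits=1 << 16, num_hashes=3):
    """Index every spaced-seed window of a reference sequence."""
    bf = BloomFilter(num_bits, num_hashes)
    for window in spaced_windows(reference, seed):
        bf.add(window)
    return bf


def classify(read, filters, seed):
    """Score a read against each reference filter; return (best_name, scores)."""
    scores = {
        name: sum(window in bf for window in spaced_windows(read, seed))
        for name, bf in filters.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    seed = "1101011"  # hypothetical pattern: '1' = must match, '0' = don't care
    refs = {"A": "ACGTACGGTCAATGCCTAGA", "B": "TTGACCGATAGGCTTACGAT"}  # toy data
    filters = {name: build_filter(seq, seed) for name, seq in refs.items()}

    exact = refs["A"][3:18]
    # Introduce a single substitution in the middle of the read.
    flipped = "A" if exact[7] != "A" else "C"
    mutated = exact[:7] + flipped + exact[8:]

    print(classify(exact, filters, seed))
    print(classify(mutated, filters, seed))
```

The error tolerance comes from the seed's don't-care positions: a window still matches the reference when a substitution falls on a '0' position, whereas a contiguous k-mer of the same span would miss in every window that overlaps the error. The Bloom filter trades a small false-positive rate for a compact, fixed-size index.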

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Birol, Inanc

Assembly and annotation of the black spruce genome provide insights on spruce phylogeny and evolution of stress response.
  • DOI:
    10.1093/g3journal/jkad247
  • Published:
    2023-12-29
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    Lo, Theodora;Coombe, Lauren;Gagalova, Kristina K.;Marr, Alex;Warren, Rene L.;Kirk, Heather;Pandoh, Pawan;Zhao, Yongjun;Moore, Richard A.;Mungall, Andrew J.;Ritland, Carol;Pavy, Nathalie;Jones, Steven J. M.;Bohlmann, Joerg;Bousquet, Jean;Birol, Inanc;Thomson, Ashley
  • Corresponding author:
    Thomson, Ashley
Linear time complexity de novo long read genome assembly with GoldRush.
  • DOI:
    10.1038/s41467-023-38716-x
  • Published:
    2023-05-22
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Wong, Johnathan;Coombe, Lauren;Nikolic, Vladimir;Zhang, Emily;Nip, Ka Ming;Sidhu, Puneet;Warren, Rene L.;Birol, Inanc
  • Corresponding author:
    Birol, Inanc
Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma.
  • DOI:
    10.1038/nature10351
  • Published:
    2011-07-27
  • Journal:
  • Impact factor:
    64.8
  • Authors:
    Morin, Ryan D.;Mendez-Lago, Maria;Mungall, Andrew J.;Goya, Rodrigo;Mungall, Karen L.;Corbett, Richard D.;Johnson, Nathalie A.;Severson, Tesa M.;Chiu, Readman;Field, Matthew;Jackman, Shaun;Krzywinski, Martin;Scott, David W.;Trinh, Diane L.;Tamura-Wells, Jessica;Li, Sa;Firme, Marlo R.;Rogic, Sanja;Griffith, Malachi;Chan, Susanna;Yakovenko, Oleksandr;Meyer, Irmtraud M.;Zhao, Eric Y.;Smailus, Duane;Moksa, Michelle;Chittaranjan, Suganthi;Rimsza, Lisa;Brooks-Wilson, Angela;Spinelli, John J.;Ben-Neriah, Susana;Meissner, Barbara;Woolcock, Bruce;Boyle, Merrill;McDonald, Helen;Tam, Angela;Zhao, Yongjun;Delaney, Allen;Zeng, Thomas;Tse, Kane;Butterfield, Yaron;Birol, Inanc;Holt, Rob;Schein, Jacqueline;Horsman, Douglas E.;Moore, Richard;Jones, Steven J. M.;Connors, Joseph M.;Hirst, Martin;Gascoyne, Randy D.;Marra, Marco A.
  • Corresponding author:
    Marra, Marco A.
Antimicrobial peptides from Rana [Lithobates] catesbeiana: Gene structure and bioinformatic identification of novel forms from tadpoles
  • DOI:
    10.1038/s41598-018-38442-1
  • Published:
    2019-02-06
  • Journal:
  • Impact factor:
    4.6
  • Authors:
    Helbing, Caren C.;Hammond, S. Austin;Birol, Inanc
  • Corresponding author:
    Birol, Inanc
Theoretical Analysis of the Minimum Sum of Squared Similarities Sampling for Nystrom-Based Spectral Clustering

Other grants by Birol, Inanc

Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2021
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2020
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Novel Data Structures And Scalable Algorithms For High Throughput Bioinformatics
  • Grant number:
    RGPIN-2019-06640
  • Fiscal year:
    2019
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2018
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2017
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2016
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2015
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual
Read-to-contig alignments for de novo genome assembly and annotation
  • Grant number:
    RGPIN-2014-05112
  • Fiscal year:
    2014
  • Amount:
    $20,400
  • Program type:
    Discovery Grants Program - Individual

Similar NSFC (National Natural Science Foundation of China) grants

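Theory and applications of spatial autoregressive models for data with complex network structure
  • Grant number:
    72371241
  • Approval year:
    2023
  • Amount:
    CNY 430,000
  • Program type:
    General Program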
Design and data analysis of structured experiments
  • Grant number:
    12371259
  • Approval year:
    2023
  • Amount:
    CNY 435,000
  • Program type:
    General Program
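Hybrid knowledge- and data-driven uncertainty analysis and optimization methods for lattice structures with defects
  • Grant number:
    12302149
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Program type:
    Young Scientists Fund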
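Digital technology innovation network structure and firm productivity growth: theory and evidence based on patent citation data
  • Grant number:
    72303018
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Program type:
    Young Scientists Fund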
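Retrieval of the three-dimensional temperature and salinity structure of mesoscale eddies by coupling multi-source remote sensing data with physical knowledge
  • Grant number:
    42306194
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Program type:
    Young Scientists Fund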

Similar overseas grants

Microscopy and Image Analysis Core
  • Grant number:
    10557025
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program type:
Application of New Tools for Probing the Roles of Sphingolipids and Cholesterol in Influenza Virus Infection
  • Grant number:
    10678459
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program type:
Next-Generation Algorithms in Statistical Genetics Based on Modern Machine Learning
  • Grant number:
    10714930
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program type:
Technologies for High-Throughput Mapping of Antigen Specificity to B-Cell-Receptor Sequence
  • Grant number:
    10734412
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program type:
Multi-modal Tracking of In Vivo Skeletal Structures and Implants
  • Grant number:
    10839518
  • Fiscal year:
    2023
  • Amount:
    $20,400
  • Program type: