Computational Methods for Microbial and Microbiome Sequence Analysis

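Basic Information

  • Grant number:
    10550160
  • Principal investigator:
  • Amount:
    $403,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Start and end dates:
    2019-02-01 to 2025-01-31
  • Project status:
    Ongoing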

Project Summary

This project will support our work on computational methods for microbial sequence analysis, including gene finding, whole-genome alignment, genome assembly, and metagenomic sequence analysis. Over the years we have developed multiple systems to solve problems in these areas, some of which are very widely used. These tools need continued updates and improvements to keep pace with changes in sequencing technology, changes in experimental design, and the ever-growing number of sequenced genomes.

One of these systems is Glimmer, a computational method for finding genes in bacteria, viruses, archaea, and simple eukaryotes. Glimmer is highly accurate, finding over 99% of the genes in most prokaryotic genomes. It has been used by thousands of scientists around the world and in the majority of published bacterial genome sequencing projects over the past decade. Collectively, the three main publications describing Glimmer have been cited over 4,700 times, including >700 citations in 2016-17 alone. Usage of Glimmer has increased in recent years due to the explosion in next-generation sequencing projects, which are particularly cost-effective for bacterial genomes.

A second system, MUMmer, is an efficient whole-genome aligner that is used to compare genomes to one another and to compare genome assemblies to detect changes, both large and small. MUMmer and its components, especially Nucmer, have been widely used and incorporated in other systems, including multi-genome aligners and several genome assembly packages. The three main publications describing MUMmer have been cited over 3,600 times, including >750 citations in 2016-17.

In recent years we have focused our efforts on developing methods for the analysis of metagenomics data, producing several newer tools, including Kraken and Centrifuge. Both of these systems attempt to assign a species identifier to every read in a metagenomics data set. Because the Kraken algorithm is not only accurate but far faster than earlier methods, it was rapidly adopted by many labs soon after its release, and its usage continues to grow. The even newer and more space-efficient Centrifuge system has also been highly successful and was recently incorporated into the analysis package of one of the new third-generation sequencing companies. We continue to work on improving the performance of both algorithms, and this project will allow us to extend them to handle the newest long-read data that is increasingly being used for metagenomics experiments.

Finally, a new direction of the lab is the use of metagenomic shotgun sequencing to diagnose infections, for which we are not only modifying our algorithms, but also building customized genome databases where we rigorously screen the genomes to identify and remove contaminants and low-complexity sequences that create false positives.

As we have done for many years, we will release all of the software and data generated by this project for free under an open source license, allowing other scientists to use, modify, and redistribute them without restrictions of any kind.

Project Outcomes

Journal articles (47)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Rapid detection of inter-clade recombination in SARS-CoV-2 with Bolotie.
  • DOI:
    10.1101/2020.09.21.300913
  • Publication date:
    2020-09-21
  • Journal:
  • Impact factor:
    0
  • Authors:
    Varabyou, Ales; Pockrandt, Christopher; Pertea, Mihaela
  • Corresponding author:
    Pertea, Mihaela
3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources.
  • DOI:
    10.1093/gigascience/giac118
  • Publication date:
    2022-11-30
  • Journal:
  • Impact factor:
    9.2
  • Authors:
  • Corresponding author:
CHESS 3: an improved, comprehensive catalog of human genes and transcripts based on large-scale expression data, phylogenetic analysis, and protein structure.
  • DOI:
    10.1186/s13059-023-03088-4
  • Publication date:
    2023-10-30
  • Journal:
  • Impact factor:
    12.3
  • Authors:
    Varabyou, Ales; Sommer, Markus J; Erdogdu, Beril; Shinder, Ida; Minkin, Ilia; Chao, Kuan-Hao; Park, Sukhwan; Heinz, Jakob; Pockrandt, Christopher; Shumate, Alaina; Rincon, Natalia; Puiu, Daniela; Steinegger, Martin; Salzberg, Steven L; Pertea, Mihaela
  • Corresponding author:
    Pertea, Mihaela
JASPER: A fast genome polishing tool that improves accuracy of genome assemblies.
  • DOI:
    10.1371/journal.pcbi.1011032
  • Publication date:
    2023-03
  • Journal:
  • Impact factor:
    4.3
  • Authors:
  • Corresponding author:
Rapidly fatal infection with Bacillus cereus/thuringiensis: genome assembly of the responsible pathogen and consideration of possibly contributing toxins.
  • DOI:
    10.1016/j.diagmicrobio.2021.115534
  • Publication date:
    2021-12
  • Journal:
  • Impact factor:
    2.9
  • Authors:
    Butcher, Monica; Puiu, Daniela; Romagnoli, Mark; Carroll, Karen C.; Salzberg, Steven L.; Nauen, David W.
  • Corresponding author:
    Nauen, David W.

Other publications by Steven L. Salzberg

Quality assessment of splice site annotation based on conservation across multiple species
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ilia Minkin; Steven L. Salzberg
  • Corresponding author:
    Steven L. Salzberg

Other grants held by Steven L. Salzberg

Comprehensive Human Expressed Sequences in Brain (CHESS-BRAIN) and their roles in neuropsychiatric illness
  • Grant number:
    10541887
  • Fiscal year:
    2021
  • Funding amount:
    $403,400
  • Project type:
Comprehensive Human Expressed Sequences in Brain (CHESS-BRAIN) and their roles in neuropsychiatric illness
  • Grant number:
    10362615
  • Fiscal year:
    2021
  • Funding amount:
    $403,400
  • Project type:
Comprehensive Human Expressed Sequences in Brain (CHESS-BRAIN) and their roles in neuropsychiatric illness
  • Grant number:
    10205617
  • Fiscal year:
    2021
  • Funding amount:
    $403,400
  • Project type:
Computational Methods for Microbial and Microbiome Sequence Analysis
  • Grant number:
    10331733
  • Fiscal year:
    2019
  • Funding amount:
    $403,400
  • Project type:
Computational Methods for Microbial and Microbiome Sequence Analysis
  • Grant number:
    10083744
  • Fiscal year:
    2019
  • Funding amount:
    $403,400
  • Project type:
The Terabase Search Engine
  • Grant number:
    8882493
  • Fiscal year:
    2014
  • Funding amount:
    $403,400
  • Project type:
The Terabase Search Engine
  • Grant number:
    8688406
  • Fiscal year:
    2014
  • Funding amount:
    $403,400
  • Project type:
Computational Gene Modeling and Genome Sequence Assembly
  • Grant number:
    8329127
  • Fiscal year:
    2011
  • Funding amount:
    $403,400
  • Project type:
Alignment Software for Second-Generation Sequencing
  • Grant number:
    8068060
  • Fiscal year:
    2011
  • Funding amount:
    $403,400
  • Project type:
Alignment Software for Second-Generation Sequencing
  • Grant number:
    8464182
  • Fiscal year:
    2011
  • Funding amount:
    $403,400
  • Project type:

Similar grants from the National Natural Science Foundation of China

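Blast-resistance mechanisms of gravity dams with composite protective materials under underwater multi-medium coupling
  • Grant number:
    51779168
  • Approval year:
    2017
  • Funding amount:
    CNY 590,000
  • Project type:
    General Program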
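Finding all integer solutions of a class of semi-algebraic systems by numerical computation
  • Grant number:
    11671377
  • Approval year:
    2016
  • Funding amount:
    CNY 480,000
  • Project type:
    General Program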
Research on MEE algorithms using the pinball loss
  • Grant number:
    11401247
  • Approval year:
    2014
  • Funding amount:
    CNY 230,000
  • Project type:
    Young Scientists Fund
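Near-real-time simulation of urban waterlogging using path algorithms and pipe-network simplification
  • Grant number:
    41301419
  • Approval year:
    2013
  • Funding amount:
    CNY 250,000
  • Project type:
    Young Scientists Fund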
Blind channel equalization using an ε-approximation algorithm
  • Grant number:
    60172058
  • Approval year:
    2001
  • Funding amount:
    CNY 160,000
  • Project type:
    General Program

Similar overseas grants

Applying Computational Phenotypes To Assess Mental Health Disorders Among Transgender Patients in the United States
  • Grant number:
    10604723
  • Fiscal year:
    2023
  • Funding amount:
    $403,400
  • Project type:
Brain Digital Slide Archive: An Open Source Platform for data sharing and analysis of digital neuropathology
  • Grant number:
    10735564
  • Fiscal year:
    2023
  • Funding amount:
    $403,400
  • Project type:
Toward Accurate Cardiovascular Disease Prediction in Hispanics/Latinos: Modeling Risk and Resilience Factors
  • Grant number:
    10852318
  • Fiscal year:
    2023
  • Funding amount:
    $403,400
  • Project type:
Unified, Scalable, and Reproducible Neurostatistical Software
  • Grant number:
    10725500
  • Fiscal year:
    2023
  • Funding amount:
    $403,400
  • Project type:
Single viewpoint panoramic imaging technology for colonoscopy
  • Grant number:
    10580165
  • Fiscal year:
    2023
  • Funding amount:
    $403,400
  • Project type: