
Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms using long-read sequencing.

Basic information

DOI:
10.1093/nar/gkac1112
Publication date:
2023-01-25
Impact factor:
14.9
Corresponding author:
Hach, Faraz
CAS journal ranking:
Biology, Tier 2
Document type:
Journal Article
Authors: Orabi, Baraa; Xie, Ning; McConeghy, Brian; Dong, Xuesen; Chauve, Cedric; Hach, Faraz
Research area: Biochemistry & Molecular Biology
MeSH terms: --
Keywords: --

Abstract

Alternative splicing (AS) is an important mechanism in the development of many cancers, as novel or aberrant AS patterns play an important role as an independent onco-driver. In addition, cancer-specific AS is potentially an effective target of personalized cancer therapeutics. However, detecting AS events remains a challenging task, especially if these AS events are novel. This is exacerbated by the fact that existing transcriptome annotation databases are far from being comprehensive, especially with regard to cancer-specific AS. Additionally, traditional sequencing technologies are severely limited by the short length of the generated reads, which rarely spans more than a single splice junction site. Given these challenges, transcriptomic long-read (LR) sequencing presents a promising potential for the detection and discovery of AS. We present Freddie, a computational annotation-independent isoform discovery and detection tool. Freddie takes as input transcriptomic LR sequencing of a sample alongside its genomic split alignment and computes a set of isoforms for the given sample. It then partitions the input reads into sets that can be processed independently and in parallel. For each partition, Freddie segments the genomic alignment of the reads into canonical exon segments. The goal of this segmentation is to be able to represent any potential isoform as a subset of these canonical exons. This segmentation is formulated as an optimization problem and is solved with a dynamic programming algorithm. Then, Freddie reconstructs the isoforms by jointly clustering and error-correcting the reads using the canonical segmentation as a succinct representation. The clustering and error-correcting step is formulated as an optimization problem—the Minimum Error Clustering into Isoforms (MErCi) problem—and is solved using integer linear programming (ILP). 
We compare the performance of Freddie on simulated datasets with other isoform detection tools with varying dependence on annotation databases. We show that Freddie outperforms the other tools in its accuracy, including those given the complete ground truth annotation. We also run Freddie on a transcriptomic LR dataset generated in-house from a prostate cancer cell line with a matched short-read RNA-seq dataset. Freddie results in isoforms with a higher short-read cross-validation rate than the other tested tools. Freddie is open source and available at https://github.com/vpc-ccg/freddie/.
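The abstract describes representing each read as a subset of canonical exon segments before clustering. The sketch below illustrates only that representation, under simplifying assumptions: segment boundaries are placed at every exon boundary observed in any read (not the dynamic programming segmentation from the paper), and reads are grouped by exact match of their segment vectors with no error correction (not the MErCi ILP). All function names and the toy coordinates are hypothetical and are not taken from the Freddie code base.

```python
from collections import defaultdict

def canonical_segments(reads):
    """Cut the genome at every aligned-interval boundary seen in any read.

    Assumption: a naive stand-in for the paper's DP-based segmentation.
    """
    points = sorted({p for exons in reads for (s, e) in exons for p in (s, e)})
    return list(zip(points, points[1:]))

def encode(exons, segments):
    """Binary vector: 1 if a segment lies fully inside one of the read's intervals."""
    return tuple(
        int(any(s <= a and b <= e for (s, e) in exons))
        for (a, b) in segments
    )

def group_reads(reads):
    """Group read indices by identical segment vectors (no error correction)."""
    segments = canonical_segments(reads)
    groups = defaultdict(list)
    for i, exons in enumerate(reads):
        groups[encode(exons, segments)].append(i)
    return segments, dict(groups)

# Toy split alignments: each read is a list of (start, end) aligned intervals.
reads = [
    [(100, 200), (300, 400), (500, 600)],  # includes the middle exon
    [(100, 200), (500, 600)],              # skips the middle exon
    [(100, 200), (300, 400), (500, 600)],
]
segments, groups = group_reads(reads)
for vector, members in groups.items():
    print(vector, members)
```

In this toy input, reads 0 and 2 share one segment vector and read 1 gets its own, so an exon-skipping event shows up as two candidate isoforms. Real long reads have noisy boundaries, which is why the paper formulates segmentation and clustering as optimization problems rather than exact matching.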


Hach, Faraz
Mailing address:
Simon Fraser Univ, Dept Math, Burnaby, BC, Canada
Affiliations:
Simon Fraser Univ
Simon Fraser University
Simon Fraser University Faculty of Science
Simon Fraser University Department of Mathematics
Email address:
--
Address history:
Univ British Columbia, Dept Comp Sci, Vancouver, BC, Canada
Affiliations:
Univ British Columbia
University of British Columbia
The University of British Columbia Faculty of Science
The University of British Columbia Department of Computer Science
Vancouver Prostate Ctr, Dept Math, Vancouver, BC, Canada
Affiliations:
Vancouver Prostate Ctr
Univ British Columbia, Dept Urol Sci, Vancouver, BC, Canada
Affiliations:
Univ British Columbia
University of British Columbia
The University of British Columbia Faculty of Medicine
The University of British Columbia Department of Urologic Sciences