
Recurrent Neural Network for Predicting Transcription Factor Binding Sites.


Basic Information

DOI:
10.1038/s41598-018-33321-1
Publication date:
2018-10-15
Impact factor:
4.6
Corresponding author:
Huang DS
CAS journal division:
Multidisciplinary journals, Division 3
Document type:
Journal Article
Authors: Shen Z;Bao W;Huang DS
Research area: --
MeSH terms: --
Keywords: --
Source link: PubMed record page

Abstract

It is well known that DNA sequences contain a certain number of transcription factor (TF) binding sites, and only some of them have been identified through biological experiments. However, these experiments are expensive and time-consuming. To overcome these problems, computational methods based on k-mer features or convolutional neural networks have been proposed to identify TF binding sites from DNA sequences. Although these methods perform well, the context information related to TF binding sites is still lacking. Research indicates that standard recurrent neural networks (RNN) and their variants perform better on time-series data than other models. In this study, we propose a model, named KEGRU, to identify TF binding sites by combining a Bidirectional Gated Recurrent Unit (GRU) network with k-mer embedding. Firstly, DNA sequences are divided into k-mer sequences with a specified length and stride window. Then, we treat each k-mer as a word and pre-train a word representation model through the word2vec algorithm. Thirdly, we construct a deep bidirectional GRU model for feature learning and classification. Experimental results have shown that our method performs better than some state-of-the-art methods. Additional experiments on the embedding strategy show that k-mer embedding helps to enhance model performance. The robustness of KEGRU is demonstrated by experiments with different k-mer lengths, stride windows and embedding vector dimensions.
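The pipeline described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical reconstruction based only on the abstract, not the authors' released code: the k-mer length (5), stride (2), embedding dimension (50), hidden size (50), skip-gram training and the toy sequences are assumed values, and gensim's Word2Vec together with a single-layer PyTorch bidirectional GRU stand in for the paper's implementation.

    from gensim.models import Word2Vec
    import torch
    import torch.nn as nn

    def to_kmers(seq, k=5, stride=2):
        """Step 1: tokenize a DNA string into overlapping k-mer 'words' (k and stride are assumed)."""
        return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

    class KmerGRUClassifier(nn.Module):
        """Bidirectional GRU over pre-trained k-mer embeddings -> binding-site probability."""
        def __init__(self, embedding_matrix, hidden_size=50):
            super().__init__()
            self.embed = nn.Embedding.from_pretrained(
                torch.tensor(embedding_matrix, dtype=torch.float), freeze=False)
            self.gru = nn.GRU(embedding_matrix.shape[1], hidden_size,
                              batch_first=True, bidirectional=True)
            self.fc = nn.Linear(2 * hidden_size, 1)   # concatenated final states of both directions

        def forward(self, token_ids):
            x = self.embed(token_ids)                 # (batch, seq_len, embed_dim)
            _, h_n = self.gru(x)                      # h_n: (2, batch, hidden_size)
            h = torch.cat([h_n[0], h_n[1]], dim=1)    # (batch, 2 * hidden_size)
            return torch.sigmoid(self.fc(h)).squeeze(1)

    # Toy usage: two made-up sequences stand in for the positive/negative training sets.
    sequences = ["ACGTACGTGGCTAGCTAGGATCCA", "TTGACGCATGCATGCAAGCTTACG"]
    corpus = [to_kmers(s) for s in sequences]

    # Step 2: pre-train k-mer vectors with word2vec (skip-gram and vector_size=50 are assumptions).
    w2v = Word2Vec(sentences=corpus, vector_size=50, window=5, min_count=1, sg=1, epochs=10)
    vocab = {kmer: i for i, kmer in enumerate(w2v.wv.index_to_key)}

    # Step 3: feed embedded k-mer indices to the bidirectional GRU classifier.
    model = KmerGRUClassifier(w2v.wv.vectors)
    batch = torch.tensor([[vocab[kmer] for kmer in corpus[0]]])
    print(model(batch))                               # predicted binding probability for one sequence

In practice the model would be trained with a binary cross-entropy loss on labelled ChIP-seq windows; the sketch only shows the tokenization, embedding and classification stages named in the abstract.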
References (0)
Cited by (0)
DeepCRISPR: optimized CRISPR guide RNA design by deep learning.
DOI:
10.1186/s13059-018-1459-4
Publication date:
2018-06-26
Journal:
Genome Biology
Impact factor:
12.3
Authors:
Chuai G;Ma H;Yan J;Chen M;Hong N;Xue D;Zhou C;Zhu C;Chen K;Duan B;Gu F;Qu S;Huang D;Wei J;Liu Q
Corresponding author:
Liu Q
A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters.
DOI:
10.1016/j.molcel.2008.11.020
Publication date:
2008-12-26
Journal:
Molecular Cell
Impact factor:
16
Authors:
Badis, Gwenael;Chan, Esther T.;van Bakel, Harm;Pena-Castillo, Lourdes;Tillo, Desiree;Tsui, Kyle;Carlson, Clayton D.;Gossett, Andrea J.;Hasinoff, Michael J.;Warren, Christopher L.;Gebbia, Marinella;Talukder, Shaheynoor;Yang, Ally;Mnaimneh, Sanie;Terterov, Dimitri;Coburn, David;Yeo, Ai Li;Yeo, Zhen Xuan;Clarke, Neil D.;Lieb, Jason D.;Ansari, Aseem Z.;Nislow, Corey;Hughes, Timothy R.
Corresponding author:
Hughes, Timothy R.
Learning to forget: Continual prediction with LSTM
DOI:
10.1162/089976600300015015
Publication date:
2000-10-01
Journal:
Neural Computation
Impact factor:
2.9
Authors:
Gers, FA;Schmidhuber, J;Cummins, F
Corresponding author:
Cummins, F
On the properties of neural machine translation: Encoder-decoder approaches
DOI:
10.3115/v1/w14-4012
Publication date:
2014-01-01
Journal:
arXiv
Impact factor:
0
Authors:
Cho, K.;van Merrienboer, B.;Bengio, Y.
Corresponding author:
Bengio, Y.
Learning phrase representations using RNN encoder-decoder for statistical machine translation
DOI:
10.3115/v1/d14-1179
Publication date:
2014-01-01
Journal:
EMNLP 2014
Impact factor:
0
Authors:
Cho, K;Van Merrienboer, B;Bengio, Y
Corresponding author:
Bengio, Y


Huang DS
Address:
--
Affiliation:
--
Email:
--