DUAL SCALE COMPUTING WITH RNA
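Basic Information
- Grant number: 7601404
- Principal investigator:
- Amount: $300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2007
- Funding country: United States
- Project period: 2007-08-01 to 2008-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract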
This subproject is one of many research subprojects utilizing the
resources provided by a Center grant funded by NIH/NCRR. The subproject and
investigator (PI) may have received primary funding from another NIH source,
and thus could be represented in other CRISP entries. The institution listed is
for the Center, which is not necessarily the institution for the investigator.
Our understanding of the role of RNA in biology has expanded enormously over the last two decades. Originally, RNA was understood to participate in protein expression as a passive carrier of genetic information (mRNA) and as an adapter molecule (tRNA) for reading the code. Then RNA was discovered to catalyze reactions: first self-splicing, then phosphodiester bond cleavage, and, recently, peptide bond formation. RNA is now known to play important roles in many diverse cellular processes, such as development, immunity, RNA editing and modification, and post-transcriptional gene regulation. RNA is also an important player in many diseases, including Prader-Willi syndrome, β-thalassemia, and myotonic dystrophy. With the increasing awareness of RNA's biological diversity, the ability to harness RNA as a tool has increased. RNA sequences can be evolved in vitro to catalyze many reactions that are not part of the natural repertoire. Antisense and RNAi can be used to modulate gene expression.

Our group is interested in developing new computational biology tools and applying these tools to understanding RNA structure and function. We are currently working on four projects with extensive computational requirements:

1) We have developed a method for finding novel non-coding RNA (ncRNA) genes in genomic sequence. These are RNA sequences that function directly, without coding for a protein. Our method relies on our Dynalign algorithm to determine the lowest free energy structure common to two unaligned sequences (Mathews, 2005; Mathews & Turner, 2002). Folding free energies, as determined by nearest neighbor parameters (Mathews et al., 2004; Mathews et al., 1999), of two ncRNA sequences are significantly lower than folding free energies of random sequences of identical dinucleotide-frequency content. Dynalign requires no sequence identity to find the common structure. We have therefore found that we can discover homologous ncRNA genes in crudely aligned genome fragments with greater sensitivity than other methods, especially at low sequence identity.

2) We have incorporated nudged elastic band (NEB) (Jónsson et al., 1998) into the AMBER molecular dynamics package (Pearlman et al., 1995) in collaboration with Dr. David Case, The Scripps Research Institute. NEB provides a timescale-independent method for finding low-energy pathways for conformational changes. We are currently applying NEB to understanding strand invasion, by which intermolecular base pairs displace intramolecular base pairs. This is important for understanding RNAi and antisense mechanisms.

3) We are using free energy calculations to understand the nature of SHAPE mapping of RNA structure. SHAPE maps RNA structures using N-methylisatoic anhydride to acylate the 2'-OH of flexible RNA nucleotides (Merino et al., 2005; Wilkinson et al., 2005). It is known that the acylation reaction rate depends on the pKa of the 2'-OH, but it is unclear why flexible nucleotides have lower pKa values. Our simulations with AMBER using tRNA structures will be used to determine the connection.

4) We are developing new methods based on the Jarzynski equality to determine unfolding free energies of RNA hairpins on reasonable timescales (Jarzynski, 1997). Our goal is to test whether the AMBER force field can reproduce unfolding free energies found experimentally by mechanical pulling (Liphardt et al., 2002; Liphardt et al., 2001). Ultimately, we will model the molecular-level details of mechanical unfolding of RNA.

For this Development Application to TeraGrid, we propose to use most of the allocation in finding non-coding RNA sequences (project 1 above). To start, we will scan an alignment of the E. coli genome to S. Typhi that we have constructed using MUMmer 2 (Kurtz et al., 2004). We estimate that this will require 6,000 CPU hours on a 3 GHz Intel Pentium 4.
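The significance test behind project 1 (a native folding free energy compared against random sequences of identical dinucleotide content) can be illustrated with a simple z-score. This is only a minimal sketch under stated assumptions: the function names, the z-score statistic, the -2.0 threshold, and the energy values are hypothetical and are not taken from Dynalign or the cited papers. In practice the energies would come from a structure-prediction program and the controls from a dinucleotide-preserving shuffle.

```python
from statistics import mean, stdev


def free_energy_z_score(native_dg, control_dgs):
    """Z-score of a native folding free energy against control energies.

    native_dg: folding free energy of the candidate (kcal/mol).
    control_dgs: folding free energies of shuffled control sequences.
    A strongly negative z-score means the candidate folds more stably
    than its controls.
    """
    if len(control_dgs) < 2:
        raise ValueError("need at least two control energies")
    sigma = stdev(control_dgs)  # sample standard deviation
    if sigma == 0:
        raise ValueError("control energies have zero variance")
    return (native_dg - mean(control_dgs)) / sigma


def looks_structured(native_dg, control_dgs, threshold=-2.0):
    """Flag a candidate whose z-score is at or below the threshold.

    The threshold is an illustrative assumption, not a published cutoff.
    """
    return free_energy_z_score(native_dg, control_dgs) <= threshold


# Hypothetical energies in kcal/mol, for illustration only.
controls = [-20.0, -22.0, -18.0, -21.0, -19.0]
print(free_energy_z_score(-30.0, controls))
print(looks_structured(-30.0, controls))
```

A real scan would apply such a test to each window of the genome alignment, which is where the CPU-hour estimate comes from: the cost is dominated by the structure prediction for every window and its controls, not by the statistic itself.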
We also plan to use the balance of the allocation to benchmark our methods outlined in projects 2-4 and to start the scan of another genome for ncRNA genes. This will allow us to prepare a Medium Resource Allocation request for TeraGrid resources in the next 4 to 6 months.

References:
Jarzynski, C. (1997). Nonequilibrium equality for free energy differences. Phys. Rev. Lett. 78, 2690-2693.
Jónsson, H., Mills, G. & Jacobsen, K. W. (1998). Nudged elastic band method for finding minimum energy paths of transitions. In Classical and Quantum Dynamics in Condensed Phase Simulations (Berne, B. J., Ciccotti, G. & Coker, D. F., eds.), pp. 385-404. World Scientific, Singapore.
Kurtz, S., Phillippy, A., Delcher, A. L., Smoot, M., Shumway, M., Antonescu, C., et al. (2004). Versatile and open software for comparing large genomes. Genome Biol. 5, R12.
Liphardt, J., Dumont, S., Smith, S. B., Tinoco, I., Jr. & Bustamante, C. (2002). Equilibrium information from nonequilibrium measurements in an experimental test of Jarzynski's equality. Science 296, 1832-5.
Liphardt, J., Onoa, B., Smith, S. B., Tinoco, I., Jr. & Bustamante, C. (2001). Reversible unfolding of single RNA molecules by mechanical force. Science 292, 733-7.
Mathews, D. H. (2005). Predicting a set of minimal free energy RNA secondary structures common to two sequences. Bioinformatics 21, 2246-2253.
Mathews, D. H., Disney, M. D., Childs, J. L., Schroeder, S. J., Zuker, M. & Turner, D. H. (2004). Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc. Natl. Acad. Sci. USA 101, 7287-7292.
Mathews, D. H., Sabina, J., Zuker, M. & Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters provides improved prediction of RNA secondary structure. J. Mol. Biol. 288, 911-940.
Mathews, D. H. & Turner, D. H. (2002). Dynalign: An algorithm for finding the secondary structure common to two RNA sequences. J. Mol. Biol. 317, 191-203.
Merino, E. J., Wilkinson, K. A., Coughlan, J. L. & Weeks, K. M. (2005). RNA structure analysis at single nucleotide resolution by selective 2'-hydroxyl acylation and primer extension (SHAPE). J. Am. Chem. Soc. 127, 4223-31.
Pearlman, D. A., Case, D. A., Caldwell, J. W., Ross, W. S., Cheatham, T. E., III, DeBolt, S., et al. (1995). AMBER, a package of computer programs for applying molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules. Comp. Phys. Commun. 91, 1-41.
Wilkinson, K. A., Merino, E. J. & Weeks, K. M. (2005). RNA SHAPE chemistry reveals nonhierarchical interactions dominate equilibrium structural transitions in tRNA(Asp) transcripts. J. Am. Chem. Soc. 127, 4659-67.
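For reference, the Jarzynski equality cited in project 4 relates a free energy difference ΔF to an exponential average of the work W over repeated nonequilibrium pulling trajectories: exp(-ΔF/kT) = <exp(-W/kT)>. A minimal sketch of the resulting estimator follows. The function name and the sample work values are hypothetical, and kT ≈ 0.593 kcal/mol assumes a temperature of roughly 298 K; this is the textbook estimator, not necessarily the new method the abstract proposes.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate a free energy difference from nonequilibrium work values.

    Implements dF = -kT * ln( mean( exp(-W / kT) ) ).
    works: work values from repeated trajectories, same units as kT.
    kT: thermal energy (Boltzmann constant times temperature).
    """
    if not works:
        raise ValueError("need at least one work value")
    exponents = [-w / kT for w in works]
    # Log-sum-exp shift: subtract the largest exponent before
    # exponentiating so that large work values do not underflow.
    shift = max(exponents)
    total = sum(math.exp(x - shift) for x in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -kT * log_mean


# Hypothetical work values in kcal/mol, for illustration only.
sample_works = [4.0, 5.0, 6.5]
estimate = jarzynski_free_energy(sample_works, kT=0.593)
print(estimate, sum(sample_works) / len(sample_works))
```

The estimate never exceeds the mean work, consistent with the second law, and it is dominated by the rare low-work trajectories. That dependence on rare events is why many trajectories are usually needed, and why obtaining unfolding free energies on reasonable timescales is a computational problem.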
Project Outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available
Data last updated: 2024-06-01
Other Grants of David Mathews
Novel Combined Costimulation and CD122 Blockade in Islet Transplantation
- Grant number: 9124430
- Fiscal year: 2016
- Amount: $300
- Project category:
Assertive Community Living for Appalachian Dual-Diagnosed Adults
- Grant number: 7419366
- Fiscal year: 2006
- Amount: $300
- Project category:
Assertive Community Living for Appalachian Dual-Diagnosed Adults
- Grant number: 7465558
- Fiscal year: 2006
- Amount: $300
- Project category:
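Similar NSFC (National Natural Science Foundation of China) Grants
Convex relaxation and high- and low-order accelerated algorithms for distributed nonconvex, nonsmooth optimization problems
- Grant number: 12371308
- Approval year: 2023
- Amount: CNY 435,000
- Project category: General Program
Design and hardware implementation of ensemble learning algorithms under resource constraints
- Grant number: 62372198
- Approval year: 2023
- Amount: CNY 500,000
- Project category: General Program
Fast algorithms for electromagnetic fields based on physics-informed neural networks
- Grant number: 52377005
- Approval year: 2023
- Amount: CNY 520,000
- Project category: General Program
SPH model and efficient algorithms for deformation and flow of saturated sand considering pile-soil-water coupling effects
- Grant number: 12302257
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Ensemble classification algorithms for high-dimensional imbalanced data
- Grant number: 62306119
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund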
Similar Overseas Grants
New Generation of General AMBER Force Field for Biomedical Research
- Grant number: 10798829
- Fiscal year: 2022
- Amount: $300
- Project category:
New Generation of General AMBER Force Field for Biomedical Research
- Grant number: 10709551
- Fiscal year: 2022
- Amount: $300
- Project category:
New Generation of General AMBER Force Field for Biomedical Research
- Grant number: 10503886
- Fiscal year: 2022
- Amount: $300
- Project category:
Combining molecular dynamics simulations with crystallographic refinement
- Grant number: 9243621
- Fiscal year: 2017
- Amount: $300
- Project category:
Cerebral Oxygen Metabolism and Functional Network Architecture in Pediatric Sickle Cell Disease
- Grant number: 9295569
- Fiscal year: 2017
- Amount: $300
- Project category: