Exploiting Differentiable Programming Models For Protein Structure Prediction And Modelling

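Basic information

  • Grant number:
    BB/W008556/1
  • Principal investigator:
  • Amount:
    $517,900
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Duration:
    2022 to (no data)
  • Project status:
    Ongoing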

Project abstract

Proteins are molecules present in every cell that carry out essential biological processes. These molecules are essentially strings of simpler chemicals, called amino acids, and these strings are able to self-assemble into a unique 3-D structure as soon as the protein is made by the cell's protein-making machinery (the ribosomes). It is this unique structure that determines the function of the protein (i.e. what it does in the cell and how it does it). By shining X-rays on crystallised proteins, scientists can determine their structure by looking at how the rays reflect off the layers of atoms that make up the crystal. However, this process can take many months or even years of effort. With hundreds of thousands of proteins for which the native structure is unknown, it is not surprising that scientists are keen to find clever shortcuts to working out the structure of proteins. We, like many other scientists, have been trying to decipher the so-called protein folding "code", i.e. trying to work out the rules which govern how the protein finds its unique structure, and then trying to program a computer with these rules to allow scientists to quickly "predict" what the structure of their protein of interest might be.
Although the shape or "fold" of a single protein is an important piece of information, it is arguably even more useful to determine which proteins interact with a given protein of interest, and the geometry of the resulting protein complexes, i.e. groups of proteins which have evolved to stick together in a very specific way. Good examples of such complexes are found in many areas of biology and medicine. For example, a number of different protein complexes play a crucial role in controlling how blood clots. In general, protein-protein complexes underlie our whole understanding of how cells and organisms operate as "systems", a field known as "systems biology".
Unfortunately, experimentally studying the structure of a protein complex is even more difficult than studying the structure of a single protein, and so scientists have an urgent need for better computational tools to allow them to predict which proteins could interact and the likely overall shape of the complex that they form.
In this project, we propose to exploit some recent breakthroughs in computing and artificial intelligence to allow us to deduce which parts of proteins are likely to interact and the structures of the complexes that they form when they do. In a nutshell, we start by looking for pairs of amino acids that appear to change in synchrony when we compare the different versions of the proteins found in different organisms, i.e. we look for cases where a change in one amino acid always seems to occur when we see another amino acid changing. These linked changes are called "correlated mutations", and when we find them, we can be reasonably sure that the two amino acids have evolved to be close together in 3-D space in the final folded form of the protein. If we find enough correlated mutations, we can even go as far as predicting the complete structure of the protein, and we hope to predict the structure of a protein-protein complex in a similar way. To do this we will use a new approach to writing computer software called "differentiable programming". This means that our computer programs are treated like mathematical formulae which can be improved by applying basic rules of calculus. In this way, the accuracy of our methods can be automatically improved as more data is obtained to optimise the algorithms.
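As a rough illustration of what "correlated mutations" means, the sketch below scores every pair of columns in a toy sequence alignment by their mutual information, so that columns which change together score highest. Mutual information is used here only as a simple stand-in; the abstract does not say which statistic the project uses, and the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    # Frequencies of each amino acid in column i, column j, and of each pair.
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_column_pairs(alignment):
    """Return ((i, j), score) tuples sorted from most to least co-varying."""
    length = len(alignment[0])
    scores = {
        (i, j): column_mutual_information(alignment, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: the same protein from six organisms.
# Columns 0 and 1 always change together (A with K, G with R).
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD", "AKLE", "GRLD"]

ranking = rank_column_pairs(toy_alignment)
best_pair, best_score = ranking[0]
print(best_pair, round(best_score, 3))
```

In this toy case the top-ranked pair is columns 0 and 1, which is the pair constructed to co-vary. Real alignments are noisier, so a raw score like this would need further statistical correction before being read as evidence of 3-D contact.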
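The idea that a program can be "treated like a mathematical formula and improved by calculus" can be made concrete with a very small sketch. Here an ordinary Python function is differentiated automatically using dual numbers (forward-mode automatic differentiation), and the derivative is used to tune a single parameter by gradient descent. This is a minimal illustration under stated assumptions, not the project's software: the `Dual` class, the `predict`, `loss` and `fit` functions, and the toy data are all hypothetical.

```python
class Dual:
    """A value together with its derivative with respect to one parameter."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def predict(weight, x):
    """The 'program' being tuned: ordinary arithmetic on its inputs."""
    return weight * x


def loss(weight, data):
    """Sum of squared errors between predictions and targets."""
    total = Dual(0.0)
    for x, y in data:
        error = predict(weight, x) - y
        total = total + error * error
    return total


def fit(data, weight=0.0, learning_rate=0.01, steps=200):
    """Gradient descent driven by the automatically computed derivative."""
    for _ in range(steps):
        # Seeding deriv=1.0 marks `weight` as the variable to differentiate.
        result = loss(Dual(weight, 1.0), data)
        weight -= learning_rate * result.deriv
    return weight


# Hypothetical toy data generated by y = 2 * x.
toy_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
fitted_weight = fit(toy_data)
print(round(fitted_weight, 6))
```

No derivative is written by hand for `loss`; it falls out of running the program on `Dual` values. Practical systems apply the same principle to models with millions of parameters, typically using reverse-mode differentiation.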

Project outputs

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Machine learning methods for predicting protein structure from single sequences.
  • DOI:
    10.1016/j.sbi.2023.102627
  • Publication date:
    2023-06
  • Journal:
  • Impact factor:
    6.8
  • Authors:
    S. M. Kandathil;Andy M. Lau;David T. Jones
  • Corresponding authors:
    S. M. Kandathil;Andy M. Lau;David T. Jones
Merizo: a rapid and accurate domain segmentation method using invariant point attention
  • DOI:
    10.1101/2023.02.19.529114
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lau A
  • Corresponding author:
    Lau A
Merizo: a rapid and accurate protein domain segmentation method using invariant point attention.
  • DOI:
    10.1038/s41467-023-43934-4
  • Publication date:
    2023-12-19
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Lau, Andy M.;Kandathil, Shaun M.;Jones, David T.
  • Corresponding author:
    Jones, David T.

Other publications by David Jones

Use of Computers in Assessment: A Potential Solution to the Documentation Dilemma of the Activities Coordinator
  • DOI:
  • Publication date:
    1986
  • Journal:
  • Impact factor:
    0
  • Authors:
    K. Halberg;Lisa E. Duncan;N. Z. Mitchell;F. Hendrick;David Jones
  • Corresponding author:
    David Jones
Central Stars of Planetary Nebulae
Air Toxics Under The Big Sky – A High School Science Teaching Tool
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    David Jones;T. Ward;D. Vanek;Nancy Marra;C. Noonan;Garon C. Smith;Earle Adams
  • Corresponding author:
    Earle Adams
Climate change and the prescription pad
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    C. Richie;A. Kesselheim;David Jones
  • Corresponding author:
    David Jones
An experimental study into the effects of positive subliminal priming and its effect on peoples levels of happiness
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    David Jones
  • Corresponding author:
    David Jones

Other grants by David Jones

Open Access Block Award 2024 - The Francis Crick Institute
  • Grant number:
    EP/Z531844/1
  • Fiscal year:
    2024
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Open Access Block Award 2023 - The Francis Crick Institute
  • Grant number:
    EP/Y530360/1
  • Fiscal year:
    2023
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Open Access Block Award 2022 - The Francis Crick Institute
  • Grant number:
    EP/X526381/1
  • Fiscal year:
    2022
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Accelerating and enhancing the PSIPRED Workbench with deep learning
  • Grant number:
    BB/T019409/1
  • Fiscal year:
    2021
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Statewide effort to diversify undergraduate engineering student population.
  • Grant number:
    1848696
  • Fiscal year:
    2018
  • Funding amount:
    $517,900
  • Project category:
    Standard Grant
Cross Disciplinary Thinking about 'Antisocial Personality Disorder'.
  • Grant number:
    ES/L000911/2
  • Fiscal year:
    2017
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
ANAMMARKS: ANaerobic AMmonium oxidiation bioMARKers in paleoenvironmentS
  • Grant number:
    NE/N011112/1
  • Fiscal year:
    2016
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Newcastle University Confidence in Concept 2014
  • Grant number:
    MC_PC_14101
  • Fiscal year:
    2015
  • Funding amount:
    $517,900
  • Project category:
    Intramural
Expansion and Further Development of the PSIPRED Protein Structure and Function Bioinformatics Workbench
  • Grant number:
    BB/M011712/1
  • Fiscal year:
    2015
  • Funding amount:
    $517,900
  • Project category:
    Research Grant
Large area two dimensional mapping of carbon dioxide fluxes for assessment and control of carbon capture and storage project
  • Grant number:
    ST/L00626X/1
  • Fiscal year:
    2014
  • Funding amount:
    $517,900
  • Project category:
    Research Grant

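Similar National Natural Science Foundation of China (NSFC) grants

End-to-end hybrid design of complex refractive-diffractive lenses based on differentiable ray tracing
  • Grant number:
  • Approval year:
    2023
  • Funding amount:
    CNY 480,000
  • Project category:
Differentiable 3-D radiative transfer modelling and high-resolution canopy parameter retrieval
  • Grant number:
    42371345
  • Approval year:
    2023
  • Funding amount:
    CNY 520,000
  • Project category:
    General Program
Integrability and dynamics of three-dimensional differential systems
  • Grant number:
    12301205
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
Research on high-efficiency differentiable Monte Carlo ray-tracing rendering algorithms and systems
  • Grant number:
    62372257
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program
Theory and application of differentiable simulation with virtual-real fusion for legged robots
  • Grant number:
    62373242
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program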

Similar overseas grants

CAREER: Differentiable Programming for Visual Computing
  • Grant number:
    2238839
  • Fiscal year:
    2023
  • Funding amount:
    $517,900
  • Project category:
    Continuing Grant
ELEMENTS: CLAD ENABLING DIFFERENTIABLE PROGRAMMING IN SCIENCE
  • Grant number:
    2311471
  • Fiscal year:
    2023
  • Funding amount:
    $517,900
  • Project category:
    Standard Grant
Differentiable Programming for Computer Vision and Medical Image Analysis
  • Grant number:
    RGPIN-2020-04139
  • Fiscal year:
    2022
  • Funding amount:
    $517,900
  • Project category:
    Discovery Grants Program - Individual
Collaborative Research: Frameworks: Convergence of Bayesian inverse methods and scientific machine learning in Earth system models through universal differentiable programming
  • Grant number:
    2103791
  • Fiscal year:
    2021
  • Funding amount:
    $517,900
  • Project category:
    Standard Grant
Collaborative Research: Frameworks: Convergence of Bayesian inverse methods and scientific machine learning in Earth system models through universal differentiable programming
  • Grant number:
    2104009
  • Fiscal year:
    2021
  • Funding amount:
    $517,900
  • Project category:
    Standard Grant