3D-Gateway - Gateway to protein structure and function

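Basic information

  • Grant number:
    BB/S020144/1
  • Principal investigator:
  • Amount:
    $373,700
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2020
  • Funding country:
    United Kingdom
  • Start and end dates:
    2020 to (no data)
  • Project status:
    Completed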

Project abstract

Proteins comprise long chains of organic molecules that fold into compact, globular three-dimensional structures. Knowing this structure can give valuable insights into the clefts, pockets or other surface features important for binding other molecules in the cell, e.g. small molecules or proteins. Knowledge of the structure is also essential for designing drugs that bind to these features and inhibit the protein, and it can help in understanding whether mutations in the protein's residues affect its stability or function, leading to disease. Experimentally determining the structure can be challenging, which is why only a small percentage of known proteins (~145,000 out of 120 million) have been characterised. However, powerful computational methods have been developed that predict protein structures by inheriting structural information from evolutionarily related proteins whose structures are known. These prediction techniques have recently become even more powerful, as new ways of exploiting the evolutionary data have been found that more accurately constrain contacts in the protein. Applying these techniques, structures can be predicted for a large proportion of uncharacterised proteins. For example, about 5% of human protein structures are known, but a further 88% can be modelled, some to very high accuracy, thereby providing important frameworks for designing drugs to treat human diseases. When inheriting structural data between distant relatives one has to be much more cautious, and most prediction methods return a confidence score for the models produced. This project will build an infrastructure (3D-Beacons) that aggregates experimentally determined structures with predicted structures generated by groups applying different algorithms. This will be done for proteins from selected organisms relevant to food security and human health; some will be pathogenic bacteria that threaten humans, animals or crops.
We will use this data to annotate proteins in the UniProt resource, which is used by more than 750,000 unique users each month. Since the prediction methods reside in many different labs, pooling the data in this way can significantly increase the number of proteins with structural data. In addition, combining models built by independent algorithms allows us to compare 3D models and find which parts agree regardless of method and which parts vary between methods and are clearly harder to model. We will therefore use this aggregated data to research the best strategies for calculating model quality at each position in the protein.
We will build web pages to display the known and predicted structures for a given protein. It can be difficult to determine the structure of the whole protein so, where appropriate, we will display both experimental and predicted structures, taking great care to label the structures with information on the source (e.g. method used) and reliability of the data (e.g. confidence).
We will also use our 3D-Beacons infrastructure to aggregate information on known and predicted functional sites on the protein structure and display this data on web pages, together with information on source and confidence. The site data mapped onto structure will be particularly helpful for developing rules that allow us to gauge whether a protein with no experimental characterisation has the same function as an evolutionarily related protein with experimental characterisation. Relatives sharing the same function should have the same key functional site residues. With these rules we will be able to provide structural and functional annotations for millions of proteins in UniProt. The new data will represent a tenfold or greater increase in the number of UniProt sequences that have structural and functional site information. UniProt is also widely used by researchers in industry, so this expansion in information will have a very significant impact.
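The idea of comparing models from independent algorithms to see which positions agree can be illustrated with a small sketch. This is not the project's method: it assumes the models are already superposed in a common frame, cover the same residues in the same order, and are reduced to one C-alpha coordinate per residue. The function names and the 2.0 cutoff are hypothetical choices for illustration only.

```python
from math import dist
from statistics import mean
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def per_residue_spread(models: Sequence[Sequence[Coord]]) -> List[float]:
    """Mean distance of each model's C-alpha atom from the across-model
    centroid, one value per residue (same units as the coordinates)."""
    if len(models) < 2:
        raise ValueError("need at least two models to compare")
    n_res = len(models[0])
    if any(len(m) != n_res for m in models):
        raise ValueError("all models must cover the same residues")
    spreads = []
    for i in range(n_res):
        atoms = [m[i] for m in models]
        # Centroid of this residue's position across all models.
        centroid = tuple(mean(a[k] for a in atoms) for k in range(3))
        spreads.append(mean(dist(a, centroid) for a in atoms))
    return spreads


def variable_positions(spreads: Sequence[float], cutoff: float = 2.0) -> List[int]:
    """1-based residue positions whose spread exceeds the (hypothetical) cutoff."""
    return [i + 1 for i, s in enumerate(spreads) if s > cutoff]


# Three toy models that agree on residues 1-2 and diverge at residue 3.
models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, -4.0, 0.0)],
]
spreads = per_residue_spread(models)
print(variable_positions(spreads))  # [3]
```

A low spread only shows that the methods agree with each other, not that they are correct, so a score like this would complement rather than replace the confidence values each prediction method reports.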

Project outputs

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
CATHe: detection of remote homologues for CATH superfamilies using embeddings from protein language models.
  • DOI:
    http://dx.doi.org/10.1093/bioinformatics/btad029
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Nallapareddy V
  • Corresponding author:
    Nallapareddy V
PDBe: improved findability of macromolecular structure data in the PDB.
  • DOI:
    http://dx.doi.org/10.1093/nar/gkz990
  • Publication year:
    2020
  • Journal:
  • Impact factor:
    14.9
  • Authors:
    Armstrong DR
  • Corresponding author:
    Armstrong DR
The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies.
  • DOI:
    http://dx.doi.org/10.1093/bib/bbaa362
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    9.5
  • Authors:
    Waman VP
  • Corresponding author:
    Waman VP
AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms.
  • DOI:
    http://dx.doi.org/10.1038/s42003-023-04488-9
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    5.9
  • Authors:
    Bordin N
  • Corresponding author:
    Bordin N
Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs.
  • DOI:
    http://dx.doi.org/10.1093/bib/bbac187
  • Publication year:
    2022
  • Journal:
  • Impact factor:
    9.5
  • Authors:
    Sen N
  • Corresponding author:
    Sen N

Other grants held by Christine Orengo

Improving accuracy, coverage, and sustainability of functional protein annotation in InterPro, Pfam and FunFam using Deep Learning methods PID 7012435
  • Grant number:
    BB/X018563/1
  • Fiscal year:
    2024
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
BBSRC-NSF/BIO: An AI-based domain classification platform for 200 million 3D-models of proteins to reveal protein evolution
  • Grant number:
    BB/Y001117/1
  • Fiscal year:
    2024
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
ProtFunAI: AI based methods for functional annotation of proteins in crop genomes
  • Grant number:
    BB/Y514044/1
  • Fiscal year:
    2024
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
Unlocking the chemical potential of plants: Predicting function from DNA sequence for complex enzyme superfamilies
  • Grant number:
    BB/V014722/1
  • Fiscal year:
    2022
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
Transforming the Structural Landscape of CATH to Aid Variant Analyses in Human and Agricultural Organisms and their Pathogens
  • Grant number:
    BB/W018802/1
  • Fiscal year:
    2022
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
CATH-FunVar - Predicting Viral and Human Variants Affecting COVID-19 Susceptibility and Severity and Repurposing Therapeutics
  • Grant number:
    BB/W003368/1
  • Fiscal year:
    2021
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
BBSRC-NSF/BIO Expanding the fold library in the twilight zone to facilitate structure determination of macromolecular machines
  • Grant number:
    BB/S016007/1
  • Fiscal year:
    2020
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
Exploiting data driven computational approaches for understanding protein structure and function in InterPro and Pfam
  • Grant number:
    BB/S020039/1
  • Fiscal year:
    2020
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
SENSE - Screening of ENvironmental SEquences to discover novel protein functions, using informatics target selection and high-throughput validation
  • Grant number:
    BB/T002735/1
  • Fiscal year:
    2020
  • Funding amount:
    $373,700
  • Project category:
    Research Grant
Increasing the Coverage and Accuracy of CATH for Comparative Genomics and Variant Interpretation
  • Grant number:
    BB/R014892/1
  • Fiscal year:
    2018
  • Funding amount:
    $373,700
  • Project category:
    Research Grant

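Similar grants from the National Natural Science Foundation of China

Research on key technologies for integrated sensing and communication low-power wide-area networks for three-dimensional coverage
  • Grant number:
    62372307
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program
Research on key technologies for the Internet of Vehicles jointly driven by knowledge and data
  • Grant number:
    62371309
  • Approval year:
    2023
  • Funding amount:
    CNY 530,000
  • Project category:
    General Program
Research on the impact of blockchain-embedded innovation ecosystems on breakthroughs in key core technologies of the industrial internet
  • Grant number:
    72372074
  • Approval year:
    2023
  • Funding amount:
    CNY 420,000
  • Project category:
    General Program
Research on key technologies for textile sensor networks supporting the monitoring of compensatory movements of the shoulder joint complex
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
Research on key technologies for the Internet of Vehicles oriented toward connected intelligence
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    CNY 550,000
  • Project category:
    General Program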

Similar overseas grants

The Serine Protease HTRA1 Antigen: A Gateway to Elucidating Membranous Nephropathy Pathogenesis and the Targeting of Antigen Epitopes
  • Grant number:
    10740614
  • Fiscal year:
    2023
  • Funding amount:
    $373,700
  • Project category:
AMD-Patient-Derived hiPSC-RPE: Gateway for Assessing Novel and Emerging Modulators of Autophagy
  • Grant number:
    10283447
  • Fiscal year:
    2021
  • Funding amount:
    $373,700
  • Project category:
AMD-Patient-Derived hiPSC-RPE: Gateway for Assessing Novel and Emerging Modulators of Autophagy
  • Grant number:
    10487506
  • Fiscal year:
    2021
  • Funding amount:
    $373,700
  • Project category:
The wandering nerve: gateway to boost Alzheimer's disease related cognitive performance
  • Grant number:
    10612830
  • Fiscal year:
    2021
  • Funding amount:
    $373,700
  • Project category:
The wandering nerve: gateway to boost Alzheimer's disease related cognitive performance
  • Grant number:
    10398966
  • Fiscal year:
    2021
  • Funding amount:
    $373,700
  • Project category: