ProtFunAI: AI based methods for functional annotation of proteins in crop genomes
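Basic information
- Grant number: BB/Y514044/1
- Principal investigator:
- Amount: $324,300
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2024
- Funding country: United Kingdom
- Duration: 2024 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract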
Our project will 'build on existing links and deepen existing relationships' between the two groups pioneering the development of AI/Deep-Learning models for proteins (Rost Group, TUM) and the application of these models to protein domain families (Orengo Group, UCL). It will leverage world-leading expertise in protein Language Models (pLMs) to accelerate the scientific discovery of protein functions in the genomes of key agricultural crops important for food security. However, our approaches will be generic and rolled out to all UniProt proteins through existing collaborations.
Synergies between both groups have evolved over several collaborations. Since 2019, ground-breaking results have come from tuning the pLMs developed in the Rost Group (e.g. the ProtTrans series, incl. ProtT5, ProtTucker) with protein family and functional family data (CATH superfamilies and FunFams) generated and maintained by the Orengo Group.
The partnership proposed here would allow researchers in the Rost and Orengo Groups to intensify exchanges by visiting each other's labs and interacting more comprehensively to design more effective protocols that enhance (1) protein homologue detection, (2) protein function prediction, and (3) protein functional site prediction.
The Orengo and Rost Groups began collaborating in 2000, working together on protein family analysis for target identification in the NIH-funded USA Structural Genomics initiative (PSI), which ended in 2015 [21-23]. Subsequently, funding from the German BMBF (Federal German Research Ministry) and DFG (German Research Foundation) supported visits of PhD and Masters students from both groups and resulted in the development of new approaches for protein function prediction [14,15]. This application seeks funds to continue these collaborations and leverage the latest advances in AI/Deep Learning. The Rost Group recently enhanced their pLMs significantly (ProstT5 [18]), and the funding would allow us to apply ProstT5 to exploit the hugely expanded CATH classification, which is currently integrating hundreds of millions of predicted protein structures from the AlphaFold portal (AFDB).
The application is very timely as it will address key BBSRC strategic priorities around data-intensive biology and AI and the important challenge of food security. We will apply improved function prediction methods to significantly increase the functional annotations of plant genomes. This will bring 'new knowledge about key biological principles and mechanisms using AI-based approaches', bring 'AI in sustainable agriculture and food', and enable 'smart agriculture' by identifying genes implicated in biological systems associated with growth and stress resistance, e.g. drought and antimicrobial resistance. Most genes (typically >90%) from plants valuable as crops (e.g. wheat, maize, rice, sorghum) are experimentally uncharacterized or very poorly annotated. Our methods will be state-of-the-art so that they can accurately guide experimental validation.
We will disseminate the annotations using our established web-based CATH resource, accessed by over 27,000 users/month. Since CATH data is also disseminated by PDB, UniProt and InterPro, the predictions will be accessible to more than 900,000 users/month. We will also work closely with collaborators in the UK researching plant genomes to get feedback and solicit experimental validation where possible.
The project will significantly enhance the AI/ML skills of UK-based researchers in the Orengo Group, whose prior training was largely in biology. Conversely, the more AI-focused members of the Rost Group will deepen their understanding of individual proteins, organisms, and evolution. German scholars will also dive deeper into the workings of UK-based resources.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Christine Orengo
Globalization: Approaches to Diversities
- DOI:
- Publication year: 2012
- Journal:
- Impact factor: 0
- Authors: Benoit H Dessailly; Natalie L Dawson; Kenji Mizuguchi; Christine Orengo; Hector Cuadra-Montiel
- Corresponding author: Hector Cuadra-Montiel
Other grants held by Christine Orengo
BBSRC-NSF/BIO: An AI-based domain classification platform for 200 million 3D-models of proteins to reveal protein evolution
- Grant number: BB/Y001117/1
- Fiscal year: 2024
- Amount: $324,300
- Project type: Research Grant
Improving accuracy, coverage, and sustainability of functional protein annotation in InterPro, Pfam and FunFam using Deep Learning methods PID 7012435
- Grant number: BB/X018563/1
- Fiscal year: 2024
- Amount: $324,300
- Project type: Research Grant
Transforming the Structural Landscape of CATH to Aid Variant Analyses in Human and Agricultural Organisms and their Pathogens
- Grant number: BB/W018802/1
- Fiscal year: 2022
- Amount: $324,300
- Project type: Research Grant
Unlocking the chemical potential of plants: Predicting function from DNA sequence for complex enzyme superfamilies
- Grant number: BB/V014722/1
- Fiscal year: 2022
- Amount: $324,300
- Project type: Research Grant
CATH-FunVar - Predicting Viral and Human Variants Affecting COVID-19 Susceptibility and Severity and Repurposing Therapeutics
- Grant number: BB/W003368/1
- Fiscal year: 2021
- Amount: $324,300
- Project type: Research Grant
3D-Gateway - Gateway to protein structure and function
- Grant number: BB/S020144/1
- Fiscal year: 2020
- Amount: $324,300
- Project type: Research Grant
Exploiting data driven computational approaches for understanding protein structure and function in InterPro and Pfam
- Grant number: BB/S020039/1
- Fiscal year: 2020
- Amount: $324,300
- Project type: Research Grant
SENSE - Screening of ENvironmental SEquences to discover novel protein functions, using informatics target selection and high-throughput validation
- Grant number: BB/T002735/1
- Fiscal year: 2020
- Amount: $324,300
- Project type: Research Grant
BBSRC-NSF/BIO Expanding the fold library in the twilight zone to facilitate structure determination of macromolecular machines
- Grant number: BB/S016007/1
- Fiscal year: 2020
- Amount: $324,300
- Project type: Research Grant
Increasing the Coverage and Accuracy of CATH for Comparative Genomics and Variant Interpretation
- Grant number: BB/R014892/1
- Fiscal year: 2018
- Amount: $324,300
- Project type: Research Grant
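Similar grants from the National Natural Science Foundation of China (NSFC)
AI-based prediction of drug response and recommendation of individualized medication regimens for type 2 diabetes
- Grant number: 82373790
- Year approved: 2023
- Amount: CNY 490,000
- Project type: General Program
Flash flood simulation methods for data-scarce catchments based on physics-constrained artificial intelligence
- Grant number: 42371086
- Year approved: 2023
- Amount: CNY 520,000
- Project type: General Program
Prediction and mechanisms of PD-L1 expression evolution in colorectal cancer based on multimodal molecular imaging and artificial intelligence
- Grant number: 82302185
- Year approved: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Precision guidance of SBRT combined with ICI therapy for early-stage lung cancer based on functional clustering of TILs and an AI workflow
- Grant number: 82373424
- Year approved: 2023
- Amount: CNY 490,000
- Project type: General Program
Research on microstructured optical fibers based on artificial intelligence
- Grant number: 62375013
- Year approved: 2023
- Amount: CNY 540,000
- Project type: General Program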
Similar overseas grants
I-Corps: Centralized, Cloud-Based, Artificial Intelligence (AI) Video Analysis for Enhanced Intubation Documentation and Continuous Quality Control
- Grant number: 2405662
- Fiscal year: 2024
- Amount: $324,300
- Project type: Standard Grant
CRII: CPS: FAICYS: Model-Based Verification for AI-Enabled Cyber-Physical Systems Through Guided Falsification of Temporal Logic Properties
- Grant number: 2347294
- Fiscal year: 2024
- Amount: $324,300
- Project type: Standard Grant
BBSRC-NSF/BIO: An AI-based domain classification platform for 200 million 3D-models of proteins to reveal protein evolution
- Grant number: BB/Y000455/1
- Fiscal year: 2024
- Amount: $324,300
- Project type: Research Grant
BBSRC-NSF/BIO: An AI-based domain classification platform for 200 million 3D-models of proteins to reveal protein evolution
- Grant number: BB/Y001117/1
- Fiscal year: 2024
- Amount: $324,300
- Project type: Research Grant
I(eye)-SCREEN: A real-world AI-based infrastructure for screening and prediction of progression in age-related macular degeneration (AMD) providing accessible shared care
- Grant number: 10102692
- Fiscal year: 2024
- Amount: $324,300
- Project type: EU-Funded