SIFTER: A Systems Biology Platform for Protein Function Prediction

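Basic Information

  • Award number:
    1122732
  • Principal investigator:
  • Amount:
    $240,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Fellowship Award
  • Fiscal year:
    2011
  • Funding country:
    United States
  • Project period:
    2011-09-01 to 2014-08-31
  • Project status:
    Completed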

Project Abstract

Proteins are key biomolecules involved in virtually all processes within cells, e.g., metabolism, cell signaling, immune response, etc., and knowledge of protein function is vital to obtain a basic understanding of cellular activity. Due to recent advances in nucleotide sequencing technology, the number of available genomic sequences is doubling in size roughly every 12 months, an incredibly fast pace vastly exceeding Moore's law. Experimental technologies required to decipher protein function have not progressed nearly as fast. In fact, although there are roughly 10 million protein sequences in the comprehensive Uniprot database, only 0.2% have experimentally validated function annotations. This sequence-function gap is rapidly expanding, and the development of computational methods is of crucial importance to effectively utilize this deluge of sequence data.

In this work, we develop SIFTER, a large-scale, systems biology platform to accurately predict protein function from high-throughput data. Building upon a promising phylogenomic-based prototype, we incorporate interaction networks into our model to improve performance. Interaction data intrinsically couples the thousands to millions of proteins within such networks, and we use variational inference and parallelized implementations to address this challenging computational problem. We also explore techniques for function prediction based on low-rank matrix factorization, and along the way, introduce novel sampling-based approaches to speed up computation. Additionally, we develop algorithms to quantify uncertainty in SIFTER's predictions to help guide future experimental work. These novel algorithms are large-scale extensions to classical bootstrap sampling and are generally applicable to any problem involving massive data. Finally, we evaluate SIFTER in collaboration with experimental biologists, allowing us to pinpoint relevant use cases and resulting in an effective method with widespread impact within the biomedical community.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Publications by Ameet Talwalkar

AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale
  • DOI:
  • Year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Nicholas Roberts; Samuel Guo; Cong Xu; Ameet Talwalkar; David Lander; Lvfang Tao; Linhang Cai; Shuaicheng Niu; Jianyu Heng; Hongyang Qin; Minwen Deng; Johannes Hog; Alexander Pfefferle; Sushil Ammanaghatta Shivakumar; Arjun Krishnakumar; Yubo Wang; R. Sukthanker; Frank Hutter; Euxhen Hasanaj; Tien; M. Khodak; Yuriy Nevmyvaka; Kashif Rasul; Frederic Sala; Anderson Schneider; Junhong Shen; Evan R. Sparks
  • Corresponding author:
    Evan R. Sparks
On the support recovery of marginal regression.
  • DOI:
  • Year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. J. Kazemitabar; A. Amini; Ameet Talwalkar
  • Corresponding author:
    Ameet Talwalkar
NAS-Bench-360: Benchmarking Diverse Tasks for Neural Architecture Search
  • DOI:
  • Year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Renbo Tu; M. Khodak; Nicholas Roberts; Ameet Talwalkar
  • Corresponding author:
    Ameet Talwalkar
Targeted treatment of folate receptor-positive platinum-resistant ovarian cancer and companion diagnostics, with specific focus on vintafolide and etarfolatide
  • DOI:
  • Year:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Nicholas Roberts; Samuel Guo; Cong Xu; Ameet Talwalkar; David Lander; Lvfang Tao; Linhang Cai; Shuaicheng Niu; Jianyu Heng; Hongyang Qin; Minwen Deng; Johannes Hog; Alexander Pfefferle; Sushil Ammanaghatta Shivakumar; Arjun Krishnakumar; Yubo Wang; R. Sukthanker; Frank Hutter; Euxhen Hasanaj; Tien; M. Khodak; Yuriy Nevmyvaka; Kashif Rasul; Frederic Sala; Anderson Schneider; Junhong Shen; Evan R. Sparks
  • Corresponding author:
    Evan R. Sparks
Variable Importance Using Decision Trees

Other Grants by Ameet Talwalkar

Travel: NSF Student Travel Grant for the Sixth Conference on Machine Learning and Systems (MLSys 2023)
  • Award number:
    2325547
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
CAREER: Foundations of Next-Generation Neural Architecture Search
  • Award number:
    2046613
  • Fiscal year:
    2021
  • Award amount:
    $240,000
  • Project type:
    Continuing Grant
BIGDATA: F: Optimization in Federated Networks of Devices
  • Award number:
    1838017
  • Fiscal year:
    2019
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
Model-Parallel Collaborative Filtering in Apache Spark
  • Award number:
    1555772
  • Fiscal year:
    2015
  • Award amount:
    $240,000
  • Project type:
    Standard Grant

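Similar Grants from the National Natural Science Foundation of China

Biological nitrogen fixation effects and microbiological mechanisms in maize/soybean intercropping systems under straw return
  • Award number:
    32301962
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Structural biology of the type III-E CRISPR-Cas system and its applications
  • Award number:
    32371276
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Project type:
    General Program
Breeding biology and mating system of the Himalayan monal on the Qinghai-Tibet Plateau
  • Award number:
    32300390
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Mechanisms by which the functional link between sympathetic nerves and mast cells mediates acupuncture-induced improvement of atopic dermatitis
  • Award number:
    82305408
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund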

Similar Overseas Grants

Computational Systems Biology for Investigating Infectious Diseases
  • Award number:
    502567
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Award number:
    2341402
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
Computational topology and geometry for systems biology
  • Award number:
    EP/Z531224/1
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Research Grant
Acute human gingivitis systems biology
  • Award number:
    484000
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Operating Grants
Bio-Responsive and Immune Protein-Based Therapies for Inhibition of Proteolytic Enzymes in Dental Tissues
  • Award number:
    10555093
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type: