Nonlinear Manifold Learning of Protein Folding Funnels from Delay-Embedded Experimental Measurements

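Basic Information

  • Award number:
    1841810
  • Principal investigator:
  • Amount:
    $162,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-07-01 to 2021-07-31
  • Project status:
    Completed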

Project Abstract

Proteins are the molecular engines that perform biological functions essential to life. A major milestone in the understanding of protein behavior emerged with the advent of the "new view" of protein folding. This perspective conceives of protein structure, stability, and dynamics as governed by a molecular-level landscape not unlike a relief map describing the surface of the Earth. Each point on the landscape - analogous to latitude and longitude - corresponds to a particular spatial arrangement of protein atoms. The altitude of each point on the map defines protein stability - unstable conformations lie on mountaintops and stable conformations within valley floors. Determining these landscapes is a key goal of protein biology since they are useful in the understanding of natural proteins and design of synthetic proteins as drugs, enzymes, and molecular machines. It is relatively straightforward to calculate these landscapes for small proteins using computer simulations, but it has not been possible to do so from experimental measurements. It is the primary objective of this research to combine mathematical tools from the modeling of dynamical systems with machine learning approaches to analyze high-dimensional datasets to determine approximate protein folding landscapes directly from experimental data. The approach will first be validated in computer simulations of small proteins where the folding landscape is known. Theoretical analyses will place bounds on how close the approximate landscapes are to the true landscapes, and place conditions on the experimental data required for their determination. Ultimately, the approach will be applied to experimental measurements of a tuberculosis protein. The computational analysis tool will be released as user-friendly software for free public download. Positive research experiences have great benefits for undergraduate success and retention, and this award will support summer and academic year research opportunities. New educational outreach materials will be developed for the University of Illinois "Engineering Open House" to promote awareness of materials science and engineering among middle- and high-school students.
The aim of this work is to integrate nonlinear manifold learning with dynamical systems theory to reconstruct protein folding landscapes from experimental time series measuring a single system observable. The "new view" of protein folding revolutionized understanding of folding as a conformational search over rugged and funneled free energy landscapes parameterized by a small number of emergent collective variables, with transformative implications for the understanding and design of proteins as drugs, enzymes, and molecular machines. It is now relatively routine to determine multidimensional folding landscapes from computer simulations in which all atomic coordinates are known, but it has not been possible to do so from experimental measurements of protein dynamics that are restricted to small numbers of coarse-grained observables. This research project integrates Takens' delay embeddings with nonlinear manifold learning using diffusion maps to first project univariate time series in an experimentally measurable observable into a high-dimensional space in which the dynamics are C1-equivalent to those in real space, and then extract from this space a topologically and geometrically equivalent reconstruction of the folding funnel to that which would have been determined from knowledge of all atomic coordinates. The reconstructed landscape preserves the topology of the true funnel - the metastable configurations and folding pathways - but the topography may be perturbed, i.e., the heights and depths of the free energy peaks and valleys. The three primary objectives of this work are to (i) validate the approach in molecular dynamics simulations of small proteins for which the true landscape is known, (ii) place conditions on the sampling resolution and signal-to-noise ratio in experimental measurements for robust landscape recovery, and theoretical bounds on the induced topographical perturbations, and (iii) apply the approach to experimental single-molecule Förster resonance energy transfer (smFRET) measurements on the lid-opening and closing dynamics of Mycobacterium tuberculosis protein tyrosine phosphatase (Mtb-PtpB).
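
The two-step pipeline named in the abstract (a Takens delay embedding followed by diffusion maps) can be illustrated with a minimal NumPy sketch. This is not the project's released software. The function names `delay_embed` and `diffusion_map`, the parameters `dim`, `lag`, and `epsilon`, and the noisy sine wave standing in for an smFRET trace are all assumptions chosen for illustration.

```python
import numpy as np


def delay_embed(x, dim, lag):
    """Build delay vectors [x(t), x(t+lag), ..., x(t+(dim-1)*lag)] as rows."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("time series too short for this dim and lag")
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])


def diffusion_map(points, epsilon, n_coords=2):
    """Return the leading nontrivial eigenvalues and diffusion coordinates."""
    # Pairwise squared Euclidean distances between delay vectors.
    sq = np.sum(points**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    # Gaussian kernel; epsilon (hypothetical value below) sets the length scale.
    kernel = np.exp(-d2 / (2.0 * epsilon))
    row_sums = kernel.sum(axis=1)
    # Symmetric conjugate of the row-stochastic Markov matrix, so eigh applies.
    sym = kernel / np.sqrt(np.outer(row_sums, row_sums))
    evals, evecs = np.linalg.eigh(sym)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Map back to right eigenvectors of the Markov matrix.
    psi = evecs / np.sqrt(row_sums)[:, None]
    # Skip the trivial constant eigenvector (eigenvalue 1).
    top = evals[1 : n_coords + 1]
    return top, psi[:, 1 : n_coords + 1] * top


# Synthetic stand-in for a single measured observable (assumption).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0 * np.pi, 600)
signal = np.sin(t) + 0.05 * rng.standard_normal(t.size)

embedded = delay_embed(signal, dim=5, lag=8)
evals, coords = diffusion_map(embedded, epsilon=0.5, n_coords=2)
```

In this sketch `coords` holds two low-dimensional coordinates per delay vector. For real data, the embedding dimension, lag, and kernel bandwidth would have to be chosen from the measurement itself, which is the kind of condition on sampling resolution and noise that objective (ii) addresses.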

Project Outcomes

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learned Reconstruction of Protein Folding Trajectories from Noisy Single-Molecule Time Series
Synthesis of polypeptides via bioinspired polymerization of in situ purified N-carboxyanhydrides
  • DOI:
    10.1073/pnas.1901442116
  • Publication date:
    2019-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Song, Ziyuan; Fu, Hailin; Wang, Jiang; Hui, Jingshu; Xue, Tianrui; Pacheco, Lazaro A.; Yan, Haoyuan; Baumgartner, Ryan; Wang, Zhiyu; Xia, Yingchun; et al.
  • Corresponding author:
    et al.
Reconstruction of protein structures from single-molecule time series
  • DOI:
    10.1063/5.0024732
  • Publication date:
    2020-11
  • Journal:
  • Impact factor:
    0
  • Authors:
    Topel, Maximilian; Ferguson, Andrew L.
  • Corresponding author:
    Ferguson, Andrew L.
Polymeric “Clickase” Accelerates the Copper Click Reaction of Small Molecules, Proteins, and Cells
  • DOI:
    10.1021/jacs.9b04181
  • Publication date:
    2019-06
  • Journal:
  • Impact factor:
    15
  • Authors:
    Chen, Junfeng; Wang, Jiang; Li, Ke; Wang, Yuhan; Gruebele, Martin; Ferguson, Andrew L.; Zimmerman, Steven C.
  • Corresponding author:
    Zimmerman, Steven C.
Accelerated polymerization of N-carboxyanhydrides catalyzed by crown ether
  • DOI:
    10.1038/s41467-020-20724-w
  • Publication date:
    2021-02
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Xia, Yingchun; Song, Ziyuan; Tan, Zhengzhong; Xue, Tianrui; Wei, Shiqi; Zhu, Lingyang; Yang, Yingfeng; Fu, Hailin; Jiang, Yunjiang; Lin, Yao; et al.
  • Corresponding author:
    et al.

Other publications by Andrew Ferguson

Enough is Enough: Policy Uncertainty and Acquisition Abandonment
  • DOI:
    10.2139/ssrn.3883981
  • Publication date:
    2021-07-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Andrew Ferguson; Wei; P. Lam
  • Corresponding author:
    P. Lam
‘Know when to fold 'em’: Policy uncertainty and acquisition abandonment
  • DOI:
    10.1111/acfi.13179
  • Publication date:
    2023-10-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Andrew Ferguson; Cecilia Wei Hu; P. Lam
  • Corresponding author:
    P. Lam
The Hausdorff dimension of the projections of self-affine carpets
  • DOI:
    10.4064/fm209-3-1
  • Publication date:
    2009-03-12
  • Journal:
  • Impact factor:
    0.6
  • Authors:
    Andrew Ferguson; T. Jordan; Pablo Shmerkin
  • Corresponding author:
    Pablo Shmerkin
The clinical relevance of oliguria in the critically ill patient: analysis of a large observational database
  • DOI:
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    15.1
  • Authors:
    J. Vincent; Andrew Ferguson; P. Pickkers; Stephan M. Jakob; U. Jaschinski; G. Almekhlafi; Marc Leone; Majid Mokhtari; L. E. Fontes; Philippe R. Bauer; Y. Sakr; for the Icon Investigators
  • Corresponding author:
    for the Icon Investigators
Political discretion and risk: the Fukushima nuclear disaster, the distribution of global operations, and uranium company valuation
  • DOI:
    10.1093/icc/dtad038
  • Publication date:
    2023-06-27
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Murod Aliyev; T. Devinney; Andrew Ferguson; P. Lam
  • Corresponding author:
    P. Lam

Other grants held by Andrew Ferguson

Collaborative Research: DMREF: Closed-Loop Design of Polymers with Adaptive Networks for Extreme Mechanics
  • Award number:
    2323730
  • Fiscal year:
    2023
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
Latent Space Simulators for the Efficient Estimation of Long-time Molecular Thermodynamics and Kinetics
  • Award number:
    2152521
  • Fiscal year:
    2022
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
REU SITE: Research Experience for Undergraduates in Molecular Engineering
  • Award number:
    2050878
  • Fiscal year:
    2021
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
EAGER: (ST1) Collaborative Research: Exploring the emergence of peptide-based compartments through iterative machine learning, molecular modeling, and cell-free protein synthesis
  • Award number:
    1939463
  • Fiscal year:
    2019
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
EAGER: Collaborative Research: Type II: Data-Driven Characterization and Engineering of Protein Hydrophobicity
  • Award number:
    1844505
  • Fiscal year:
    2019
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
Nonlinear dimensionality reduction and enhanced sampling in molecular simulation using auto-associative neural networks
  • Award number:
    1841805
  • Fiscal year:
    2018
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
DMREF: Collaborative Research: Self-assembled peptide-pi-electron supramolecular polymers for bioinspired energy harvesting, transport and management
  • Award number:
    1841807
  • Fiscal year:
    2018
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
CAREER: Teaching Machines to Design Self-Assembling Materials
  • Award number:
    1841800
  • Fiscal year:
    2018
  • Award amount:
    $162,000
  • Award type:
    Continuing Grant
Nonlinear dimensionality reduction and enhanced sampling in molecular simulation using auto-associative neural networks
  • Award number:
    1664426
  • Fiscal year:
    2017
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
DMREF: Collaborative Research: Self-assembled peptide-pi-electron supramolecular polymers for bioinspired energy harvesting, transport and management
  • Award number:
    1729011
  • Fiscal year:
    2017
  • Award amount:
    $162,000
  • Award type:
    Standard Grant

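Similar NSFC (National Natural Science Foundation of China) grants

Fundamental research on VHF wideband multi-channel frequency-hopping manifold couplers based on high-speed reconfigurable matching networks
  • Award number:
    61001012
  • Approval year:
    2010
  • Award amount:
    CNY 220,000
  • Award type:
    Young Scientists Fund
Open Gromov-Witten invariants in symplectic geometry
  • Award number:
    10901084
  • Approval year:
    2009
  • Award amount:
    CNY 160,000
  • Award type:
    Young Scientists Fund
Topology and combinatorics of quasitoric manifolds and small covers
  • Award number:
    10826040
  • Approval year:
    2008
  • Award amount:
    CNY 30,000
  • Award type:
    Tianyuan Mathematics Fund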

Similar overseas grants

Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
  • Award number:
    2223986
  • Fiscal year:
    2023
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
  • Award number:
    2223987
  • Fiscal year:
    2023
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
  • Award number:
    2223985
  • Fiscal year:
    2023
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
Deep Intrinsic Learning for On-line Process Control of Manufacturing Manifold Data
  • Award number:
    2121625
  • Fiscal year:
    2022
  • Award amount:
    $162,000
  • Award type:
    Standard Grant
CAREER: Exploiting Low-Dimensional Structures in Data Science: Manifold Learning, Partial Differential Equation Identification, and Neural Networks
  • Award number:
    2145167
  • Fiscal year:
    2022
  • Award amount:
    $162,000
  • Award type:
    Continuing Grant