MegaPredict for predicting natural product uses and their drug interactions

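Basic information

  • Grant number:
    10055938
  • Principal investigator:
  • Amount:
    $155,700
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-08-15 to 2021-08-14
  • Project status:
    Completed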

Project abstract

Project Summary: The objective of ‘MegaPredict’ is to enable scientists to generate predictions for a natural product (or any molecule), identify targets for efficacy assessment, and identify any potential liabilities. We are building on our previous work, which compiled a comprehensive collection of structure-activity datasets for a broad variety of disease targets and other properties, in a form ready for model building. All of these models use the many sources of curated open data, including ChEMBL, ToxCast, etc. We have developed a prototype of MegaPredict that uses a Bayesian algorithm and ECFP6 fingerprints to output a list of prioritized ‘targets’. We realize that neither the algorithm nor the descriptors may be optimal; we therefore propose to address this as we validate MegaPredict and develop a product over the course of this proposal. Our team is suitably qualified to develop the software needed, and we will leverage our large collaborator network to assist us in validating the activity of compounds. We will initially create a script that takes a natural product, scores it against many thousands of machine learning models, and then ranks the outputs to propose efficacy targets. We will use over 12,000 ChEMBL-derived target-assay/bioactivity groups extracted from the ChEMBL v24 database, as well as EPA Tox21 measurements and other public datasets, using methodology that we have already partially developed. We can repeat this process for over 200 published compounds and assess the outputs versus what is known. We intend to compare how the approach performs with synthetic drugs or drug-like compounds as well as natural products. We will assess whether other machine learning algorithms and molecular descriptors can improve predictions.
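The score-and-rank step described above can be illustrated with a minimal sketch. It assumes fingerprints are represented as sets of integer bit identifiers (a stand-in for ECFP6 bits), and it uses a simple Laplace-smoothed naive Bayes log-odds score, which is not necessarily the Bayesian algorithm used in the MegaPredict prototype. The function names, target names, and toy data are all hypothetical.

```python
import math
from collections import Counter


def train_bayes(actives, inactives):
    """Return per-bit log-odds weights from active/inactive fingerprint sets.

    Add-one (Laplace) smoothing keeps bits seen in only one class finite.
    """
    n_act, n_inact = len(actives), len(inactives)
    act_counts = Counter(bit for fp in actives for bit in fp)
    inact_counts = Counter(bit for fp in inactives for bit in fp)
    weights = {}
    for bit in set(act_counts) | set(inact_counts):
        p_act = (act_counts[bit] + 1) / (n_act + 2)
        p_inact = (inact_counts[bit] + 1) / (n_inact + 2)
        weights[bit] = math.log(p_act / p_inact)
    return weights


def score(weights, fingerprint):
    """Sum the weights of the bits present; unseen bits contribute nothing."""
    return sum(weights.get(bit, 0.0) for bit in fingerprint)


def rank_targets(models, fingerprint):
    """Score one molecule against every model, highest score first."""
    scored = [(name, score(w, fingerprint)) for name, w in models.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy training data (hypothetical): target -> (active fps, inactive fps).
training = {
    "target_A": ([{1, 2, 3}, {1, 2, 4}], [{5, 6}, {6, 7}]),
    "target_B": ([{5, 6, 8}, {6, 7, 8}], [{1, 2}, {2, 3}]),
}
models = {name: train_bayes(a, i) for name, (a, i) in training.items()}

query = {1, 2, 9}  # hypothetical fingerprint of a natural product
ranking = rank_targets(models, query)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

In this toy case the query shares bits with the actives of `target_A` and with the inactives of `target_B`, so `target_A` is ranked first. A real pipeline would apply the same loop across thousands of models and would need scores that are comparable between models before ranking.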
As we generate machine learning models such as linear logistic regression, AdaBoost decision trees, random forest, support vector machines and deep neural networks (DNN) of varying depth, we will assess their predictions for natural products and compare them with the Bayesian approach. We will compare ECFP6 with other 2D and 3D descriptors and physicochemical properties in order to identify the optimal combination for generating predictions for natural products, and compare how this differs for synthetic compounds. We will validate our predictions for natural product efficacy assessment. We will work closely with multiple academic groups to generate predictions for at least 20 natural products of interest against over 20 different targets or diseases. Our goal will be to identify potential targets that were previously unknown and then generate in vitro data in-house or with academic collaborators. We will develop a prototype user interface for input of a structure, processing of the input molecule, and output of prioritized targets and liabilities. We have previously developed multiple software prototypes (e.g. Assay Central, MegaTox, etc.) and will ensure a user-friendly interface and develop new visualization methods and algorithms for prioritizing potential predicted targets based on the outputs of thousands of machine learning models. In Phase I, we will use the software internally with collaborators to rapidly prototype it. In Phase II we will develop a commercial product and greatly expand our validation by building a larger network of academic and industry partners that would help to prioritize the most relevant features. Using our existing machine learning models for natural products is limited because ECFP6 fingerprints cannot distinguish between these very different classes of molecules. This, however, gives us an opportunity to pursue a "pharmacophore"-style approach (ideally without using 3D conformations directly).
We will therefore focus on developing a ‘3D shape-based fingerprint’ or a novel ‘2D fingerprint’ that captures the ‘3D shape’ of natural and drug-like molecules. Currently, the public datasets in ChEMBL, PubChem, etc. consist mostly of drug-like molecules, but if we have fingerprints that can compare drug-like and natural product-like molecules, then we can likely use our MegaPredict models reliably for natural products as well. We can also attempt to rank natural products with our ChEMBL models, or search catalogs of drug-like compounds using models derived from natural products. That would be an important innovation. Additionally, in Phase II it would be important to see whether we could find uses for natural products in any of the 7000 rare diseases. Developing software that predicts potential natural product drug interactions with various targets could be useful to regulatory organizations as well as the pharmaceutical industry and may broaden the software's utility. Being able to mix natural product and drug-like compounds more effectively in models will have a profound effect on the value of cheminformatics in this arena.
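One way to make the fingerprint-comparison idea concrete is to measure how similar a natural product is to the drug-like molecules a model was trained on. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of integer bits. The abstract does not name this metric, so treat it as an assumed, commonly used choice; the function names and toy fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def nearest_neighbor_similarity(query, reference_set):
    """Highest Tanimoto similarity between the query and any reference."""
    return max((tanimoto(query, ref) for ref in reference_set), default=0.0)


# Hypothetical fingerprints: two drug-like training molecules and one
# natural product that shares only a small substructure with them.
druglike_training = [{1, 2, 3, 4}, {2, 3, 5}]
natural_product = {3, 4, 10, 11, 12, 13}

similarity = nearest_neighbor_similarity(natural_product, druglike_training)
print(f"nearest drug-like neighbor similarity: {similarity:.2f}")
```

A low nearest-neighbor similarity suggests the query lies far from the training data, so predictions from that model deserve less trust. A fingerprint that places natural products and drug-like molecules closer together when they share shape or pharmacophore features would raise this value for genuinely related pairs.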

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by SEAN EKINS

Preclinical development of a Nipah Virus inhibitor
  • Grant number:
    10761349
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category:
New therapeutic approaches to identifying molecules for opioid abuse treatment
  • Grant number:
    10385998
  • Fiscal year:
    2022
  • Funding amount:
    $155,700
  • Project category:
Machine learning approaches to predict Acetylcholinesterase inhibition
  • Grant number:
    10378934
  • Fiscal year:
    2021
  • Funding amount:
    $155,700
  • Project category:
MegaTox for analyzing and visualizing data across different screening systems
  • Grant number:
    10094026
  • Fiscal year:
    2020
  • Funding amount:
    $155,700
  • Project category:
MegaTox for analyzing and visualizing data across different screening systems
  • Grant number:
    10470050
  • Fiscal year:
    2019
  • Funding amount:
    $155,700
  • Project category:
MegaTox for analyzing and visualizing data across different screening systems
  • Grant number:
    10674729
  • Fiscal year:
    2019
  • Funding amount:
    $155,700
  • Project category:
MegaTrans – human transporter machine learning models
  • Grant number:
    9768844
  • Fiscal year:
    2019
  • Funding amount:
    $155,700
  • Project category:
Manufacture of an intracerebroventricular Enzyme Replacement Therapy for CLN1 Batten Disease
  • Grant number:
    10483470
  • Fiscal year:
    2018
  • Funding amount:
    $155,700
  • Project category:
Manufacture of an intracerebroventricular Enzyme Replacement Therapy for CLN1 Batten Disease
  • Grant number:
    10641950
  • Fiscal year:
    2018
  • Funding amount:
    $155,700
  • Project category:
Centralized assay datasets for modelling support of small drug discovery organizations
  • Grant number:
    9751326
  • Fiscal year:
    2017
  • Funding amount:
    $155,700
  • Project category:

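Similar National Natural Science Foundation of China (NSFC) grants

Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
  • Grant number:
    61906126
  • Approval year:
    2019
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
  • Grant number:
    41901325
  • Approval year:
    2019
  • Funding amount:
    CNY 220,000
  • Project category:
    Young Scientists Fund
Research on optimized design of address mapping tables and memory access optimization for large-capacity solid-state drives
  • Grant number:
    61802133
  • Approval year:
    2018
  • Funding amount:
    CNY 230,000
  • Project category:
    Young Scientists Fund
Research on memory safety defense techniques targeting the objects of memory attacks
  • Grant number:
    61802432
  • Approval year:
    2018
  • Funding amount:
    CNY 250,000
  • Project category:
    Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
  • Grant number:
    61872252
  • Approval year:
    2018
  • Funding amount:
    CNY 640,000
  • Project category:
    General Program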

Similar overseas grants

Fluency from Flesh to Filament: Collation, Representation, and Analysis of Multi-Scale Neuroimaging data to Characterize and Diagnose Alzheimer's Disease
  • Grant number:
    10462257
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category:
A computational model for prediction of morphology, patterning, and strength in bone regeneration
  • Grant number:
    10727940
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category:
Computer-Aided Triage of Body CT Scans with Deep Learning
  • Grant number:
    10585553
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category:
High-resolution cerebral microvascular imaging for characterizing vascular dysfunction in Alzheimer's disease mouse model
  • Grant number:
    10848559
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category:
Bioethical, Legal, and Anthropological Study of Technologies (BLAST)
  • Grant number:
    10831226
  • Fiscal year:
    2023
  • Funding amount:
    $155,700
  • Project category: