Synthetically Accessible Virtual Inventory (SAVI)
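Basic information
- Grant number: 10926263
- Principal investigator:
- Amount: $369,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Not yet concluded
- Source: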
- Keywords: 2019-nCoV; Address; Behavior; Businesses; Chemicals; Chemistry; Clinical; Code; Collaborations; Custom; Data; Databases; Development; Docking; Drug Design; Equipment and supply inventories; Future; Generations; Germany; Goals; Growth; HIV-1; Informatics; Intelligence; International; Internet; Intuition; Knowledge; Language; Link; Logic; Malignant Neoplasms; Methyltransferase; Modeling; Molecular; Nature; Nucleocapsid; Outcome; Paper; Pattern; Peer Review; Pilot Projects; Predictive Cancer Model; Price; Printing; Process; Productivity; Programming Languages; Property; Publications; Publishing; Reaction; Research; Resources; Route; Running; Sampling; Services; Structure; Systems Biology; Technology; Testing; Work; Writing; cancer initiation; computer generated; cost effective; design; drug candidate; drug development; endonuclease; experience; functional group; in silico; knowledgebase; manufacturing run; member; new therapeutic target; novel; novel therapeutics; screening; success; tumor; user-friendly; virtual; web server
Project abstract
The SAVI Project is based on:
(a) a set of transforms with rich chemical context annotation, including functional group reactivity data (LHASA, LLC, U.S.; and Lhasa Limited, UK);
(b) a set of highly annotated building blocks (Sigma-Aldrich, Global Strategic Services);
(c) the chemoinformatics toolkit CACTVS with custom development (Xemistry GmbH, Germany).
The transforms are a set of more than 1,500 rules written in the CHMTRN/PATRAN language, which encodes chemical transformations together with chemical context and quality criteria, and is based ultimately on the pioneering work of E. J. Corey. In contrast to simple SMIRKS transforms, these rules provide:
- Computation of whether a reaction will work at all, given the overall structural features of the target.
- Scoring: if the reaction works, how robust it is, taking into account overall structural features.
- Whether protection of interfering groups is required; this information can then be integrated into the final starting material queries to prioritize pre-protected starting materials.
- Proposal of suitable context-dependent reaction conditions.
- Textual warnings in specific circumstances, such as the potential for multiple products, borderline conditions, etc.
Ancillary to the rules is a set of functional group reactivity data, i.e. a table describing whether any of the standard functional groups in the rule set is unstable under any of the standard conditions.
The building blocks are a set of several hundred thousand compounds available in gram quantities, and with high reliability, from or through Sigma-Aldrich. This set has been annotated with pricing information and other business-intelligence data useful for this project.
The chemoinformatics toolkit CACTVS has been expanded in various ways, e.g. with the capability to read the CHMTRN/PATRAN transforms.
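The functional group reactivity table described above can be pictured as a simple lookup from reaction conditions to the groups that do not survive them. The following is a minimal sketch only: the table entries, the condition names, and the function `groups_needing_protection` are illustrative assumptions and are not taken from the LHASA knowledge base or from CACTVS.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical reactivity table (assumption, not LHASA data):
# condition name -> functional groups assumed unstable under that condition.
UNSTABLE_UNDER: Dict[str, Set[str]] = {
    "strong_acid": {"acetal", "tert_butyl_ester"},
    "strong_base": {"ester", "alkyl_halide"},
    "hydride_reduction": {"aldehyde", "ketone", "ester"},
}


def groups_needing_protection(groups: Iterable[str], condition: str) -> List[str]:
    """Return, sorted by name, the groups flagged as unstable under `condition`.

    Unknown conditions yield an empty list rather than an error.
    """
    unstable = UNSTABLE_UNDER.get(condition, set())
    return sorted(g for g in set(groups) if g in unstable)


if __name__ == "__main__":
    # Functional groups present on a made-up building block.
    building_block = ["ester", "aryl_bromide", "ketone"]
    print(groups_needing_protection(building_block, "hydride_reduction"))
    # prints ['ester', 'ketone']
```

A non-empty result for a transform's conditions is the kind of signal that could lead a rule to require protection or to favor pre-protected starting materials.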
An important feature that needed to be implemented was handling the reversal of the original LHASA transform direction, without rewriting the rules, for the strictly forward-synthetic SAVI project. Another important capability was the handling of initial and final starting material (SM) queries, which involves four steps:
- initial SM query extraction from the 2D patterns in the rules;
- forward reaction from the 2D patterns;
- scoring (the only original LHASA functionality);
- final SM query expansion (R-groups, protecting groups, etc.).
To filter out structures with less-than-desirable attributes in the drug development context, several additional computed properties regarded as important in current drug design have been implemented, such as the demerit scores based on 275 rules for identifying potentially reactive or promiscuous compounds, published by Bruns and Watson (J. Med. Chem. 2012, 55, 9763-9772; dx.doi.org/10.1021/jm301008n).
In the very early alpha stage of this project, only 11 of the possible 1,500 transforms were used, applied to approximately 230,000 building blocks in one-step reactions only. The 610,000 resulting products have been annotated but not yet filtered with any of the computed or associated molecular properties. To limit the file size, only on the order of one percent of the theoretically possible one-step products was sampled. We have also addressed the task of generating schematic graphical representations of the transforms.
We are ultimately aiming at creating a database of one billion high-quality screening samples that should be easily and cheaply synthesizable. Our first full production run, using 14 transforms and about 377,000 building blocks, resulted in more than 236 million products. These novel molecules are all annotated with a proposed simple and high-yield synthetic route, as well as with 50 molecular properties, implemented by us, that are generally recognized as important in cutting-edge drug design.
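Demerit-based filtering of the kind mentioned above can be sketched as follows, assuming an additive scheme in which each matched rule contributes demerits and a compound is rejected once the total reaches a cutoff. The three rules, their demerit values, and the cutoff of 100 are placeholders chosen for illustration, not the published 275-rule set, and plain substring tests on SMILES strings stand in for real substructure matching.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class DemeritRule:
    name: str
    matches: Callable[[str], bool]  # stand-in for a substructure query
    demerits: int


# Hypothetical rules (assumption): substring tests are NOT real substructure
# matching and only work for the exact SMILES spellings used here.
RULES: Sequence[DemeritRule] = (
    DemeritRule("acyl_chloride", lambda smi: "C(=O)Cl" in smi, 100),
    DemeritRule("aldehyde", lambda smi: "[CH]=O" in smi, 50),
    DemeritRule("nitro", lambda smi: "[N+](=O)[O-]" in smi, 30),
)


def demerit_score(smiles: str, rules: Sequence[DemeritRule] = RULES) -> Tuple[int, List[str]]:
    """Sum the demerits of all matching rules and report which rules fired."""
    hits = [rule for rule in rules if rule.matches(smiles)]
    return sum(rule.demerits for rule in hits), [rule.name for rule in hits]


def passes(smiles: str, cutoff: int = 100, rules: Sequence[DemeritRule] = RULES) -> bool:
    """Keep a compound only if its total demerits stay below the cutoff."""
    return demerit_score(smiles, rules)[0] < cutoff


if __name__ == "__main__":
    for smi in ("CCO", "CC(=O)Cl", "c1ccccc1[N+](=O)[O-]"):
        print(smi, demerit_score(smi), passes(smi))
```

Returning the names of the rules that fired, and not only the total, keeps the annotation useful even when no filtering is applied, which matches the annotate-first approach described for the alpha-stage products.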
We are developing an approach intended primarily to allow rapid determination of the targeted chunks of the current, or a future and much larger, SAVI database that are optimal for finding active molecules for a given target ("a SAVI a la carte"). This technology is called SLICE (Smarts and Logic In ChEmistry). SLICE is designed to:
(a) be a simple, powerful and open language that allows chemists to encode chemistry knowledge with no/low code and complete SMARTS by reasoning directly on a molecule, with a UI that should permit one to enter and test a new transform quickly;
(b) be integrated in a newly developed no/low-code platform that allows users to graphically encode chemistry knowledge without experience with programming languages;
(c) be fast in the execution of SLICE-enabled transforms on building blocks to generate products;
(d) allow for reactant-based filtering of the possible SAVI space to find the right "a la carte" SAVI menu for the given target.
An intuitive, user-friendly web GUI has been launched and is currently being used by team members for writing transforms in the new SLICE language. The GUI will give users free access to this database via searches by various criteria, including substructure searches. It will also present links to pages where users can place requests to have the molecule(s) synthesized by commercial entities.
Additional novel transforms for chemistry not previously in the knowledgebase have been written, yielding a total of over 70 productive and drafted transforms. After a change in the business model of Sigma-Aldrich, we decided to change the set of building blocks to Enamine, from whom we obtained about 151,000 possible structures. With 143,000 of those matching the 53 productive transforms used, we finished calculating 1.75 billion SAVI products in early 2020. We have made them publicly available for download on the CADD Group's web server at https://doi.org/10.35115/37N9-5738.
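The reactant-based filtering in point (d) above rests on a simple idea: constrain the building blocks before enumeration so that only the relevant part of the product space is ever generated. The sketch below illustrates this under stated assumptions; the building-block table, the two-component transform, and the molecular-weight criterion are all hypothetical and do not reflect SLICE syntax or the actual SAVI data.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple

# Hypothetical building blocks (assumption): id -> (reactant role, molecular weight).
BUILDING_BLOCKS: Dict[str, Tuple[str, float]] = {
    "bb1": ("amine", 101.2),
    "bb2": ("amine", 310.4),
    "bb3": ("carboxylic_acid", 122.1),
    "bb4": ("carboxylic_acid", 288.3),
}

# Hypothetical two-component transform, described only by the roles it consumes.
TRANSFORM_ROLES: Tuple[str, str] = ("amine", "carboxylic_acid")


def candidates(role: str, max_mw: float) -> List[str]:
    """Select building blocks with the given role that satisfy the weight limit."""
    return [bb for bb, (r, mw) in BUILDING_BLOCKS.items() if r == role and mw <= max_mw]


def enumerate_pairs(max_mw: float) -> Iterator[Tuple[str, str]]:
    """Yield reactant pairs for the transform, filtering reactants before pairing."""
    role_a, role_b = TRANSFORM_ROLES
    yield from product(candidates(role_a, max_mw), candidates(role_b, max_mw))


if __name__ == "__main__":
    print(list(enumerate_pairs(max_mw=200.0)))        # prints [('bb1', 'bb3')]
    print(len(list(enumerate_pairs(max_mw=1000.0))))  # prints 4
```

Because the filter acts on reactants, the cost scales with the number of building blocks and not with the number of possible products, which is what makes this approach attractive for a space of billions of molecules.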
A publication about the SAVI project is available as a preprint at https://doi.org/10.26434/chemrxiv.12185559.v1, with the peer-reviewed paper published in the Nature journal Scientific Data.
CR colleagues are using the SAVI database for docking screens against the following SARS-CoV-2 targets: NSP7, NSP8, NSP9, NSP10, NSP15 (endonuclease), NSP16 (methyltransferase), Spike RBD, and Nucleocapsid.
Syntheses of SAVI molecules have been extraordinarily successful, with a 97% success rate using the SAVI-predicted route. Out of about 170 synthesized molecules tested against cancer, HIV-1, and SARS-CoV-2 targets, tens have shown activity. New transforms have been written,
Project outputs
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods.
- DOI: 10.1021/acs.jcim.0c01164
- Publication date: 2021-02-22
- Journal:
- Impact factor: 5.6
- Authors: Jain S; Siramshetty VB; Alves VM; Muratov EN; Kleinstreuer N; Tropsha A; Nicklaus MC; Simeonov A; Zakharov AV
- Corresponding author: Zakharov AV
ReactionCode: format for reaction searching, analysis, classification, transform, and encoding/decoding.
- DOI: 10.1186/s13321-020-00476-x
- Publication date: 2020-12-03
- Journal:
- Impact factor: 8.6
- Authors: Delannée V; Nicklaus MC
- Corresponding author: Nicklaus MC
[Discovering new antiretroviral compounds in "Big Data" chemical space of the SAVI library].
- DOI: 10.18097/pbmc20196502073
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Savosina PI; Stolbov LA; Druzhilovskiy DS; Filimonov DA; Nicklaus MC; Poroikov VV
- Corresponding author: Poroikov VV
Special Issue on Reaction Informatics and Chemical Space.
- DOI: 10.1021/acs.jcim.2c00390
- Publication date: 2022
- Journal:
- Impact factor: 5.6
- Authors: Rarey, Matthias; Nicklaus, Marc C; Warr, Wendy
- Corresponding author: Warr, Wendy
Other grants of MARC NICKLAUS
HIV Integrase Modeling and Computer-Aided Inhibitor Deve
- Grant number: 7291875
- Fiscal year:
- Amount: $369,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor and Microbicide Development
- Grant number: 10702372
- Fiscal year:
- Amount: $369,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10262724
- Fiscal year:
- Amount: $369,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10703018
- Fiscal year:
- Amount: $369,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor Development
- Grant number: 7965392
- Fiscal year:
- Amount: $369,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10926595
- Fiscal year:
- Amount: $369,400
- Project category:
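Similar grants from the National Natural Science Foundation of China
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Amount: 240,000 CNY
- Project category: Young Scientists Fund project
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Amount: 220,000 CNY
- Project category: Young Scientists Fund project
Research on optimized address mapping table design and memory access optimization for high-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Amount: 230,000 CNY
- Project category: Young Scientists Fund project
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Amount: 640,000 CNY
- Project category: General Program project
Research on memory safety defense techniques for objects targeted by memory attacks
- Grant number: 61802432
- Approval year: 2018
- Amount: 250,000 CNY
- Project category: Young Scientists Fund project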
Similar overseas grants
NeuroMAP Phase II - Recruitment and Assessment Core
- Grant number: 10711136
- Fiscal year: 2023
- Amount: $369,400
- Project category:
Developing a U.S. National Cohort to Improve Virologic Suppression among Stimulant-using Men Living with HIV.
- Grant number: 10675863
- Fiscal year: 2023
- Amount: $369,400
- Project category:
Society of Behavioral Medicine 2023 Annual Meeting & Scientific Sessions
- Grant number: 10681958
- Fiscal year: 2023
- Amount: $369,400
- Project category:
Neuroprotective Potential of Vaccination Against SARS-CoV-2 in Nonhuman Primates
- Grant number: 10646617
- Fiscal year: 2023
- Amount: $369,400
- Project category: