Better Understanding and Handling of Tautomerism
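Basic information
- Grant number: 10262460
- Principal investigator:
- Amount: $213,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Ongoing
- Source: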
- Keywords: Anticoagulants; Appearance; Area; Azides; Bibliography; Book Chapters; Carbon; Charge; Chemical Structure; Chemicals; Chemistry; Child; Code; Computer software; Conflict (Psychology); Contractor; Cyclization; Data; Databases; Environment; Equilibrium; Eye; Hydrogen; Individual; Informatics; Intuition; Journals; Lead; Literature; Manuscripts; Measures; Mechanics; Methods; Modification; Molecular Weight; Motivation; Movement; Organic Chemistry; Paper; Phase; Preparation; Prevalence; Property; Protons; Public Domains; Publications; Publishing; Reaction; Recommendation; Records; Sampling; Solvents; Spectrum Analysis; Structure; System; Techniques; Temperature; Terminology; Testing; Tetrazoles; Triplet Multiple Birth; Variant; Voting; Warfarin; Work; X-Ray Crystallography; base; catalyst; chemical information system; deep learning; foot; improved; information model; migration; posters; quantum; quantum computing; screening; single bond; small molecule; structural biology; tautomer; theories; tool; web services; web site; working group
Project abstract
One motivation of our tautomerism-related work is thus to use all tools at our disposal (chemoinformatics analyses, QM computations, experimental work, and systematic extraction of results from the literature) to provide a scientific footing for the recommendations on how to improve the handling of tautomerism in InChI V2, instead of just holding a vote in the Working Group. While prototropic tautomerism rules are the only ones currently implemented as the standard rule set in CACTVS, and all tautomeric transformations covered by InChI (by default or by option) are prototropic, ring-chain (RC) tautomerism is well known and widespread. Nevertheless, and somewhat surprisingly, very few RC rules were available in chemoinformatics until recently. Based on Baldwin's well-known set of rules for predicting the relative facility of ring-forming reactions, we developed a set of 11 rules describing RC tautomerism. The rules were encoded in SMIRKS, the chemical-transform extension of the SMILES chemical structure line notation developed by Daylight Chemical Information Systems, Inc., just as the currently 20 individual rules in CACTVS for describing prototropic tautomerism are encoded. A number of modifications were applied to Baldwin's rule set, which, after all, consists of rules for ring closure in general, not for RC tautomerism specifically. Foremost, ring-closure and ring-opening reactions that involve a tetrahedral electrophilic carbon, and thus lead to breakage of a single bond, would cause a loss of atoms from the molecule, violating the definition of tautomerism. Adding these new RC rules to the existing standard prototropic rules in CACTVS, we applied the combined rule set to the "poster child" of RC tautomerism: warfarin. This anticoagulant drug, in wide use for decades, can theoretically exist in solution in 40 distinct tautomeric forms.
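The atom-conservation criterion mentioned above (a transform that loses atoms cannot be a tautomeric one) can be illustrated with a minimal sketch. It only compares molecular formulas and is not the SMIRKS-based rule set described in the abstract; the helper names `element_counts` and `conserves_atoms` are hypothetical, and the second formula in the usage lines is an arbitrary example of a product that has lost atoms.

```python
import re
from collections import Counter

def element_counts(formula: str) -> Counter:
    """Count atoms per element in a simple formula such as 'C19H16O4'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def conserves_atoms(formula_a: str, formula_b: str) -> bool:
    """True if both formulas contain exactly the same atoms, as tautomers must."""
    return element_counts(formula_a) == element_counts(formula_b)

# Warfarin is C19H16O4; every one of its tautomers shares this formula.
print(conserves_atoms("C19H16O4", "C19H16O4"))  # True
# A product that has lost H2O is a different compound, not a tautomer.
print(conserves_atoms("C19H16O4", "C19H14O3"))  # False
```

Equal formulas are a necessary condition only: isomers that are not tautomers pass this check as well, so it can reject a candidate transform but never confirm one.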
We investigated all these tautomers with computational approaches (relative energies calculated at the B3LYP/6-311G+ level of theory) and recorded NMR (13C and 1H) spectra. We introduced an intuitive graphical network of tautomers and their interconversion paths, which for warfarin contained 11 tautomers and the 17 tautomeric transformations between them that our rules allow. We then applied the combined RC and prototropic rule set to an entire database: the Aldrich Market Select (AMS) database of (then) 6 million screening samples and building blocks. We found over 30,000 cases where two or more AMS products were declared by our rules to be just different tautomeric forms of the same compound. 1H and 13C NMR analyses of 166 such tautomer pairs (plus a few triplets) that we purchased from the AMS were performed to determine whether the chemoinformatics transforms had accurately predicted what was the same "stuff in the bottle" as determined by NMR. Essentially all prototropic transforms for which examples existed in the AMS were confirmed (some of the "rarer" types of tautomerism had no such "conflict pairs" in the AMS). Some of the RC transforms were found to be too "aggressive", i.e. they equated structures that were different compounds according to the NMR analyses. The resulting paper received an Editor's Choice selection in the Journal of Chemical Information and Modeling. In order to provide additional experimental data for tautomerism-related analyses and chemoinformatics work, we have created a database based on data extracted from the experimental literature. This database consists of 1,873 entries, each an n-tuple of tautomers studied under a particular set of experimental conditions (pH, solvent, temperature, technique), adding up to 3,898 records since the average of n is slightly above 2.
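The tautomer network described above, with tautomers as nodes and allowed transformations as edges, can be sketched as a small graph structure. This is an illustrative sketch only: the labels T1 to T5 and their connections are hypothetical placeholders, not the 11 warfarin tautomers or their 17 transformations, and the edges are treated as undirected on the assumption that each tautomeric transformation is reversible.

```python
from collections import deque

def build_network(transforms):
    """Build an undirected adjacency map from (tautomer, tautomer) pairs."""
    graph = {}
    for a, b in transforms:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest interconversion path, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):  # sorted for reproducible output
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical toy network; labels and edges are placeholders, not warfarin data.
toy_transforms = [("T1", "T2"), ("T2", "T3"), ("T1", "T4"), ("T4", "T3"), ("T3", "T5")]
network = build_network(toy_transforms)

print(len(network))                                 # 5 tautomers
print(sum(len(v) for v in network.values()) // 2)   # 5 transformations
print(shortest_path(network, "T1", "T5"))           # ['T1', 'T2', 'T3', 'T5']
```

In such a representation, the number of nodes and edges corresponds directly to the tautomer and transformation counts quoted for warfarin, and a path search shows how many rule applications separate two forms.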
The data were extracted from 73 publications, many of them reviews, taken from a selection of 200 papers provided to the contractor company that did the initial extraction (Parthys Reverse Informatics), out of about 900 papers we identified in literature searches that might contain useful data for this purpose. Each tautomer (or tuple, as appropriate) is annotated with structural information (SMILES, InChI, InChIKey, NCI/CADD Identifiers); "prevalence" data (measured ratios, interconversion rates, relative energies, etc.); condition data (solvent, temperature, pH, etc., if given); method data (NMR, UV spectroscopy, IR spectroscopy, etc.); and reference data (bibliographic information). To the best of our knowledge, such a tautomer database does not exist elsewhere, certainly not in the public domain. A new web service, called Tautomerizer, was created to apply and test the transforms we have compiled from the above database and the literature for the Redesign of Handling of Tautomerism in InChI(Key) V.2. The set of transforms compiled in the context of this project has meanwhile grown to its final number of 86, and these are also being added to the Tautomerizer. The process of initiating, and then reaching, a decision in the IUPAC Working Group about the final set of transforms to be recommended for InChI V2 has begun. Exploratory coding to add some of the 86 rules to the current InChI code (v.1.05) was successful for 6 rules. Work on a second-level analysis of tautomerism based on quantum-mechanical calculations and subsequent Deep Learning approaches has been started. Also, X-ray crystallography on a subset of the small molecules mentioned above has been performed. Several manuscripts about this project have been published or are in preparation.
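The annotation categories listed above suggest a simple record layout, sketched here under stated assumptions. The class and field names (`TautomerRecord`, `TautomerEntry`, `prevalence`, and so on) are hypothetical and do not reflect the actual schema of the database; the example entry uses the 2-pyridone / 2-hydroxypyridine pair purely as an illustration and is not taken from the database.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TautomerRecord:
    """One tautomer: structural identifiers plus reported prevalence data."""
    smiles: str
    inchi: Optional[str] = None
    inchikey: Optional[str] = None
    prevalence: Optional[str] = None  # e.g. a measured ratio, kept as reported

@dataclass
class TautomerEntry:
    """An n-tuple of tautomers studied under one set of conditions."""
    tautomers: List[TautomerRecord] = field(default_factory=list)
    solvent: Optional[str] = None
    temperature: Optional[str] = None
    ph: Optional[str] = None
    method: Optional[str] = None      # e.g. NMR, UV or IR spectroscopy
    reference: Optional[str] = None   # bibliographic information

    @property
    def n(self) -> int:
        """Size of the tuple, i.e. the number of records in this entry."""
        return len(self.tautomers)

def mean_tuple_size(entries: List[TautomerEntry]) -> float:
    """Average number of tautomer records per entry."""
    return sum(entry.n for entry in entries) / len(entries)

# Illustrative entry only (not from the database); unknown fields stay None.
example_entry = TautomerEntry(
    tautomers=[TautomerRecord("O=C1C=CC=CN1"), TautomerRecord("Oc1ccccn1")]
)
print(example_entry.n)            # 2
# Totals quoted in the abstract: 3,898 records in 1,873 entries.
print(round(3898 / 1873, 2))      # 2.08
```

The last line reproduces the ratio implied by the quoted totals: an average tuple size of about 2.08, consistent with most entries being pairs plus a smaller number of larger tuples.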
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by MARC NICKLAUS
HIV Integrase Modeling and Computer-Aided Inhibitor Deve
- Grant number: 7291875
- Fiscal year:
- Funding amount: $213,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor Development
- Grant number: 7965392
- Fiscal year:
- Funding amount: $213,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor and Microbicide Development
- Grant number: 10702372
- Fiscal year:
- Funding amount: $213,400
- Project category:
HIV Integrase Modeling and Computer-Aided Inhibitor Development
- Grant number: 7733068
- Fiscal year:
- Funding amount: $213,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10926595
- Fiscal year:
- Funding amount: $213,400
- Project category:
Synthetically Accessible Virtual Inventory (SAVI)
- Grant number: 10926263
- Fiscal year:
- Funding amount: $213,400
- Project category:
Large Databases of Small Molecules - Drug Development Tool and Public Resource
- Grant number: 10703018
- Fiscal year:
- Funding amount: $213,400
- Project category:
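Similar grants from the National Natural Science Foundation of China
Characterization methods for the surface appearance and visually perceived color difference of 3D-printed objects
- Grant number: 61775170
- Approval year: 2017
- Funding amount: CNY 630,000
- Project category: General Program
How appearance stereotypes in service encounters affect consumer responses: the mediating role of social distance
- Grant number: 71602073
- Approval year: 2016
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund
Molecular mechanism by which deletion of the SOX10 gene enhancer causes plumage color variation in White Leghorn layers
- Grant number: 31672409
- Approval year: 2016
- Funding amount: CNY 600,000
- Project category: General Program
Methods for characterizing and reproducing the total appearance information of pigment-painted cultural relics
- Grant number: 61575147
- Approval year: 2015
- Funding amount: CNY 160,000
- Project category: General Program
Cognitive neural mechanisms of appearance-based social comparison from the dual perspectives of observer and target characteristics
- Grant number: 31100758
- Approval year: 2011
- Funding amount: CNY 250,000
- Project category: Young Scientists Fund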
Similar overseas grants
Microthrombus Formation Triggers Lung Injury in Sepsis
- Grant number: 7680461
- Fiscal year: 2009
- Funding amount: $213,400
- Project category:
Microthrombus Formation Triggers Lung Injury in Sepsis
- Grant number: 8258193
- Fiscal year: 2009
- Funding amount: $213,400
- Project category:
Microthrombus Formation Triggers Lung Injury in Sepsis
- Grant number: 8195634
- Fiscal year: 2009
- Funding amount: $213,400
- Project category:
Microthrombus Formation Triggers Lung Injury in Sepsis
- Grant number: 7780052
- Fiscal year: 2009
- Funding amount: $213,400
- Project category: