DMREF: Collaborative Research: The Synthesis Genome: Data Mining for Synthesis of New Materials

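Basic information

  • Award number:
    1922090
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Period:
    2019-10-01 to 2023-09-30
  • Project status:
    Completed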

Project abstract

Successes in accelerated materials design, made possible in part through the Materials Genome Initiative, have shifted the bottleneck in materials development towards the synthesis of novel compounds. Existing databases do not contain the synthesis recipes needed to make compounds that computational methods have identified as having promising properties. As a result, much of the momentum and efficiency gained in the design process is gated by trial-and-error synthesis techniques. This delay in going from a promising materials concept to validation, optimization, and scale-up is a significant burden on the commercialization of novel materials. This Designing Materials to Revolutionize and Engineer our Future (DMREF) research will build predictive tools for synthesis so that chemical compounds with interesting properties can be synthesized in a matter of days, rather than months or years. The research activities include using natural language processing techniques to automatically extract information from the published literature and patents on how solid inorganic materials have been made in the past. After this text extraction, the project will generate a "cookbook" of materials synthesis recipes. This cookbook can be mined with machine learning approaches for suggestions on how to make new materials, by looking for patterns and similarities among previously made materials. One project outcome will be a data set of materials synthesis methods, to be made available to the community. Another key outcome is the use of machine learning to predict novel or optimized recipes for materials. These predictions will be accompanied by experimental confirmation for zeolites, a class of materials used in catalysis. The major objective of the outreach component of this research is to enable the use of the database by non-experts. This will be accomplished through both online tutorials and in-person workshops. The online tutorials will teach the basic knowledge required to use the online tools and functionalities, while the workshops will be aimed at students and researchers who want to make use of the database itself.
The approach to automatic extraction of information from the literature will be semi-supervised from a machine learning perspective. Unsupervised methods, including word embeddings that capture the context of words within a scientific corpus, will be used first. Downstream supervised methods will then classify words by their type and their relationship to other words. This forms the basis of the recipe database. The extracted information will then be mined using machine learning tools from the materials informatics community. Because the recipe classification leverages expertise from the NLP perspective and the target material classification leverages expertise from the materials perspective, there is significant leverage to be had from this interdisciplinary approach, a partnership not previously pursued to further materials design. This approach builds on established synthesis knowledge and combines it with modern data extraction, materials informatics, text mining, and machine learning techniques, as well as the availability of high-throughput ab initio thermochemical data. The integration of these different fields will provide a direct route towards more rational design of synthesis methods and thereby significantly accelerate the deployment and testing of new materials concepts.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
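The semi-supervised extraction idea in the abstract (unsupervised word representations first, then a supervised classifier that assigns each word a type) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the project's actual pipeline: the example sentences and seed labels are invented, neighbour-count vectors stand in for learned word embeddings, and a nearest-centroid rule stands in for the downstream supervised model.

```python
from collections import Counter, defaultdict
import math

# Hypothetical unlabeled corpus of synthesis sentences (invented for this sketch).
SENTENCES = [
    "the powder was heated at 900 C for 12 h",
    "the mixture was calcined at 500 C for 6 h",
    "the gel was heated at 150 C for 48 h",
    "the precursor was calcined at 700 C for 2 h",
    "the powder was dried at 100 C for 24 h",
    "the gel was dried at 80 C for 12 h",
]

# Hypothetical seed labels: the small supervised signal.
SEED_LABELS = {
    "heated": "operation",
    "calcined": "operation",
    "powder": "material",
    "mixture": "material",
}

def context_vectors(sentences, window=1):
    """Unsupervised step: describe each word by counts of its neighbouring words."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroids(vectors, labeled):
    """Supervised step: sum the context vectors of the labeled words per type."""
    sums = defaultdict(Counter)
    for word, label in labeled.items():
        sums[label].update(vectors[word])
    return sums

def classify(word, vectors, cents):
    """Assign the type whose centroid is most similar to the word's context."""
    return max(cents, key=lambda label: cosine(vectors[word], cents[label]))

if __name__ == "__main__":
    vecs = context_vectors(SENTENCES)
    cents = centroids(vecs, SEED_LABELS)
    for unseen in ("dried", "gel", "precursor"):
        print(unseen, "->", classify(unseen, vecs, cents))
```

In this sketch, words that never received a label ("dried", "gel", "precursor") are typed purely from the contexts they share with the seed words, which is the sense in which a small amount of supervision is stretched over a large unlabeled corpus.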

Project outcomes

Journal articles (12)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity
  • DOI:
    10.18653/v1/2022.naacl-main.331
  • Publication date:
    2021-11
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sheshera Mysore;Arman Cohan;Tom Hope
  • Corresponding author:
    Sheshera Mysore;Arman Cohan;Tom Hope
Unsupervised Partial Sentence Matching for Cited Text Identification
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kathryn Ricci;Haw-Shiuan Chang;Purujit Goyal;A. McCallum
  • Corresponding author:
    Kathryn Ricci;Haw-Shiuan Chang;Purujit Goyal;A. McCallum
Augmenting Scientific Creativity with Retrieval across Knowledge Domains
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
Editable User Profiles for Controllable Text Recommendations
  • DOI:
    10.1145/3539618
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mysore, Sheshera;Jasim, Mahmood;McCallum, Andrew;Zamani, Hamed
  • Corresponding author:
    Zamani, Hamed

Other publications by Andrew McCallum

An Interoperable Multimedia Catalog System for Electronic Commerce.
  • DOI:
  • Publication date:
    2000
  • Journal:
  • Impact factor:
    0
  • Authors:
    William W. Cohen;Andrew McCallum;D. Quass
  • Corresponding author:
    D. Quass
Scaling Within Document Coreference to Long Texts
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Raghuveer Thirukovalluru;Nicholas Monath;K. Shridhar;M. Zaheer;Mrinmaya Sachan;Andrew McCallum
  • Corresponding author:
    Andrew McCallum
ezCoref : A Scalable Approach for Collecting Crowdsourced Annotations for Coreference Resolution
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. Crowdsourced;David Bamman;Olivia Lewke;Rachel Bawden;Rico Sennrich;Alexandra Birch;Ari Bornstein;Arie Cattan;Ido Dagan;Hong Chen;Zhenhua Fan;Hao Lu;Alan Yuille;Eduard Hovy;Mitch Marcus;M. Palmer;Lance;Rodney Huddleston. 2002;Frédéric Landragin;T. Poibeau;Bernard Vic;Belinda Z. Li;Gabriel Stanovsky;Robert L Logan;Andrew McCallum;Sameer Singh
  • Corresponding author:
    Sameer Singh
PaRaDe: Passage Ranking using Demonstrations with Large Language Models
  • DOI:
    10.48550/arxiv.2310.14408
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Andrew Drozdov;Honglei Zhuang;Zhuyun Dai;Zhen Qin;Razieh Rahimi;Xuanhui Wang;Dana Alon;Mohit Iyyer;Andrew McCallum;Donald Metzler;Kai Hui
  • Corresponding author:
    Kai Hui
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Qi Cheng;Michael Boratko;Pranay Kumar Yelugam;T. O’Gorman;Nalini Singh;Andrew McCallum;X. Li
  • Corresponding author:
    X. Li

Other grants held by Andrew McCallum

Collaborative Research: SOS-DCI / HNDS-R: Advancing Semantic Network Analysis to Better Understand How Evaluative Exchanges Shape Scientific Arguments
  • Award number:
    2244805
  • Fiscal year:
    2023
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
RI: Medium: Probabilistic Box Embeddings
  • Award number:
    2106391
  • Fiscal year:
    2021
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
RI: Medium: Extreme Clustering
  • Award number:
    1763618
  • Fiscal year:
    2018
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
DMREF: Collaborative Research: The Synthesis Genome: Data Mining for Synthesis of New Materials
  • Award number:
    1534431
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
III: Medium: Constructing Knowledge Bases by Extracting Entity-Relations and Meanings from Natural Language via "Universal Schema"
  • Award number:
    1514053
  • Fiscal year:
    2015
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
The Fourth Northeast Student Colloquium on Artificial Intelligence
  • Award number:
    1036017
  • Fiscal year:
    2010
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
CI-ADDO-EN: Flexible Machine Learning for Natural Language in the MALLET Toolkit
  • Award number:
    0958392
  • Fiscal year:
    2010
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
RI-Medium: Collaborative Research: Dynamically-Structured Conditional Random Fields for Complex, Natural Domains
  • Award number:
    0803847
  • Fiscal year:
    2008
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
CRI: Collaborative Research: Improving Experimental Computer Science with a Searchable Web Portal for Data Sets
  • Award number:
    0551597
  • Fiscal year:
    2006
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
ITR: Collaborative Research: (ACS+NHS)-(dmc+soc): Machine Learning for Sequences and Structured Data: Tools for Non-Experts
  • Award number:
    0427594
  • Fiscal year:
    2004
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant

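Similar National Natural Science Foundation of China (NSFC) grants

Research on the relationships among team human capital hierarchy types, team collaboration processes, and team effectiveness outcomes in the context of digital intelligence
  • Award number:
    72372084
  • Approval year:
    2023
  • Funding amount:
    CNY 400,000
  • Project type:
    General Program
Research on collaboration models and performance improvement strategies for online medical teams
  • Award number:
    72371111
  • Approval year:
    2023
  • Funding amount:
    CNY 410,000
  • Project type:
    General Program
Research on interaction control methods for collaborative robots in contact-based human-robot cooperative tasks
  • Award number:
    62373044
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
Research on key technologies for a digital-twin-based craniomaxillofacial human-robot collaborative intelligent surgical robot
  • Award number:
    82372548
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Research on the mechanism by which A-type crystalline resistant starch regulates cooperative butyrate production by gut bacteria
  • Award number:
    32302064
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund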

Similar overseas grants

Collaborative Research: DMREF: Closed-Loop Design of Polymers with Adaptive Networks for Extreme Mechanics
  • Award number:
    2413579
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: DMREF: Organic Materials Architectured for Researching Vibronic Excitations with Light in the Infrared (MARVEL-IR)
  • Award number:
    2409552
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
Collaborative Research: DMREF: AI-enabled Automated design of ultrastrong and ultraelastic metallic alloys
  • Award number:
    2411603
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: DMREF: Predicting Molecular Interactions to Stabilize Viral Therapies
  • Award number:
    2325392
  • Fiscal year:
    2023
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: DMREF: Topologically Designed and Resilient Ultrahigh Temperature Ceramics
  • Award number:
    2323458
  • Fiscal year:
    2023
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant