EarthCube Data Capabilities: Collaborative Research: Integration of Reproducibility into Community Cyberinfrastructure
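Basic Information
- Award number: 1928288
- Principal investigator:
- Amount: $331,900
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract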
For science to reliably support new discoveries, its results must be reproducible. This has proven to be a challenge in many fields, most notably those that rely on computational studies to support new discoveries. Reproducibility in these studies is particularly difficult because they require open sharing of data and models and careful control by the original researcher, so that products can be run on later generations of hardware and software and still produce consistent results. This project will develop software that supports computational reproducibility and makes it easier and more efficient for geoscientists to preserve, share, repeat, and replicate scientific computations. The Broader Impacts of this project include a collaboration between computer scientists, hydrologists, and the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) for the hydrology research community. With over 3500 users, and holding over 8000 model and data resources, this collaboration will bring improved tools and best practices to a broad and diverse community of geoscientists. Beyond hydrology, the methods and tools developed as part of this project have the potential to be extended to the solid Earth and space science geoscience domains. They also have the potential to inform the reproducibility evaluation process as currently undertaken by journals and publishers. The project will also conduct workshops to train researchers, and its tools will be used in the classroom at Utah State University, DePaul University, and the University of Virginia.
Emphasis on the importance of research reproducibility is steadily rising; however, many studies are still not reproducible. Reproducibility in computational studies is particularly difficult because of the challenges involved in completely documenting the data, models, and procedures used, together with the underlying hardware and software dependencies. The reproducibility workbench software (ReproBench) developed in this project will address reproducibility questions by establishing a container-based reproducible workflow that will make it easy and efficient for geoscientists to verify scientific results. Automation and documentation are two key methods for improving verification and, in general, the conduct of reproducible science. This project will build on past investments: (I) automated containerization methods, through the Sciunit project, and (II) well-documented, community-adopted interfaces, through HydroShare, and bring these investments together to establish a novel, robust, and reproducible workflow. By applying this workflow to water-related science use cases, this project will demonstrate how to preserve, share, repeat, and replicate scientific results. The interfaces can become an exemplar for other community cyberinfrastructure that, like hydrology's, aims to share data and models at a large scale. In establishing this workflow, the ReproBench project team combines expertise in cyberinfrastructure, domain science, and reproducible computational data science. By leveraging Sciunit, ReproBench brings formal methods for the conduct of reproducible computational science into the geosciences.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Efficient Provenance Alignment in Reproduced Executions
- DOI:
- Publication date: 2020-01
- Journal:
- Impact factor: 0
- Authors: Y. Nakamura; T. Malik
- Corresponding author: T. Malik
Comparing containerization-based approaches for reproducible computational modeling of environmental systems
- DOI: 10.1016/j.envsoft.2023.105760
- Publication date: 2023-09
- Journal:
- Impact factor: 4.9
- Authors: Choi, Young; Roy, Binata; Nguyen, Jared; Ahmad, Raza; Maghami, Iman; Nassar, Ayman; Li, Zhiyu; Castronova, Anthony M.; Malik, Tanu; Wang, Shaowen; et al
- Corresponding author: et al
A taxonomy for reproducible and replicable research in environmental modelling
- DOI: 10.1016/j.envsoft.2020.104753
- Publication date: 2020-06
- Journal:
- Impact factor: 4.9
- Authors: Essawy, Bakinam T.; Goodall, Jonathan L.; Voce, Daniel; Morsy, Mohamed M.; Sadler, Jeffrey M.; Choi, Young Don; Tarboton, David G.; Malik, Tanu
- Corresponding author: Malik, Tanu
Provenance-based Workflow Diagnostics Using Program Specification
- DOI: 10.1109/hipc56025.2022.00046
- Publication date: 2022-12
- Journal:
- Impact factor: 0
- Authors: Nakamura, Yuta; Malik, Tanu; Kanj, Iyad; Gehani, Ashish
- Corresponding author: Gehani, Ashish
CHEX: Multiversion Replay with Ordered Checkpoints.
- DOI: 10.14778/3514061.3514075
- Publication date: 2022-01
- Journal:
- Impact factor: 0
- Authors: Naga Nithin Manne; Shilvi Satpati
- Corresponding author: Shilvi Satpati
Other publications by Tanu Malik
Kondo: Efficient Provenance-Driven Data Debloating
- DOI:
- Publication date:
- Journal:
- Impact factor: 0
- Authors: A. Modi; Rohan Tikmany; Tanu Malik; Raghavan Komondoor; Ashish Gehani; Deepak D’Souza
- Corresponding author: Deepak D’Souza
Access-Based Carving of Data for Efficient Reproducibility of Containers
- DOI: 10.48550/arxiv.2305.04641
- Publication date: 2023-05-08
- Journal:
- Impact factor: 0
- Authors: Rohan Tikmany; A. Modi; Raffay Atiq; Moaz Reyad; Ashish Gehani; Tanu Malik
- Corresponding author: Tanu Malik
The Second Data Release of the Sloan Digital Sky Survey
- DOI: 10.1086/421365
- Publication date: 2004-03-13
- Journal:
- Impact factor: 0
- Authors: K. Abazajian; J. Adelman; M. Agüeros; S. Allam; Kurt S. Anderson; S. Anderson; J. Annis; N. Bahcall; I. Baldry; Steven Bastian; A. Berlind; M. Bernardi; M. Blanton; J. Bochanski; W. Boroski; J. Briggs; J. Brinkmann; R. Brunner; Tamás Budavári; L. Carey; S. Carliles; F. Castander; A. Connolly; István Csabai; M. Doi; F. Dong; D. Eisenstein; M. Evans; Xiaohui Fan; D. Finkbeiner; S. Friedman; J. Frieman; M. Fukugita; Roy R. Gal; B. Gillespie; K. Glazebrook; James Douglas Allan Gray; E. Grebel; J. Gunn; V. Gurbani; P. Hall; M. Hamabe; F. Harris; C. Hugh Harris; M. Harvanek; T. Heckman; J. Hendry; G. Hennessy; R. Hindsley; C. Hogan; D. Hogg; D. Holmgren; Shin Ichikawa; T. Ichikawa; Ž. Ivezić; S. Jester; D. Johnston; Anders M. Jorgensen; S. Kent; S. Kleinman; G. Knapp; A. Kniazev; R. Kron; J. Krzesinski; P. Kunszt; N. Kuropatkin; Q. Donald Lamb; H. Lampeitl; Brian C. Lee; R. F. Leger; Nolan Li; Huan Lin; Y. Loh; D. Long; J. Loveday; R. Lupton; Tanu Malik; Bruce Margon; T. Matsubara; P. McGehee; T. Mckay; Avery Meiksin; J. Munn; R. Nakajima; T. Nash; E. Neilsen; Heidi Jo Newberg; P. Newman; R. Nichol; T. Nicinski; M. Nieto; A. Nitta; S. Okamura; W. O'Mullane; J. Ostriker; R. Owen; N. Padmanabhan; J. Peoples; J. Pier; A. Pope; Thomas R. Quinn; G. Richards; M. Richmond; H. Rix; C. Rockosi; D. Schlegel; D. Schneider; R. Scranton; M. Sekiguchi; U. Seljak; G. Sergey; B. Sesar; E. Sheldon; K. Shimasaku; Walter A. Siegmund; N. Silvestri; J. A. Smith; V. Smolčić; S. Snedden; Albert Stebbins; C. Stoughton; M. Strauss; M. Subbarao; A. Szalay; I. Szapudi; P. Szkody; G. Szokoly; M. Tegmark; Luis Teodoro; Aniruddha R. Thakar; C. Tremonti; D. Tucker; A. Uomoto; D. Berk; Jan Vandenberg; M. Vogeley; W. Voges; N. Vogt; M. Lucianne Walkowicz; Shu; D. Weinberg; A. West; S. White; Brian C. Wilhite; Yongzhong Xu; B. Yanny; N. Yasuda; C. Yip; D. Yocum; D. York; I. Zehavi; S. Zibetti; D. Zucker
- Corresponding author: D. Zucker
Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21
- DOI:
- Publication date: 2022-01
- Journal:
- Impact factor: 0
- Authors: Tanu Malik; Anjo Vahldiek
- Corresponding author: Anjo Vahldiek
Genistein and daidzein
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: L. Sarao; S. Kaur; Tanu Malik; Ashutosh Kumar Singh
- Corresponding author: Ashutosh Kumar Singh
Other grants by Tanu Malik
CAREER: Advanced Containers for Reproducibility in Computational and Data Science
- Award number: 1846418
- Fiscal year: 2019
- Amount: $331,900
- Project type: Continuing Grant
EarthCube Building Block: GeoDataspace: Simplifying Data Management for Geoscience Models
- Award number: 1722152
- Fiscal year: 2016
- Amount: $331,900
- Project type: Standard Grant
EarthCube Building Blocks: Collaborative Proposal: GeoTrust: Improving Sharing and Reproducibility of Geoscience Applications
- Award number: 1639759
- Fiscal year: 2016
- Amount: $331,900
- Project type: Standard Grant
EarthCube IA: Collaborative Proposal: Integrated GeoScience Observatory
- Award number: 1661918
- Fiscal year: 2016
- Amount: $331,900
- Project type: Standard Grant
EarthCube IA: Collaborative Proposal: Integrated GeoScience Observatory
- Award number: 1540901
- Fiscal year: 2015
- Amount: $331,900
- Project type: Standard Grant
EarthCube Building Block: GeoDataspace: Simplifying Data Management for Geoscience Models
- Award number: 1440327
- Fiscal year: 2014
- Amount: $331,900
- Project type: Standard Grant
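Similar National Natural Science Foundation of China (NSFC) grants
Research on data- and knowledge-driven cross-scenario risk assessment methods for balance ability in older adults
- Award number: 62302461
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Effects, mechanisms, and policy simulation of extreme climate's impact on overall housing affordability: a study based on Chinese micro-level big data
- Award number: 72304248
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Research on behavioral-data-driven dynamic modeling and intelligent diagnosis of online learners' metacognitive ability
- Award number: 62307016
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Research on machine-learning prediction of laboratory earthquakes with strong generalization ability based on bimodal data
- Award number: 42374070
- Approval year: 2023
- Amount: CNY 530,000
- Project type: General Program
Research on online self-evolution mechanisms for intelligent-vehicle environment perception based on unlabeled data
- Award number: 52372414
- Approval year: 2023
- Amount: CNY 540,000
- Project type: General Program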
Similar overseas grants
EarthCube Data Capabilities: Collaborative Proposal: Reducing Time-To-Science in the Earth Sciences: Annotations to foster convergence, inclusion, and credit
- Award number: 2246427
- Fiscal year: 2022
- Amount: $331,900
- Project type: Standard Grant
Collaborative Research: EarthCube Capabilities: Raijin: Community Geoscience Analysis Tools for Unstructured Mesh Data
- Award number: 2126458
- Fiscal year: 2021
- Amount: $331,900
- Project type: Standard Grant
Collaborative Research: EarthCube Data Capabilities: Volcanology hub for Interdisciplinary Collaboration, Tools and Resources (VICTOR)
- Award number: 2125974
- Fiscal year: 2021
- Amount: $331,900
- Project type: Standard Grant
EarthCube Capabilities: CloudDrift: a platform for accelerating research with Lagrangian climate data
- Award number: 2126413
- Fiscal year: 2021
- Amount: $331,900
- Project type: Standard Grant
Collaborative Research: EarthCube Capabilities: Repurposing FAIR-Compliant Earth Science Data Repositories
- Award number: 2126427
- Fiscal year: 2021
- Amount: $331,900
- Project type: Standard Grant