Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science


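Basic information

  • Award number:
    2148700
  • Principal investigator:
  • Amount:
    $152,800
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-10-01 to 2024-08-31
  • Project status:
    Completed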

Project abstract

Scientists collect terabytes of critical data every year. Recently, a strong open science movement has generated traction for the beneficial practice of sharing data across laboratories, universities, and research institutions. Yet sharing data is not enough. Data must be shared using standardized formats and accompanied by curated metadata to allow for tracking, search, and organization. Metadata are essential for scientific discovery, as they are routinely used to complete all data analyses. However, to date, most brain projects focus on collecting or analyzing data, not on metadata management. Typical metadata records consist of heterogeneous study descriptions, developed at the study release stage, without consistency across records or standard mechanisms to track changes. This project will increase access to brain data and improve metadata handling by combining two NSF-funded projects. It will develop a first-of-its-kind metadata management system able to track data and metadata distributed across heterogeneous geographical locations, storage systems, and data formats. This portion of the project will expand the functionality of a previously funded NSF project, DataLad. DataLad will also be enhanced to interoperate with major data repositories such as OSF and Figshare. Furthermore, the project will use the NSF-funded cloud computing platform brainlife.io to create a data and metadata marketplace by gathering data from multiple, currently separate repositories into a single ecosystem. The goal is to improve interoperability across open science projects and make data and metadata easily searchable and available for computing on national cyberinfrastructure systems, ultimately advancing scientific discovery by increasing data discoverability, utilization, and publication.
This project will generate various technological advances. The core target will be an extensible system capable of automated gathering of metadata from various domains. It will comprise two major components: 1) a set of metadata parser algorithms that extract metadata from datasets and individual files using a flexible JSON-LD based data structure (with the ability to encode controlled vocabularies where available), and 2) an aggregation procedure that merges the extracted metadata across parsers and stores them in compressed files that are optimized for bandwidth-efficient exchange and can be queried directly or used as input to SQL or graph databases for data discovery applications. Extracted metadata will be included within the same datasets under Git and git-annex version control for unambiguous referencing and versatile data logistics.
In parallel, we will improve the interoperability of DataLad with existing data publishing portals (such as Figshare and OSF) by taking advantage of extracted metadata (e.g., Author, Description) to prefill required fields, and also by bundling the entire Git object store within the publication so that such published datasets can be installed again by DataLad without any loss of information. To make such published datasets discoverable, we will establish a crowd-sourced registry (with a RESTful API) that will receive announcements of new datasets upon publication and aggregate their metadata to enable querying across datasets and data hosting providers. The final development will be the integration of DataLad within the brainlife.io data marketplace. This will make it possible to search and install datasets on brainlife.io, as well as to process the data using the brainlife.io analysis Apps on various NSF-funded national cyberinfrastructure high-throughput computing systems.
A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
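The two components named in the abstract (per-file metadata parsers that emit JSON-LD style records, and an aggregation step that merges them into compressed, directly queryable files) can be illustrated with a minimal sketch. This is not DataLad's API or the project's implementation: every function name below is hypothetical, the use of schema.org terms is an assumption, `lineCount` is an invented illustrative key rather than a controlled-vocabulary term, and gzip-compressed JSON Lines is only one plausible storage format.

```python
import gzip
import json
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical extractor signature: file path in, JSON-LD style record out.
Extractor = Callable[[Path], Dict]

SCHEMA_CONTEXT = "https://schema.org/"  # assumed vocabulary for this sketch


def basic_file_extractor(path: Path) -> Dict:
    """Describe a file by name and size using schema.org terms."""
    return {
        "@context": SCHEMA_CONTEXT,
        "@type": "DigitalDocument",
        "@id": path.as_posix(),
        "name": path.name,
        "contentSize": path.stat().st_size,
    }


def text_extractor(path: Path) -> Dict:
    """Add a line count for .txt files; return an empty record otherwise."""
    if path.suffix != ".txt":
        return {}
    with path.open(encoding="utf-8") as fh:
        # "lineCount" is an illustrative key, not a schema.org property.
        return {"@id": path.as_posix(), "lineCount": sum(1 for _ in fh)}


def aggregate(root: Path, extractors: List[Extractor]) -> List[Dict]:
    """Run every extractor on every file and merge records sharing an @id."""
    merged: Dict[str, Dict] = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        for extract in extractors:
            record = extract(path)
            if record:
                merged.setdefault(record["@id"], {}).update(record)
    return list(merged.values())


def write_compressed(records: List[Dict], target: Path) -> None:
    """Store one JSON document per line in a gzip-compressed file."""
    with gzip.open(target, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, sort_keys=True) + "\n")


def read_compressed(source: Path) -> List[Dict]:
    """Load the records back, e.g. to query them or feed a database."""
    with gzip.open(source, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh]
```

Keying the merge on `@id` lets independent parsers contribute fields to the same record without knowing about each other, and a line-oriented compressed file can be streamed record by record instead of being loaded whole.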

Project outcomes

Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Development of white matter tracts between and within the dorsal and ventral streams
  • DOI:
    10.1007/s00429-021-02414-5
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    S. Vinci-Booher; B. Caron; D. Bullock; K. James; F. Pestilli
  • Corresponding authors:
    S. Vinci-Booher; B. Caron; D. Bullock; K. James; F. Pestilli
Neurodesk: an accessible, flexible and portable data analysis environment for reproducible neuroimaging
  • DOI:
    10.1038/s41592-023-02145-x
  • Publication date:
    2024-01-08
  • Journal:
  • Impact factor:
    48
  • Authors:
    Renton, Angela I.; Dao, Thuy T.; Bollmann, Steffen
  • Corresponding author:
    Bollmann, Steffen

Other publications by Franco Pestilli

The visual dorsal and ventral streams communicate through the vertical occipital fasciculus
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hiromasa Takemura; Franco Pestilli; Ariel Rokem; Jonathan Winawer; Jason D. Yeatman; Brian A. Wandell
  • Corresponding author:
    Brian A. Wandell
574. Separable White Matter Pathways Associated With Counterconditioning and Fear Extinction
  • DOI:
    10.1016/j.biopsych.2023.02.814
  • Publication date:
    2023-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Patrick Laing; Nicole Keller; Franco Pestilli; Joseph Dunsmoor
  • Corresponding author:
    Joseph Dunsmoor
New technologies for precision brain science: studying individuality and variability in large human populations.
  • DOI:
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Franco Pestilli; Cesar Caiafa; Hiromasa Takemura (竹村浩昌)
  • Corresponding author:
    Hiromasa Takemura (竹村浩昌)

Other grants held by Franco Pestilli

NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    2203524
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    2148729
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    1912270
  • Fiscal year:
    2019
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    1734853
  • Fiscal year:
    2017
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    1636893
  • Fiscal year:
    2016
  • Award amount:
    $152,800
  • Award type:
    Standard Grant

Similar NSFC (National Natural Science Foundation of China) grants

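The effect of empathic concern for the proposer on third-party punishment: psychological, neural, and computational mechanisms
  • Award number:
    32371102
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Award type:
    General Program
Proposers' allocation fairness toward prior third-party interveners in economic games
  • Award number:
  • Approval year:
    2020
  • Award amount:
    CNY 240,000
  • Award type:
    Young Scientists Fund
Visual tracking methods based on similarity measurement of deep hierarchical features
  • Award number:
    61773397
  • Approval year:
    2017
  • Award amount:
    CNY 650,000
  • Award type:
    General Program
Research on construction-type expert systems and their development tools
  • Award number:
    68875006
  • Approval year:
    1988
  • Award amount:
    CNY 20,000
  • Award type:
    General Program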

Similar overseas grants

CRCNS US-German Collaborative Research Proposal: Neural and computational mechanisms of flexible goal-directed decision making
  • Award number:
    2309022
  • Fiscal year:
    2024
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207770
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Continuing Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207747
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207727
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207700
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant