Data Science Core
Basic Information
- Grant number: 10490235
- Principal investigator:
- Amount: $730,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-17 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords: Address; Adopted; Algorithmic Analysis; Algorithms; Animal Experiments; Animals; Architecture; Behavior; Behavioral; Benchmarking; Brain; Catalogs; Code; Collaborations; Communication; Communities; Computer Analysis; Computer Models; Computer software; Consumption; Creativeness; Custom; Data; Data Analyses; Data Science Core; Data Set; Data Storage and Retrieval; Databases; Discipline; Doctor of Philosophy; Documentation; Environment; Feedback; Genetic; Goals; Guidelines; Industrialization; Lead; Link; Measurement; Metadata; Mus; Neurosciences; Online Systems; Optics; Postdoctoral Fellow; Process; Reproducibility; Research; Research Personnel; Rest; Retrieval; Route; Running; Software Engineering; Sorting - Cell Movement; Source Code; Speed; Standardization; Statistical Data Interpretation; Students; System; Testing; Time; Training; Universities; Work; Writing; base; computerized data processing; data access; data analysis pipeline; data archive; data format; data handling; data management; data modeling; data pipeline; data sharing; data standards; database query; experience; experimental study; flexibility; graphical user interface; member; open data; open source; recruit; relating to nervous system; repository; searchable database; success; tool; working group
Project Abstract
Data Science Core (DSC)
Leads: Krishna Shenoy PhD and Chris Roat PhD (with Surya Ganguli PhD)
Project Summary
Given the large volumes of optical, electrical, genetic and behavioral data that will be generated, stored and
computationally analyzed, it is essential to establish a comprehensive and yet streamlined DSC. There are four
major data challenges that the DSC will address. (1) Data size. Each experimental lab will generate very large,
and rapidly increasing, datasets. We must contend with storing, pre-processing (e.g., spike sorting) and
processing (e.g., single-trial analyses) these large and growing datasets. (2) Metadata. Collaborations between
groups are often hampered by not fully capturing – in a searchable database and linked to the bulk data – all
animal and experiment conditions, or so-called metadata. We will build in capabilities and requirements to
electronically capture full metadata. (3) Data format. Collaborations are also often hampered by the effort
required to understand each lab’s dataset format. Data format often depends on whether a given measurement
system was custom built or relies on a commercial system. We will capture this information as part of the
metadata for historical data relevant to this U19, and moving forward we will adopt the increasingly popular
Neurodata Without Borders (NWB) data format. Finally, (4) Across animals and labs. Performing large-scale
analyses across many animals and labs is often truly onerous. This is because all three of the challenges
listed above combine, causing one to shy away from anything other than essential analyses (e.g., pooling results
across just a few mice in one specific condition). We will build our own data pipelines that automatically
query our metadata database and then retrieve the indicated experimental data, and we will also adopt the
increasingly popular DataJoint pipeline.
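
The query-then-retrieve flow described for challenge (4) can be illustrated with a minimal sketch. This is not the DSC's implementation: the `sessions` table, its columns, the lab and animal names, and the file paths are all hypothetical, and Python's built-in `sqlite3` stands in for whatever relational metadata database the project actually uses.

```python
import sqlite3
from collections import defaultdict

def build_demo_db():
    """Create an in-memory metadata table (hypothetical schema, toy rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE sessions (
               session_id INTEGER PRIMARY KEY,
               lab        TEXT,
               animal_id  TEXT,
               condition  TEXT,
               data_path  TEXT)"""
    )
    rows = [
        ("lab_a", "mouse01", "reach", "/bulk/lab_a/mouse01_s1.nwb"),
        ("lab_a", "mouse02", "reach", "/bulk/lab_a/mouse02_s1.nwb"),
        ("lab_b", "mouse07", "reach", "/bulk/lab_b/mouse07_s1.nwb"),
        ("lab_b", "mouse07", "rest", "/bulk/lab_b/mouse07_s2.nwb"),
    ]
    conn.executemany(
        "INSERT INTO sessions (lab, animal_id, condition, data_path) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn

def find_sessions(conn, condition):
    """Step 1: query the metadata for every session in a given condition."""
    cur = conn.execute(
        "SELECT lab, animal_id, data_path FROM sessions "
        "WHERE condition = ? ORDER BY session_id",
        (condition,),
    )
    return cur.fetchall()

def group_paths_by_lab(matches):
    """Step 2: collect the bulk-data paths to retrieve, grouped by lab."""
    grouped = defaultdict(list)
    for lab, _animal_id, data_path in matches:
        grouped[lab].append(data_path)
    return dict(grouped)

conn = build_demo_db()
matches = find_sessions(conn, "reach")
paths_by_lab = group_paths_by_lab(matches)
print(paths_by_lab)
```

Because the metadata row links directly to the bulk data file, pooling across animals and labs reduces to a single query rather than a per-lab manual search.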
Our DSC will be led by Prof. Shenoy, Dr. Roat (with considerable industrial-scale data handling
experience, and now at Stanford) and Prof. Ganguli (RP3 lead). Two full-time software engineers (TBD) will
implement the DSC architecture, including a bulk data server, a relational metadata database, data standards and
a data pipeline. The software engineers will work closely with the rest of the team to help ensure good
communication, and to help migrate analysis code and documentation to professional software standards for
dissemination. This will enable storage, retrieval and analysis of data in an efficient and modular way, so that
any piece of the data analysis pipeline can be replaced rapidly, as is essential for a creative environment
that also promotes rapid feedback of emerging ideas to subsequent experiments. We believe in Open Science,
including open source code (e.g., GitHub) and data formats. We will share data with the broader community,
including with other U19 consortia. Thus our DSC is critical to the success of our proposed research, and
serves as the central hub of our U19 research.
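
The modularity goal, replacing any one piece of the analysis pipeline without touching the rest, can be sketched as a pipeline of named, interchangeable stages. This is an illustrative toy only: the stage names, the threshold value, and the sample data are assumptions, and the functions are stand-ins for real pre-processing and analysis steps.

```python
from typing import Callable, Dict, List

# A stage takes a list of samples and returns a list of samples.
Stage = Callable[[List[float]], List[float]]

def threshold_events(samples: List[float]) -> List[float]:
    """Toy pre-processing stage: keep samples above a fixed threshold of 1.0."""
    return [s for s in samples if s > 1.0]

def mean_summary(samples: List[float]) -> List[float]:
    """Toy analysis stage: reduce the samples to their mean."""
    return [sum(samples) / len(samples)] if samples else []

def max_summary(samples: List[float]) -> List[float]:
    """Alternative analysis stage: reduce the samples to their maximum."""
    return [max(samples)] if samples else []

def run_pipeline(stages: Dict[str, Stage], order: List[str],
                 data: List[float]) -> List[float]:
    """Apply the named stages in order, feeding each output to the next."""
    for name in order:
        data = stages[name](data)
    return data

stages: Dict[str, Stage] = {"preprocess": threshold_events,
                            "analyze": mean_summary}
order = ["preprocess", "analyze"]
raw = [0.5, 2.0, 4.0, 0.1]

baseline = run_pipeline(stages, order, raw)

# Swap only the analysis stage; pre-processing is left untouched.
stages["analyze"] = max_summary
swapped = run_pipeline(stages, order, raw)

print(baseline, swapped)
```

Keeping each stage behind the same call signature is what makes the swap a one-line change.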
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Krishna V Shenoy
Initial conditions combine with sensory evidence to induce decision-related dynamics in premotor cortex
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 16.6
- Authors: Pierre O Boucher; Tian Wang; Laura Carceroni; Gary A. Kane; Krishna V Shenoy; Chandramouli Chandrasekaran
- Corresponding author: Chandramouli Chandrasekaran
Other grants by Krishna V Shenoy
CRCNS: Extracting Dynamical Structure Embedded in Motor Preparatory Activity
- Grant number: 7276049
- Fiscal year: 2005
- Funding amount: $730,700
- Project category:
CRCNS: Extracting Dynamical Structure Embedded in Motor Preparatory Activity
- Grant number: 7047377
- Fiscal year: 2005
- Funding amount: $730,700
- Project category:
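Similar National Natural Science Foundation of China (NSFC) grants
Role and mechanism of vascular endothelial cells regulating macrophage activation via the E2F1/NF-kB/IL-6 axis in orbital venous malformation
- Grant number: 82301257
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Sleep deprivation aggravates asthma by upregulating the BMAL1/IL-17 axis to promote tertiary lymphoid structure formation
- Grant number: 82300039
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Mechanism by which S100A6 promotes impaired keratinocyte differentiation in diabetes by regulating ZNF750 histone methylation
- Grant number: 82302802
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Mechanism by which cancer-associated fibroblasts promote cisplatin resistance in neuroendocrine prostate cancer via the CCL5/CCR5 axis
- Grant number: 82373358
- Approval year: 2023
- Funding amount: 490,000 CNY
- Project category: General Program
Molecular mechanism by which nasal commensal Staphylococcus epidermidis suppresses allergic rhinitis via the antimicrobial peptide-moDC-CCL17 pathway
- Grant number: 82302595
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund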
Similar overseas grants
Achieving Model Fairness on Automatic Primary Open-angle Glaucoma Screening
- Grant number: 10726928
- Fiscal year: 2023
- Funding amount: $730,700
- Project category:
METEOR-Data Synthesis and Transfer (METEOR-DST)
- Grant number: 10715025
- Fiscal year: 2023
- Funding amount: $730,700
- Project category:
A visualization interface for BRAIN single cell data, integrating transcriptomics, epigenomics and spatial assays
- Grant number: 10643313
- Fiscal year: 2023
- Funding amount: $730,700
- Project category:
Remote Kinesiology for Improving Human Health with Auto-locating Compliant Motion Tracking Stickers and Artificial Intelligence
- Grant number: 10751952
- Fiscal year: 2023
- Funding amount: $730,700
- Project category:
Brain Digital Slide Archive: An Open Source Platform for data sharing and analysis of digital neuropathology
- Grant number: 10735564
- Fiscal year: 2023
- Funding amount: $730,700
- Project category: