AXS - Enabling Analysis of Petascale Astronomical Datasets

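Basic Information

  • Award number:
    2003196
  • Principal investigator:
  • Amount:
    $579,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-08-01 to 2024-07-31
  • Project status:
    Completed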

Project Abstract

Astronomy is being rapidly transformed by the advent of large, automated, digital sky surveys into a field where petabyte tabular data sets are becoming commonplace. Unfortunately, this increase has not been followed by commensurate improvements in the tools and frameworks: we are now limited not by the richness of our datasets, but by an inability to mine them for knowledge. When the most challenging questions of the day demand repeated, complex processing of large information-rich tabular datasets, scalable and stable tools that are easy to use by scientists are crucial. This is a project to develop, package, and deploy the Astronomical eXtensions for Spark (AXS), a scalable open-source astronomical data analysis framework built on Apache Spark. AXS will make it possible for astronomers, including and perhaps especially those who are not data management experts, to devise and execute astronomical big data analyses using industry-standard tools. This will be a transformative increase in the community's ability to extract knowledge from datasets collected at great expense, thus unlocking their value across all areas of astronomy. There will be opportunities for knowledge transfers and partnerships between industry and academia. The techniques used and created will be taught within the astronomy curriculum, and those curriculum materials will be made public. This will improve the competitiveness of astronomy students in careers beyond astronomy, and it will help to develop a globally competitive STEM workforce.

AXS will enable astronomers to scale their analysis from a personal laptop to thousands of nodes on either cloud or NSF-supported cyberinfrastructure (CI). This system has already been prototyped, and leverages Spark, a state-of-the-art industry-standard engine for big data processing, to make it possible to query and analyze almost arbitrarily large astronomical catalogs while supporting complex workflows with astronomy-specific operations. The tool will be accompanied by a hosted demonstration service, documentation, and support for deployment on NSF CI resources and public cloud platforms. For long-term sustainability, AXS will be developed in a tight loop with major stakeholders, built on open source tools and processes, and strongly integrated with AstroPy and the PyData stack, which are widely used in astronomy. AXS will also robustly scale to large computational clusters, making both NSF-supported and public CI more accessible to astronomers. Developments by this project will enable other industrial and academic applications, especially those dealing with large, tabular, spatio-temporal datasets indexed on a sphere, such as geospatial analysis.

This award by the Division of Astronomical Sciences within the NSF Directorate of Mathematical and Physical Sciences is jointly supported by the NSF Office of Advanced Cyberinfrastructure.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The Astronomy Commons Platform: A Deployable Cloud-based Analysis Platform for Astronomy
  • DOI:
    10.3847/1538-3881/ac77fb
  • Publication date:
    2022-06
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Stetzler; M. Jurić; Kyle Boone; Andrew J. Connolly; C. Slater; Petar Zečević
  • Corresponding authors:
    S. Stetzler; M. Jurić; Kyle Boone; Andrew J. Connolly; C. Slater; Petar Zečević
A Scalable Cloud-Based Analysis Platform for Survey Astronomy
  • DOI:
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    Stetzler, S; Slater, C; Zecevic, P; Juric, M.
  • Corresponding author:
    Juric, M.

Other Publications by Mario Juric

Jupiter’s Metastable Companions
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    7.9
  • Authors:
    S. Greenstreet; B. Gladman; Mario Juric
  • Corresponding author:
    Mario Juric

Similar Overseas Grants

Investigating how bHLH circuits integrate signals for cell fate decisions
  • Award number:
    10722452
  • Fiscal year:
    2023
  • Funding amount:
    $579,300
  • Project type:
Next generation massively multiplexed combinatorial genetic screens
  • Award number:
    10587354
  • Fiscal year:
    2023
  • Funding amount:
    $579,300
  • Project type:
Investigation of STAT2 Signaling in the tumor microenvironment
  • Award number:
    10661993
  • Fiscal year:
    2023
  • Funding amount:
    $579,300
  • Project type:
Enabling the Assessment of Fetal Brain Development and Degeneration with Machine Learning
  • Award number:
    10659817
  • Fiscal year:
    2023
  • Funding amount:
    $579,300
  • Project type:
Development of a peptide/HLA complex isolation system enabling comprehensive sarcoma antigen analysis
  • Award number:
    23K15720
  • Fiscal year:
    2023
  • Funding amount:
    $579,300
  • Project type:
    Grant-in-Aid for Early-Career Scientists