Collaborative Research: Frameworks: Scalable Performance and Accuracy analysis for Distributed and Extreme-scale systems (SPADE)

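Basic Information

  • Award number:
    2311707
  • Principal investigator:
  • Amount:
    $2.1027 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Period:
    2023-09-15 to 2027-08-31
  • Project status:
    Ongoing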

Project Abstract

Advances in computer simulations have made scientific discoveries more accessible. However, with the evolution of computing technology, each new generation of hardware and software presents unique performance and reliability challenges. These challenges must be addressed to fully harness the potential of these evolving technologies. SPADE is a project that aims to tackle these issues head-on. At its core, SPADE builds on the PAPI performance monitoring library, a tool used by the High-Performance Computing (HPC) community for over two decades. SPADE aims to enhance this legacy by creating methods that can assess and improve performance and accuracy on a wide range of advanced and evolving hardware and software technologies. This endeavor is not just about improving computational science but also about fostering the diversity and education of a new generation of application scientists, engineers, and computer scientists. By providing an understanding of, and the ability to navigate, the intricate details of emerging computing technologies, SPADE contributes directly to the advancement of this field. This will also democratize access to HPC, allowing a more diverse range of researchers and institutions to contribute to scientific discovery. Moreover, as SPADE aims to improve the capabilities of computer simulations, it enhances the ability to tackle a broad range of challenges, from understanding climate change to drug discovery. In essence, beyond advancing the HPC field, SPADE intends to deliver real-world impact by unlocking the full potential of computational science.
The SPADE project focuses on advancing the monitoring, optimization, evaluation, and decision-making capabilities for extreme-scale systems. These capabilities are pivotal for both the High-Performance Computing (HPC) community and the scientific applications community that leverages these systems. With the evolution of HPC resources toward extreme scale, there is an increasing need for integrated performance and accuracy analysis frameworks to understand and mitigate performance and reliability challenges. To meet these needs, SPADE aims to deliver software and application programming interfaces (APIs) that broaden support for heterogeneity and scalability across a diverse range of computing platforms, including emerging vendor technologies. The SPADE project intends to utilize the established PAPI performance monitoring library to address the demands of scientific and machine learning applications effectively. Specifically, SPADE's mission includes: (1) developing monitoring capabilities for innovative and advanced technologies across the hardware stack; (2) designing novel abstractions that encapsulate the internal behavior of software components and facilitate interoperability across the software stack; (3) implementing a new performance and accuracy analysis framework that capitalizes on the efficiency and flexibility of C++'s object-oriented nature; (4) integrating new analysis functionality with various software stack layers and scientific and machine learning applications; and (5) examining new accuracy vs. performance trade-offs introduced with low-precision floating-point types. In essence, SPADE facilitates innovations in cyberinfrastructure development by enabling efficient and comprehensive resource utilization of extreme-scale platforms.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Heike Jagode

Task placement of parallel multi-dimensional FFTs on a mesh communication network
  • DOI:
  • Year published:
    2008
  • Journal:
  • Impact factor:
    0
  • Authors:
    Heike Jagode;J. Hein;A. Trew
  • Corresponding author:
    A. Trew
Counter Inspection Toolkit: Making Sense Out of Hardware Performance Events
  • DOI:
    10.1007/978-3-030-11987-4_2
  • Year published:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Anthony Danalis;Heike Jagode;Hanumantharayappa;Sangamesh Ragate;J. Dongarra
  • Corresponding author:
    J. Dongarra
Power-aware computing: Measurement, control, and performance analysis for Intel Xeon Phi
Fourier Transforms for the BlueGene / L Communication Network
  • DOI:
  • Year published:
    2006
  • Journal:
  • Impact factor:
    0
  • Authors:
    Heike Jagode
  • Corresponding author:
    Heike Jagode
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs
  • DOI:
    10.1109/icpp.2011.71
  • Year published:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. Malony;Scott Biersdorff;S. Shende;Heike Jagode;S. Tomov;Guido Juckeland;Robert Dietrich;D. Poole;Christopher Lamb
  • Corresponding author:
    Christopher Lamb

Other grants by Heike Jagode

SHF: Small: PAPI-V Hardware Performance Monitoring for Virtualized Environments
  • Award number:
    1117058
  • Fiscal year:
    2011
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant

Similar National Natural Science Foundation of China (NSFC) grants

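Research on a cooperative sensing platform based on multivalent framework nucleic acids and CRISPR/Cas, and its application to postoperative monitoring of triple-negative breast cancer
  • Award number:
    22204104
  • Year approved:
    2022
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on multi-tracker framework models and fusion strategies based on high-order regularized semi-supervised learning
  • Award number:
    61571362
  • Year approved:
    2015
  • Amount:
    CNY 570,000
  • Project type:
    General Program
Research on several techniques for hyperspectral remote sensing image classification under the representation model framework
  • Award number:
    61571033
  • Year approved:
    2015
  • Amount:
    CNY 570,000
  • Project type:
    General Program
Research on physical-layer security in multi-tier heterogeneous cellular networks under the stochastic geometry framework
  • Award number:
    61401510
  • Year approved:
    2014
  • Amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund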

Similar overseas grants

Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
  • Award number:
    2411152
  • Fiscal year:
    2024
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant
Collaborative Research: Frameworks: hpcGPT: Enhancing Computing Center User Support with HPC-enriched Generative AI
  • Award number:
    2411297
  • Fiscal year:
    2024
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant
Collaborative Research: Frameworks: hpcGPT: Enhancing Computing Center User Support with HPC-enriched Generative AI
  • Award number:
    2411298
  • Fiscal year:
    2024
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant
Collaborative Research: Scalable Manufacturing of Large-Area Thin Films of Metal-Organic Frameworks for Separations Applications
  • Award number:
    2326714
  • Fiscal year:
    2024
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant
Collaborative Research: AF: Small: Structural Graph Algorithms via General Frameworks
  • Award number:
    2347322
  • Fiscal year:
    2024
  • Amount:
    $2.1027 million
  • Project type:
    Standard Grant