SHF: Large: Collaborative Research: Next Generation Communication Mechanisms exploiting Heterogeneity, Hierarchy and Concurrency for Emerging HPC Systems
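Basic Information
- Award number: 1565414
- Principal investigator:
- Amount: $1.1719 million
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Period: 2016-08-15 to 2020-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract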
This award was partially supported by the CIF21 Software Reuse Venture, whose goals are to support pathways towards sustainable software elements through their reuse, and to emphasize the critical role of reusable software elements in a sustainable software cyberinfrastructure to support computational and data-enabled science and engineering.
Parallel programming based on MPI (Message Passing Interface) is being used with increased frequency in academia and government (defense and non-defense uses), as well as in emerging uses in scalable machine learning and big data analytics. The emergence of Dense Many-Core (DMC) architectures like Intel's Knights Landing (KNL) and accelerator/co-processor architectures like NVIDIA GPGPUs is enabling the design of systems with high compute density. This, coupled with the availability of Remote Direct Memory Access (RDMA)-enabled commodity networking technologies like InfiniBand, RoCE, and 10/40GigE with iWARP, is fueling the growth of multi-petaflop and exaflop systems. These DMC architectures have the following unique characteristics: deeper levels of hierarchical memory; revolutionary network interconnects; and heterogeneous compute power and data movement costs (with heterogeneity at chip level and node level). For these emerging systems, a combination of MPI and other programming models, known as MPI+X (where X can be PGAS, Tasks, OpenMP, OpenACC, or CUDA), is being targeted. The current-generation communication protocols and mechanisms for MPI+X programming models cannot efficiently support the emerging DMC architectures. This leads to the following broad challenges: 1) How can high-performance and scalable communication mechanisms for next-generation DMC architectures be designed to support MPI+X (including Task-based) programming models? and 2) How can the current and next-generation applications be designed/co-designed with the proposed communication mechanisms?
A synergistic and comprehensive research plan, involving computer scientists from The Ohio State University (OSU) and Ohio Supercomputer Center (OSC) and computational scientists from the Texas Advanced Computing Center (TACC), San Diego Supercomputer Center (SDSC) and University of California San Diego (UCSD), is proposed to address the above broad challenges with innovative solutions. The research will be driven by a set of applications from established NSF computational science researchers running large-scale simulations on Stampede and Comet and other systems at OSC and OSU. The proposed designs will be integrated into the widely used MVAPICH2 library and made available for public use. Multiple graduate and undergraduate students will be trained under this project as future scientists and engineers in HPC. The established national-scale training and outreach programs at TACC, SDSC and OSC will be used to disseminate the results of this research to XSEDE users. Tutorials will be organized at XSEDE, SC and other conferences to share the research results and experience with the community.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Dhabaleswar Panda
CSR: Small: CONCERT: Designing Scalable Communication Runtimes with On-the-fly Compression for HPC and AI Applications on Heterogeneous Architectures
- Award number: 2312927
- Fiscal year: 2023
- Award amount: $1.1719 million
- Project category: Standard Grant
Travel: Student Travel Support for MVAPICH User Group (MUG) 2023 Conference
- Award number: 2331223
- Fiscal year: 2023
- Award amount: $1.1719 million
- Project category: Standard Grant
Collaborative Research: Frameworks: Performance Engineering Scientific Applications with MVAPICH and TAU using Emerging Communication Primitives
- Award number: 2311830
- Fiscal year: 2023
- Award amount: $1.1719 million
- Project category: Standard Grant
Travel: Student Travel Support for MVAPICH User group (MUG) 2022 Conference
- Award number: 2231825
- Fiscal year: 2022
- Award amount: $1.1719 million
- Project category: Standard Grant
AI Institute for Intelligent CyberInfrastructure with Computational Learning in the Environment (ICICLE)
- Award number: 2112606
- Fiscal year: 2021
- Award amount: $1.1719 million
- Project category: Cooperative Agreement
MRI: RADiCAL: Reconfigurable Major Research Cyberinfrastructure for Advanced Computational Data Analytics and Machine Learning
- Award number: 2018627
- Fiscal year: 2020
- Award amount: $1.1719 million
- Project category: Standard Grant
OAC Core: Small: Next-Generation Communication and I/O Middleware for HPC and Deep Learning with Smart NICs
- Award number: 2007991
- Fiscal year: 2020
- Award amount: $1.1719 million
- Project category: Standard Grant
Student Travel Support for MVAPICH User Group (MUG) Meeting
- Award number: 1930003
- Fiscal year: 2019
- Award amount: $1.1719 million
- Project category: Standard Grant
Collaborative Research: Frameworks: Designing Next-Generation MPI Libraries for Emerging Dense GPU Systems
- Award number: 1931537
- Fiscal year: 2019
- Award amount: $1.1719 million
- Project category: Standard Grant
Student Travel Support for MVAPICH User Group (MUG) Meeting
- Award number: 1839739
- Fiscal year: 2018
- Award amount: $1.1719 million
- Project category: Standard Grant
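Similar grants from the National Natural Science Foundation of China (NSFC)
Formation mechanisms and regional effects of cross-boundary cooperation networks among development zones: the case of three major urban agglomerations
- Award number: 42301183
- Year approved: 2023
- Award amount: CNY 300,000
- Project category: Young Scientists Fund
Predicting individual ERP waveforms from the variability of large-scale time-varying fMRI networks
- Award number: 82372084
- Year approved: 2023
- Award amount: CNY 480,000
- Project category: General Program
Radical activation and induction mechanisms of macrocyclic supramolecules acting on organic pollutants and their degradation intermediates
- Award number: 52370168
- Year approved: 2023
- Award amount: CNY 500,000
- Project category: General Program
Mechanism by which early intervention with Didang Decoction inhibits adventitial vasa vasorum neovascularization, reduces vascular calcification, and delays the onset of macrovascular disease in type 2 diabetes
- Award number: 82374247
- Year approved: 2023
- Award amount: CNY 490,000
- Project category: General Program
Constructing large-gap two-dimensional topological insulators using the substrate orbital filtering effect
- Award number: 12304199
- Year approved: 2023
- Award amount: CNY 300,000
- Project category: Young Scientists Fund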
Similar overseas grants
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402804
- Fiscal year: 2024
- Award amount: $1.1719 million
- Project category: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402806
- Fiscal year: 2024
- Award amount: $1.1719 million
- Project category: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402805
- Fiscal year: 2024
- Award amount: $1.1719 million
- Project category: Standard Grant
SHF: Large: Collaborative Research: Molecular computing for the real world
- Award number: 1832985
- Fiscal year: 2018
- Award amount: $1.1719 million
- Project category: Continuing Grant
SHF: Large: Collaborative Research: Next Generation Communication Mechanisms exploiting Heterogeneity, Hierarchy and Concurrency for Emerging HPC Systems
- Award number: 1565336
- Fiscal year: 2016
- Award amount: $1.1719 million
- Project category: Standard Grant