Collaborative Research: SI2-SSI: EVOLVE: Enhancing the Open MPI Software for Next Generation Architectures and Applications
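Basic Information
- Award number: 1663887
- Principal investigator:
- Amount: $308,800
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-06-01 to 2022-05-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract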
For nearly two decades, the Message Passing Interface (MPI) has been an essential part of the High-Performance Computing ecosystem and consequently a key enabler for important scientific breakthroughs. It is a fundamental building block for most large-scale simulations in physics, chemistry, biology, materials science, and engineering. Open MPI is an open source implementation of the MPI specification, widely used and adopted by the research community as well as industry. The Open MPI library is jointly developed and maintained by a consortium of academic institutions, national labs and industrial partners. It is installed on virtually all large-scale computer systems in the US as well as in the rest of the world. The goal of this project is to enhance and modernize the Open MPI library in the context of the ongoing evolution of modern computer systems, and to ensure its future operability on all upcoming architectures. We aim to implement fundamental software techniques that can be used in many-core systems to execute MPI-based parallel applications more efficiently, and to tolerate process and memory failures at all scales, from current systems up to the extreme scales expected before the end of the decade.

Open MPI is an open source implementation of the Message Passing Interface (MPI) specification. The MPI API is currently being extended to consider the needs of application developers in terms of efficiency, productivity and resilience. The project will also support academic involvement in the design, development and evaluation of the Open MPI software, and ensure academic presence in the MPI Forum. The goal of this proposal is to enhance the Open MPI software library, focusing on two aspects:

(1) Extend Open MPI to support new features of the MPI specification. Open MPI will continue to support all new features of current and upcoming MPI specifications. The two most significant areas within the context of this proposal are (a) extensions to better support hybrid programming models and (b) support for fault tolerance in MPI applications. To improve support for hybrid programming models, the MPI Forum is currently considering introducing the notion of MPI Endpoints, which could be used by different threads of an MPI rank to instantiate multiple separate communication contexts. The goal within this project is to develop an implementation of endpoints that supports effective hybrid programming models, and to extend the concept to other aspects of parallel applications such as File I/O operations. One of the project partners (UTK) leads the current proposal in the MPI Forum to expose failures and ensure the continuation of the execution of MPI applications. In the context of this SSI proposal, the goal is to harden, improve, and expand the support of the existing ULFM (User-Level Failure Mitigation) implementation in Open MPI and thus enable end-users to design application-specific resilience approaches for future platforms.

(2) Enhance the Open MPI core to support new architectures and improve scalability. While Open MPI has demonstrated very good scalability in the past, there is significant work to be done to ensure similarly good performance on future architectures. Specifically, we propose a groundbreaking rework of the startup environment that will improve process launch scalability, increase support for asynchronous progress of operations, enable support for accelerators, and reduce sensitivity to system noise. The project would also enhance the support for File I/O operations as part of the Open MPI package by expanding our work on highly scalable collective I/O operations through delegation and exploring the utilization of burst buffers as temporary storage.
Project Outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
On Overlapping Communication and File I/O in Collective Write Operation
- DOI: 10.1109/ipdpsw50202.2020.00175
- Publication date: 2020-05
- Journal:
- Impact factor: 0
- Authors: Raafat Feki; E. Gabriel
- Corresponding authors: Raafat Feki; E. Gabriel
Parallel I/O on Compressed Data Files: Semantics, Algorithms, and Performance Evaluation
- DOI: 10.1109/ccgrid49817.2020.00-74
- Publication date: 2020-05
- Journal:
- Impact factor: 0
- Authors: S. Singh; E. Gabriel
- Corresponding authors: S. Singh; E. Gabriel
Other grants by Edgar Gabriel
SI2-SSE: Collaborative Research: ADAPT: Next Generation Message Passing Interface (MPI) Library - Open MPI
- Award number: 1339763
- Fiscal year: 2013
- Funding amount: $308,800
- Project category: Standard Grant
SI2-SSI: Collaborative Research: A Glass Box Approach to Enabling Open, Deep Interactions in the HPC Toolchain
- Award number: 1148052
- Fiscal year: 2012
- Funding amount: $308,800
- Project category: Standard Grant
II-NEW: A Heterogeneous Testbed for Exploring Emerging HPC Tools, Programming Languages, and Applications
- Award number: 0958464
- Fiscal year: 2010
- Funding amount: $308,800
- Project category: Continuing Grant
CAREER: Dynamic Run-Time Optimization of Parallel, Adaptive and Hybrid Applications
- Award number: 0846002
- Fiscal year: 2009
- Funding amount: $308,800
- Project category: Continuing Grant
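Similar NSFC (National Natural Science Foundation of China) grants
Research on highly integrated microwave/millimeter-wave antennas supporting two-dimensional millimeter-wave beam scanning
- Award number: 62371263
- Approval year: 2023
- Funding amount: 520,000 CNY
- Project category: General Program
Study of tandem Heck/denitrogenative rearrangement reactions of hydrazones
- Award number: 22301211
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Study of synergistic performance regulation and dendrite suppression mechanisms in aqueous zinc-ion batteries
- Award number: 52364038
- Approval year: 2023
- Funding amount: 330,000 CNY
- Project category: Regional Science Fund
Investigating the pathogenic role and mechanism of TSPYL1 mutations in sudden infant death syndrome using a human serotonergic neuron reporter system
- Award number: 82371176
- Approval year: 2023
- Funding amount: 490,000 CNY
- Project category: General Program
Mechanistic study of trophoblast senescence induced by FOXO3 m6A methylation in the kidney-tonifying treatment of spontaneous abortion
- Award number: 82305286
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund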
Similar overseas grants
Collaborative Research: SI2-SSI: Expanding Volunteer Computing
- Award number: 2039142
- Fiscal year: 2020
- Funding amount: $308,800
- Project category: Standard Grant
SI2-SSI: Collaborative Research: Einstein Toolkit Community Integration and Data Exploration
- Award number: 2114580
- Fiscal year: 2020
- Funding amount: $308,800
- Project category: Continuing Grant
Collaborative Research: SI2-SSI: Expanding Volunteer Computing
- Award number: 2001752
- Fiscal year: 2019
- Funding amount: $308,800
- Project category: Standard Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743178
- Fiscal year: 2018
- Funding amount: $308,800
- Project category: Continuing Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743185
- Fiscal year: 2018
- Funding amount: $308,800
- Project category: Continuing Grant