SI2-SSE: Fast Dynamic Load Balancing Tools for Extreme Scale Systems

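Basic Information

  • Award number:
    1533581
  • Principal investigator:
  • Amount:
    $500K
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Project period:
    2015-10-01 to 2020-09-30
  • Project status:
    Completed

Abstract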

Massively parallel computing combined with scalable simulation workflows that can reliably model systems of interest are central to the continued quest of scientists, engineers, and other practitioners to address advances in scientific discovery, engineering design, and medical treatment. However, to meet their potential, these methods must be able to operate efficiently and scale on massively parallel computers executing millions of processes. Reaching the goal of millions of parallel processes requires new methods in which the computational workload is extremely well balanced and interprocessor communications overheads are minimized. Attaining such parallel performance is greatly complicated in realistic simulation workflows where the models and their discrete computer representation must evolve to ensure simulation reliability, or to account for changing input streams. To address the need to obtain workload balance with controlled communications, various algorithms and associated software, referred to as load balancing procedures, have been, and continue to be, developed. To be effective in the execution of simulation workflows in which the workload evolves, the load balancing procedures must be applied dynamically at multiple points in the simulation. Current load balancing techniques demonstrate two deficiencies when applied as dynamic load balancing procedures at very large numbers of compute cores (e.g., greater than 100,000 cores): They become a major fraction of the total parallel computation (in some cases never finishing within an allocation) and they do not maintain good load balance for simulation steps that must balance based on multiple criteria. 
Building on initial efforts to improve dynamic load balancing methods for adaptive unstructured mesh applications, the proposed research aims to develop fast multicriteria dynamic load balancing methods that are capable of quickly producing well-balanced computations, with well-controlled communications, for a wide variety of applications. An important characteristic of the dynamic load balancing procedures to be developed is the generalization of the graph to account for multiple types of computational entities and interactions. The initial ideas for supporting multiple entity types came from considering the balancing of finite element calculations that must account for multiple orders of mesh entities. These concepts will be refined and generalized to support multiple application areas. An additional development will be fast hybrid dynamic load balancing methods that are combinations of "geometric", standard graph, and multicriteria graph methods in which the individual methods can be executed globally or at a more local level (such as at the node level). The dynamic load balancing methods to be developed will be demonstrated on three applications in which the workload, and its distribution, changes as the simulation proceeds. The applications will be adaptive mesh simulations, adaptive multiscale modeling, and massive scale-free graphs. These applications will be carried out on available massively parallel computers, where examples on 1 million cores will be demonstrated. A goal of the dynamic load balancing methods to be developed will be to attain scalability, and to do so with controlled data movement such that the wall clock time and energy used are substantially less than those required for a non-adaptive calculation of equivalent accuracy. The software produced by this project will be made available as open source components.
These developments coupled with efforts to support users in applying them in the development of new simulation tools will impact many research communities. Based on past and present efforts, the PIs fully expect that technologies developed in this project will also be integrated into future industrial software systems.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Publications by Mark Shephard

Making Democracy Work by Early Formal Engagement? A Comparative Exploration of Youth Parliaments in the EU
  • DOI:
  • Year published:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mark Shephard;Stratos Patrikios
  • Corresponding author:
    Stratos Patrikios
Multiple Audiences, Multiple Messages? An Exploration of the Dynamics between the Party, the Candidates and the Various Constituencies
  • DOI:
  • Year published:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mark Shephard
  • Corresponding author:
    Mark Shephard
Facing the Voters: The Potential Impact of Ballot Paper Photographs in British Elections
  • DOI:
    10.1111/j.1467-9248.2010.00874.x
  • Year published:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    R. Johns;Mark Shephard
  • Corresponding author:
    Mark Shephard
A Face for Radio? How Viewers and Listeners Reacted Differently to the Third Leaders' Debate in 2010
Defending the Rights of the Poor: Framing Policy or Delivering the Goods? Conservative and Liberals Versus Labour
  • DOI:
    10.2139/ssrn.1915043
  • Year published:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mark Shephard
  • Corresponding author:
    Mark Shephard

Other Grants of Mark Shephard

Collaborative Research: Frameworks: A Software Ecosystem for Plasma Science and Space Weather Applications
  • Award number:
    2209472
  • Fiscal year:
    2022
  • Amount:
    $500K
  • Project type:
    Standard Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
  • Award number:
    1743185
  • Fiscal year:
    2018
  • Amount:
    $500K
  • Project type:
    Continuing Grant
PFI-BIC: Partnership for Interoperable Components for Parallel Engineering Simulations
  • Award number:
    1237555
  • Fiscal year:
    2012
  • Amount:
    $500K
  • Project type:
    Standard Grant
Adaptive Multimodel Simulation for Engineering Innovation
  • Award number:
    1068419
  • Fiscal year:
    2011
  • Amount:
    $500K
  • Project type:
    Standard Grant
Petascale Computational Fluid Dynamics
  • Award number:
    0749152
  • Fiscal year:
    2007
  • Amount:
    $500K
  • Project type:
    Continuing Grant
MRI: Acquisition of Infrastructure for Research in Grid Computing and Multiscale Systems Computation
  • Award number:
    0420703
  • Fiscal year:
    2004
  • Amount:
    $500K
  • Project type:
    Standard Grant
Multiscale Systems Engineering Research Center (MSERC)
  • Award number:
    0310596
  • Fiscal year:
    2003
  • Amount:
    $500K
  • Project type:
    Continuing Grant
Postdoc: Parallel Adaptive Partition of Unity Methods
  • Award number:
    9704696
  • Fiscal year:
    1997
  • Amount:
    $500K
  • Project type:
    Standard Grant
Academic Research Infrastructure: Development of a Distributed High-Performance Computing Environment for Research in Science and Engineering
  • Award number:
    9601797
  • Fiscal year:
    1996
  • Amount:
    $500K
  • Project type:
    Standard Grant
Engineering Research Equipment Grant: Workstations for Advanced Research in Computer Intergrated Engineering
  • Award number:
    8713805
  • Fiscal year:
    1987
  • Amount:
    $500K
  • Project type:
    Standard Grant

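Similar NSFC Grants

Mechanism by which the Streptococcus pyogenes secreted esterase Sse inhibits LC3-associated phagocytosis to promote invasion
  • Award number:
  • Approval year:
    2022
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Structural simulation of the interfacial transition layer and elimination of defect states at the Cu2ZnSn(SSe)4/CdS interface in solar cells
  • Award number:
  • Approval year:
    2022
  • Amount:
    CNY 550,000
  • Project type:
    General Program
First-principles study of doping to achieve stable weak n-type character in the surface layer of Cu2ZnSn(SSe)4 absorbers
  • Award number:
  • Approval year:
    2020
  • Amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund
Research on an SSE-based evaluation index system for information security assurance of aviation information systems
  • Award number:
    60776808
  • Approval year:
    2007
  • Amount:
    CNY 190,000
  • Project type:
    Joint Fund Program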

Similar Overseas Grants

Establishing a method for detecting unknown SSEs from InSAR time series by combining anomaly detection with atmospheric noise correction
  • Award number:
    24K07168
  • Fiscal year:
    2024
  • Amount:
    $500K
  • Project type:
    Grant-in-Aid for Scientific Research (C)
A study on vibration theory for defect detection by acoustic excitation using SSE analysis
  • Award number:
    23K03995
  • Fiscal year:
    2023
  • Amount:
    $500K
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Revealing spatiotemporal slow slip evolution at higher temporal resolution by kinematic GNSS
  • Award number:
    21K14007
  • Fiscal year:
    2022
  • Amount:
    $500K
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Study on defect detection by spatial spectral entropy (SSE) and healthy part evaluation for noncontact acoustic inspection
  • Award number:
    19K04414
  • Fiscal year:
    2019
  • Amount:
    $500K
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Numerical simulations of earthquake and SSE triggering by dynamic stress changes
  • Award number:
    18K03775
  • Fiscal year:
    2018
  • Amount:
    $500K
  • Project type:
    Grant-in-Aid for Scientific Research (C)