SI2-SSE Collaborative Research: SPIKE - An Implementation of a Recursive Divide-and-Conquer Parallel Strategy for Solving Large Systems of Linear Equations
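Basic information
- Award number: 1147422
- Principal investigator:
- Amount: $240,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2012
- Funding country: United States
- Period: 2012-06-01 to 2016-05-31
- Project status: Completed
- Source:
- Keywords:
Project abstract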
Drs. Negrut, Sameh, and Knepley will investigate, produce, and maintain a methodology and its software implementation that leverage emerging heterogeneous hardware architectures to solve billion-unknown linear systems in a robust, scalable, and efficient fashion. The two classes of problems targeted under this project are banded dense and sparse general linear systems.

This project is motivated by the observation that the task of solving a linear system is one of the most ubiquitous ingredients in the numerical solution of Applied Mathematics problems. It is relied upon for the implicit integration of Ordinary Differential Equation (ODE) and Differential Algebraic Equation (DAE) problems, in the numerical solution of Partial Differential Equation (PDE) problems, in interior point optimization methods, in least squares approximations, in solving eigenvalue problems, and in data analysis. In fact, the vast majority of nonlinear problems in Scientific Computing are solved iteratively by drawing on local linearizations of nonlinear operators and the solution of linear systems. Recent advances in (a) hardware architecture, i.e., the emergence of General Purpose Graphics Processing Unit (GP-GPU) cards, and (b) scalable solution algorithms provide an opportunity to develop a new class of parallel algorithms, called SPIKE, which can robustly and efficiently solve very large linear systems of equations.

Drawing on its divide-and-conquer paradigm, SPIKE builds on several algorithmic primitives: matrix reordering strategies, dense linear algebra operations, sparse direct solvers, and Krylov subspace methods. It provides a scalable solution that can be deployed in a heterogeneous hardware ecosystem and has the potential to solve billion-unknown linear systems in the cloud or on tomorrow's exascale supercomputers. Its high degree of scalability and improved efficiency stem from (i) an optimized memory access pattern owing to an aggressive pre-processing stage that reduces a generic sparse matrix to a banded one through a novel reordering strategy; (ii) good exposure of coarse- and fine-grain parallelism owing to a recursive, divide-and-conquer solution strategy; (iii) efficient vectorization in evaluating the coupling terms in the divide-and-conquer stage owing to a CPU+GPU heterogeneous computing approach; and (iv) algorithmic polymorphism, given that SPIKE can serve either as a direct solver or as an effective preconditioner in an iterative Krylov-type method.

In Engineering, SPIKE will provide the Computer Aided Engineering (CAE) community with a key component, i.e., the fast solution of linear systems, required by the analysis of complex problems through computer simulation. Examples of applications that would benefit from this technology are Structural Mechanics problems (Finite Element Analysis in car crash simulation), Computational Fluid Dynamics problems (solving Navier-Stokes equations in the simulation of turbulent flow around a wing profile), and Computational Multibody Dynamics problems (solving Newton-Euler equations in large granular dynamics problems).

SPIKE will also be interfaced to the Portable, Extensible Toolkit for Scientific Computation (PETSc), a two-decade-old, flexible, and scalable framework for solving Science and Engineering problems on supercomputers. Through PETSc, SPIKE will be made available to a High Performance Computing user community with more than 20,000 members worldwide. PETSc users will be able to run SPIKE without any modifications on vastly different supercomputer architectures such as the IBM BlueGene/P and BlueGene/Q, or the Cray XT5. SPIKE will thus run scalably on the largest machines in the world and will be tuned for very different network and hardware topologies while maintaining a simple code base.

The experience collected and lessons learned in this project will augment a graduate-level class, "High Performance Computing for Engineering Applications," taught at the University of Wisconsin-Madison. A SPIKE tutorial and research outcomes will be presented each year at the International Conference for High Performance Computing, Networking, Storage and Analysis. A one-day High Performance Computing Boot Camp will be organized each year in conjunction with the American Society of Mechanical Engineers (ASME) conference and used to disseminate the software outcomes of this effort. Finally, this project will shape the research agendas of two graduate students working on advanced degrees in Computational Science.
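The divide-and-conquer idea described in the abstract can be illustrated on a very small case. The following is a minimal sketch, not the project's implementation: it assumes a tridiagonal (bandwidth-one) system split into exactly two partitions, uses the Thomas algorithm as a stand-in for the per-partition solver, and omits recursion, reordering, and GPU execution entirely. The function names `thomas` and `spike_solve_two_blocks` are hypothetical and chosen only for this illustration.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Convention (an assumption of this sketch): lower[i] = A[i][i-1] and
    upper[i] = A[i][i+1]; lower[0] and upper[-1] do not affect the result.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def spike_solve_two_blocks(lower, diag, upper, rhs):
    """Two-partition SPIKE-style solve of a tridiagonal system (needs n >= 2)."""
    n = len(diag)
    m = n // 2  # partition 1 holds rows [0, m), partition 2 holds rows [m, n)

    def block_solve(lo, hi, r):
        # Solve with the diagonal block only; the coupling entries
        # upper[m-1] and lower[m] are ignored by thomas().
        return thomas(lower[lo:hi], diag[lo:hi], upper[lo:hi], r)

    # Step 1: independent block solves (these could run in parallel).
    g1 = block_solve(0, m, rhs[:m])
    g2 = block_solve(m, n, rhs[m:])

    # Step 2: the "spikes", i.e. block solves against the coupling entries.
    b = [0.0] * m
    b[-1] = upper[m - 1]
    v = block_solve(0, m, b)
    c = [0.0] * (n - m)
    c[0] = lower[m]
    w = block_solve(m, n, c)

    # Step 3: reduced 2x2 system for the two interface unknowns:
    #   x1_last + v[-1] * x2_first = g1[-1]
    #   w[0] * x1_last + x2_first  = g2[0]
    det = 1.0 - v[-1] * w[0]
    x1_last = (g1[-1] - v[-1] * g2[0]) / det
    x2_first = (g2[0] - w[0] * g1[-1]) / det

    # Step 4: recover the remaining unknowns, again independently per block.
    x1 = [gi - vi * x2_first for gi, vi in zip(g1, v)]
    x2 = [gi - wi * x1_last for gi, wi in zip(g2, w)]
    return x1 + x2


if __name__ == "__main__":
    n = 8
    lower = [0.0] + [-1.0] * (n - 1)
    diag = [4.0] * n
    upper = [-1.0] * (n - 1) + [0.0]
    rhs = [1.0] * n
    x_spike = spike_solve_two_blocks(lower, diag, upper, rhs)
    x_ref = thomas(lower, diag, upper, rhs)
    print(max(abs(a - b) for a, b in zip(x_spike, x_ref)))
```

In this sketch only the small reduced system couples the partitions; the block solves in steps 1, 2, and 4 are independent of each other, which is the source of coarse-grain parallelism. A recursive variant would apply the same splitting inside each partition.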
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ahmed Sameh
A finite element analysis study on different angle correction designs for inclined implants in All-On-Four protocol
- DOI: 10.1186/s12903-024-04091-2
- Published: 2024-03-13
- Journal:
- Impact factor: 2.9
- Authors: Christine Raouf Micheal Ibrahim; Ahmed Sameh; Osama Askar
- Corresponding author: Osama Askar
Efficacy and safety of stem cell transplantation for multiple sclerosis: a systematic review and meta-analysis of randomized controlled trials
- DOI: 10.1038/s41598-024-62726-4
- Published: 2024-05-31
- Journal:
- Impact factor: 4.6
- Authors: A. A. Nawar; Aml Mostafa Farid; Rim Wally; Engy K. Tharwat; Ahmed Sameh; Yomna Elkaramany; M. M. Asla; W. Kamel
- Corresponding author: W. Kamel
International Journal of Video & Image Processing and Network Security Calibrating Camera Shake Photographs Using Parallel De-convolution
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Ahmed Sameh; Nazzly El Shazzly
- Corresponding author: Nazzly El Shazzly
Brain Decoding using EEG Signals: Detection for Human Daily Activities
- DOI: 10.1109/miucc58832.2023.10278316
- Published: 2023-09-27
- Journal:
- Impact factor: 0
- Authors: Ahmed Sameh; Helmy Magdy; Mario Shady; Hady Wael; Shereen Essam Elbohy
- Corresponding author: Shereen Essam Elbohy
Other grants by Ahmed Sameh
CSR: Large: Collaborative Research: Kali: A System for Sequential Programming of Multicore Processors
- Award number: 1111691
- Fiscal year: 2011
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: Developing A Robust Parallel Hybrid System Solver
- Award number: 0635169
- Fiscal year: 2006
- Amount: $240,000
- Project type: Standard Grant
ITR/AP: Collaborative Research: Model Reduction of Dynamical Systems for Real-time Control
- Award number: 0325227
- Fiscal year: 2003
- Amount: $240,000
- Project type: Continuing Grant
Efficient Algorithms for Large-Scale Dynamical Systems
- Award number: 9912388
- Fiscal year: 2000
- Amount: $240,000
- Project type: Continuing Grant
Innovative Algorithms and Techniques for Large Scale Simulations
- Award number: 9972533
- Fiscal year: 1999
- Amount: $240,000
- Project type: Standard Grant
MRI: Acquisition of a Computational Environment for Scientific Computing
- Award number: 9871053
- Fiscal year: 1998
- Amount: $240,000
- Project type: Standard Grant
CISE PostDoc: Computational Methods in VLSI Design
- Award number: 9805743
- Fiscal year: 1998
- Amount: $240,000
- Project type: Standard Grant
High Performance Computing for Large Dynamical Systems
- Award number: 9619763
- Fiscal year: 1997
- Amount: $240,000
- Project type: Standard Grant
Acquisition of a Workstation Cluster for Research in High-Performance Computing
- Award number: 9414015
- Fiscal year: 1994
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: Hierarchically Parallel Algorithms for Portable and Scalable Performance
- Award number: 9396332
- Fiscal year: 1993
- Amount: $240,000
- Project type: Continuing Grant
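Similar grants from the National Natural Science Foundation of China
Mechanism by which the Streptococcus pyogenes secreted esterase Sse inhibits LC3-associated phagocytosis to promote invasion
- Award number:
- Year approved: 2022
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
Structural simulation of the interfacial transition layer and elimination of defect states at the Cu2ZnSn(SSe)4/CdS interface in solar cells
- Award number:
- Year approved: 2022
- Amount: 550,000 CNY
- Project type: General Program
First-principles study of doping to achieve stable weak n-type character in the surface layer of the Cu2ZnSn(SSe)4 absorber
- Award number:
- Year approved: 2020
- Amount: 240,000 CNY
- Project type: Young Scientists Fund
Research on an SSE-based evaluation index system for information security assurance in aviation information systems
- Award number: 60776808
- Year approved: 2007
- Amount: 190,000 CNY
- Project type: Joint Fund Program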
Similar overseas grants
Collaborative Research: SI2: SSE: Extending the Physics Reach of LHCb in Run 3 Using Machine Learning in the Real-Time Data Ingestion and Reduction System
- Award number: 1739772
- Fiscal year: 2017
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: NSCI: SI2-SSE: Time Stepping and Exchange-Correlation Modules for Massively Parallel Real-Time Time-Dependent DFT
- Award number: 1740219
- Fiscal year: 2017
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: SI2-SSE: An open source multi-physics platform to advance fundamental understanding of plasma physics and enable impactful application of plasma systems
- Award number: 1740300
- Fiscal year: 2017
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: SI2:SSE: Extending the Physics Reach of LHCb in Run 3 Using Machine Learning in the Real-Time Data Ingestion and Reduction System
- Award number: 1740102
- Fiscal year: 2017
- Amount: $240,000
- Project type: Standard Grant
Collaborative Research: NSCI: SI2-SSE: Time Stepping and Exchange-Correlation Modules for Massively Parallel Real-Time Time-Dependent DFT
- Award number: 1740204
- Fiscal year: 2017
- Amount: $240,000
- Project type: Standard Grant