Computation for the Endless Frontier
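Basic information
- Award number: 1818253
- Principal investigator:
- Amount: $60 million
- Host institution:
- Host institution country: United States
- Award type: Cooperative Agreement
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-09-01 to 2025-02-28
- Status: Ongoing
- Source:
- Keywords:
Project abstract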
Computation is critical to our nation's progress in science and engineering. Whether through simulation of phenomena where experiments are costly or impossible, large-scale data analysis to sift the enormous quantities of digital data scientific instruments can produce, or machine learning to find patterns and suggest hypotheses from this vast array of data, computation is the universal tool upon which nearly every field of science and engineering relies to hasten its advance. This project will deploy a powerful new system, called "Frontier", that builds upon a design philosophy and operations approach proven by the success of the Texas Advanced Computing Center (TACC) in delivering leading instruments for computational science. Frontier provides a system of unprecedented scale in the NSF cyberinfrastructure that will yield productive science on day one, while also preparing the research community for the shift to much more capable systems in the future. Frontier is a hybrid system of conventional Central Processing Units (CPUs) and Graphics Processing Units (GPUs), with performance capabilities that significantly exceed prior leadership-class computing investments made by NSF. Importantly, the design of Frontier will support the seamless transition of current NSF leadership-class computing applications to the new system, as well as enable new large-scale data-intensive and machine learning workloads that are expected in the future. Following deployment, the project will operate the system in partnership with ten academic partners.
In addition, the project will begin planning activities in collaboration with leading computational scientists and technologists from around the country, and will leverage strategic public-private partnerships to design a leadership-class computing facility with at least ten times more performance capability for science and engineering research, ensuring the economic competitiveness and prosperity of our nation at large.
TACC, in partnership with Dell EMC and Intel, will deploy Frontier, a hybrid system offering 39 PF (double precision) of Intel Xeon processors, complemented by 11 PF (single precision) of GPU cards for machine learning applications. In addition to 3x the per-node memory of the primary compute nodes of NSF's prior leadership-class computing system, Frontier will have 2x the storage bandwidth in a storage hierarchy that includes 55 PB of usable disk-based storage and 3 PB of "all flash" storage, to enable next-generation data-intensive applications and support the data science community. Frontier will be deployed in TACC's state-of-the-art datacenter, which is configured to supply 30% of the system's power needs from renewable energy. Frontier will support science and engineering in virtually all disciplines through its software environment, which supports application containers, as well as through its partnership with ten academic institutions providing deep computational science expertise in support of users on the system. The project planning effort for a Phase 2 system with at least 10x performance improvement will incorporate a community-driven process that will include leading computational scientists and technologists from around the country and leverage strategic public-private partnerships.
This process will ensure the design of a future NSF leadership-class computing facility that incorporates the most productive near-term technologies and anticipates the most likely future technological capabilities for all of science and engineering requiring leadership-class computational and data-analytics capabilities. Furthermore, the project is expected to develop new expertise and techniques for leadership-class computing and data-driven applications that will benefit future users worldwide through publications, training, and consulting. The project will leverage the team's unique approach to education, outreach, and training activities to encourage, educate, and develop the next generation of leadership-class computational science researchers. The team includes leaders in campus bridging, minority-serving institution (MSI) outreach, and data technologies who will oversee efforts to use Frontier to increase the diversity of groups using leadership-class computing for traditional and data-driven applications.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (12)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
OMB-Py: Python Micro-Benchmarks for Evaluating Performance of MPI Libraries on HPC Systems
- DOI: 10.1109/ipdpsw55747.2022.00143
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Alnaasan, Nawras; Jain, Arpan; Shafi, Aamir; Subramoni, Hari; Panda, Dhabaleswar K
- Corresponding author: Panda, Dhabaleswar K
Designing Hierarchical Multi-HCA Aware Allgather in MPI
- DOI: 10.1145/3547276.3548524
- Published: 2022-08-29
- Journal:
- Impact factor: 0
- Authors: Tu Tran; Benjamin Michalowicz; B. Ramesh; H. Subramoni; A. Shafi; D. Panda
- Corresponding author: D. Panda
Hy-Fi: Hybrid Five-Dimensional Parallel DNN Training on High-Performance GPU Clusters
- DOI: 10.1007/978-3-031-07312-0_6
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Jain, A.; Shafi, A.; Anthony, Q.; Kousha, P.; Subramoni, H.; Panda, DK.
- Corresponding author: Panda, DK.
Highly Efficient Alltoall and Alltoallv Communication Algorithms for GPU Systems
- DOI: 10.1109/ipdpsw55747.2022.00014
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Chen, Chen; Khorassani, Kawthar Shafie; Anthony, Quentin G.; Shafi, Aamir; Subramoni, Hari; Panda, Dhabaleswar K.
- Corresponding author: Panda, Dhabaleswar K.
“Hey CAI” - Conversational AI Enabled User Interface for HPC Tools
- DOI: 10.1007/978-3-031-07312-0_5
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Kousha, P.; Jain, A.; Kolli, A.; Prasanna, S.; Miriyala, S.; Subramoni, H.; Shafi, A.; Panda, DK.
- Corresponding author: Panda, DK.
Other awards by Daniel Stanzione
Final Design Planning for the Leadership-Class Computing Facility
- Award number: 2212090
- Fiscal year: 2022
- Amount: $60 million
- Award type: Cooperative Agreement
Characteristic Science Applications for the Leadership Class Computing Facility
- Award number: 2139536
- Fiscal year: 2021
- Amount: $60 million
- Award type: Cooperative Agreement
Collaborative Research: Chameleon Phase III: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
- Award number: 2027176
- Fiscal year: 2020
- Amount: $60 million
- Award type: Cooperative Agreement
Preliminary Design Planning for the Leadership-Class Computing Facility
- Award number: 2033468
- Fiscal year: 2020
- Amount: $60 million
- Award type: Cooperative Agreement
Planning for the Leadership-Class Computing Facility
- Award number: 1940979
- Fiscal year: 2019
- Amount: $60 million
- Award type: Cooperative Agreement
Operations & Maintenance for the Endless Frontier
- Award number: 1854828
- Fiscal year: 2019
- Amount: $60 million
- Award type: Cooperative Agreement
Planning for the Leadership-Class Computing Facility
- Award number: 1925096
- Fiscal year: 2019
- Amount: $60 million
- Award type: Cooperative Agreement
Collaborative Research: Chameleon: A Large-Scale, Reconfigurable Experimental Environment for Cloud Research
- Award number: 1743354
- Fiscal year: 2017
- Amount: $60 million
- Award type: Cooperative Agreement
Stampede 2: Operations and Maintenance for the Next Generation of Petascale Computing
- Award number: 1663578
- Fiscal year: 2017
- Amount: $60 million
- Award type: Cooperative Agreement
Similar international awards
Digitization PEN: Augmenting the Endless Forms TCN: digitization of imperiled plants with unique morphological adaptations
- Award number: 2001358
- Fiscal year: 2020
- Amount: $60 million
- Award type: Standard Grant
Operations & Maintenance for the Endless Frontier
- Award number: 1854828
- Fiscal year: 2019
- Amount: $60 million
- Award type: Cooperative Agreement
Reprint of Vannevar Bush: Science The Endless Frontier
- Award number: 1941003
- Fiscal year: 2019
- Amount: $60 million
- Award type: Contract Interagency Agreement
Digitization TCN: Collaborative Research: Digitizing "endless forms": Facilitating Research on Imperiled Plants with Extreme Morphologies
- Award number: 1802019
- Fiscal year: 2018
- Amount: $60 million
- Award type: Continuing Grant
Digitization TCN: Collaborative Research: Digitizing "endless forms": Facilitating Research on Imperiled Plants with Extreme Morphologies
- Award number: 1802209
- Fiscal year: 2018
- Amount: $60 million
- Award type: Standard Grant