SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications

Project abstract

New computing applications are emerging in smart networks, scientific exploration, business management, security, and healthcare. These applications depend on very large amounts of data, which must be processed quickly and efficiently. Large supercomputers are increasingly used to analyze such data, with techniques referred to as deep learning (DL) high-performance computing (HPC). Researchers use DL HPC to make sense of this flood of data and extract useful information, but doing so requires redesigning HPC systems. A key challenge is how to use resources such as data storage and computer memory at a huge scale. This project will build Metis, a high-performance data storage system that uses a new, end-to-end, hardware-supported memory and storage design to meet the needs of DL HPC applications. The goal is to meet the challenge of increasing data management performance for next-generation supercomputers. The project will connect several different computing communities and increase interactions among them. It includes educational and engagement activities that will increase the community's understanding of HPC systems, including broadening-participation activities to attract and retain new students, with special emphasis on students from underrepresented groups. The project will encourage student interest in the design of, and research on, large-scale computing systems.

This project brings together researchers in micro-architecture, distributed computing systems (namely cloud and HPC systems), storage systems, and power/energy modeling to boost DL HPC data processing performance. The research will yield a fundamentally new software-hardware co-designed memory compression technique that transparently compresses DL application memory with negligible runtime performance overhead. Metis will leverage this compression substrate to enable a distributed, intelligent, operating-system-level data cache that exploits the physical memory freed by program-memory compression. The developed techniques will open the door to innovative HPC and scientific applications, in a broad range of disciplines, that have not previously been possible. Metis' focus on the challenges of increasing performance in the Exascale era, along with its engagement of researchers from multiple areas, aligns it well with the goals and objectives of the SPX program. The research will also create new knowledge on design principles of memory compression and yield insights for seamlessly integrating DL applications into next-generation, DL-aware supercomputer infrastructure.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (31)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Towards a Serverless Bioinformatics Cyberinfrastructure Pipeline
Application-Attuned Memory Management for Containerized HPC Workflows
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Moiz Arif; Avinash Maurya; M. Mustafa Rafique; Dimitrios S. Nikolopoulos; A. R. Butt
  • Corresponding authors:
    Moiz Arif; Avinash Maurya; M. Mustafa Rafique; Dimitrios S. Nikolopoulos; A. R. Butt
In Search of a Fast and Efficient Serverless DAG Engine
Heterogeneity-Aware Adaptive Federated Learning Scheduling
  • DOI:
    10.1109/bigdata55660.2022.10020721
  • Publication date:
    2022-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jingoo Han; Ahmad Faraz Khan; Syed Zawad; A. Anwar; Nathalie Baracaldo Angel; Yi Zhou; Feng Yan; A. Butt
  • Corresponding authors:
    Jingoo Han; Ahmad Faraz Khan; Syed Zawad; A. Anwar; Nathalie Baracaldo Angel; Yi Zhou; Feng Yan; A. Butt
Towards Efficient Python Interpreter for Tiered Memory Systems

Other grants by Ali Butt

Collaborative Research: CNS Core: Medium:HardLambda: A new FaaS Abstraction for Cross-Stack Resource Management in Disaggregated Datacenters
  • Award number:
    2106634
  • Fiscal year:
    2021
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
Workshop on Data Storage Research Vision
  • Award number:
    1829096
  • Fiscal year:
    2018
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
CSR: Small: Collaborative Research: Scalable Fine-Grained Cloud Monitoring for Empowering IoT
  • Award number:
    1615411
  • Fiscal year:
    2016
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
Student Travel Support for IEEE 23rd International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2015)
  • Award number:
    1541504
  • Fiscal year:
    2015
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
CSR: Medium: Pythia: An Application Analysis and Online Modeling Based Prediction Framework for Scalable Resource Management
  • Award number:
    1405697
  • Fiscal year:
    2014
  • Award amount:
    $952,900
  • Award type:
    Continuing Grant
DC: Small: Collaborative Research: Exploring Energy-Reliability Trade-offs in Data Storage Systems
  • Award number:
    1016408
  • Fiscal year:
    2010
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
Increasing Student Participation in Cluster Computing through IEEE Cluster 2010 Attendance
  • Award number:
    1049858
  • Fiscal year:
    2010
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
CSR: Small: Towards Realizing Cloud HPC: An Adaptive Programming Model for Accelerator-based Clusters
  • Award number:
    1016793
  • Fiscal year:
    2010
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
U.S. - Pakistan International Planning Visit: Economical Computing Substrate for Developing Regions
  • Award number:
    0940048
  • Fiscal year:
    2009
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
CAREER: A Scalable Hierarchical Framework for High-Performance Data Storage
  • Award number:
    0746832
  • Fiscal year:
    2008
  • Award amount:
    $952,900
  • Award type:
    Continuing Grant

Similar National Natural Science Foundation of China (NSFC) grants

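Research on Dynamic Coupling of Inter-Organizational Collaboration in Engineering Projects Based on the Heterogeneity of Transacting Parties
  • Award number:
    72301024
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund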
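Research on the Reliability of Cooperative NOMA Systems for 5G Ultra-High-Definition Mobile Video Transmission
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund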
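Research on Key Technologies for Guaranteeing Information Dissemination Timeliness in Cooperative-Perception Internet of Vehicles
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund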
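Research on the Reliability Mechanism and Optimization Methods of Data- and Physics-Driven Shop-Floor Manufacturing Service Collaboration
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund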
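Research on Institutional Innovation in Strategic Purchasing by Medical Insurance Funds to Promote Value Co-creation in Telemedicine Collaboration Networks
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 450,000
  • Award type:
    General Program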

Similar overseas grants

SPX: Collaborative Research: Automated Synthesis of Extreme-Scale Computing Systems Using Non-Volatile Memory
  • Award number:
    2408925
  • Fiscal year:
    2023
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
SPX: Collaborative Research: Scalable Neural Network Paradigms to Address Variability in Emerging Device based Platforms for Large Scale Neuromorphic Computing
  • Award number:
    2401544
  • Fiscal year:
    2023
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
SPX: Collaborative Research: Intelligent Communication Fabrics to Facilitate Extreme Scale Computing
  • Award number:
    2412182
  • Fiscal year:
    2023
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
  • Award number:
    2318628
  • Fiscal year:
    2022
  • Award amount:
    $952,900
  • Award type:
    Standard Grant
SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
  • Award number:
    2333009
  • Fiscal year:
    2022
  • Award amount:
    $952,900
  • Award type:
    Standard Grant