Collaborative Research: CNS Core: Medium: HardLambda: A new FaaS Abstraction for Cross-Stack Resource Management in Disaggregated Datacenters
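Basic information
- Award number: 2106634
- Principal investigator:
- Amount: $420,000
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-06-01 to 2025-05-31
- Status: Ongoing
- Source:
- Keywords:
Project abstract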
Datacenters use computer servers that are no longer able to address the performance and scaling demands of emerging applications such as those in healthcare, smart infrastructure design, and high-speed physics. There is a fundamental mismatch between the capabilities of traditionally designed servers and the dynamic requirements of modern applications. This mismatch leads to poor utilization and significant waste of resources. A new way to design datacenters, called the disaggregated approach, can address this problem by creating a need-based on-demand model for computing. Here, servers are specialized to perform specific functions, and applications use only those specialized servers that best perform the functions needed by each application. While the disaggregated approach improves utilization and makes datacenters easier to manage, it comes at a performance cost: disaggregation requires applications to access critical resources spread across a set of specialized servers over the datacenter network. To mitigate such challenges of resource disaggregation, this project designs HardLambda, a new Function-as-a-Service (FaaS) abstraction that brings the functional and hardware requirements of an application together in a unified fashion. HardLambda enables datacenters to allocate resources in ways that best meet application needs while retaining the resource utilization and management flexibility of disaggregated hardware. The designed algorithms and system software will enable scalable control and sharing of disaggregated resources, and create new approaches to adaptive resource allocation. HardLambda will make disaggregated datacenters a viable and sustainable option for numerous applications in science and industry. The project especially targets machine and deep learning (ML/DL) applications due to their increasingly crucial role in many aspects of modern computing-powered life. 
At the same time, HardLambda will improve the sustainability of large-scale datacenters, where high utilization, efficiency, and continuous adaptation to application requirements are all essential factors. The research will create new knowledge on hardware and software co-designed FaaS systems and services, and yield insights for efficiently supporting ML/DL applications at extremely large scales. The project will engage with partners in industry and national research laboratories to deploy HardLambda in real systems and will undertake educational and broadening participation activities to improve community awareness and understanding of the scaling and sustainability challenges of large-scale computing infrastructure. Special emphasis will be given to engaging students from underrepresented groups in the research and educational activities. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (23)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Tokenized Incentive for Federated Learning.
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Han, Jingoo; Khan, Ahmad Faraz; Zawad, Syed; Anwar, Ali; Angel, Nathalie Baracaldo; Zhou, Yi; Butt, Ali R.
- Corresponding author: Butt, Ali R.
COLTI: Towards Concurrent and Co-located DNN Training and Inference
- DOI: 10.1145/3588195.3595940
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Mobin, Jaiaid; Maurya, Avinash; Rafique, M. Mustafa
- Corresponding author: Rafique, M. Mustafa
AI-driven Storage Resource Provisioning and Operations: Revisiting Old Assumptions and Meeting New Expectations.
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Anantharaj, Valentine; da Silva, Rafael Ferreira; Butt, Ali R.; Oral, Sarp; Tiwari, Devesh
- Corresponding author: Tiwari, Devesh
Translation-optimized Memory Compression for Capacity
- DOI: 10.1109/micro56248.2022.00073
- Publication date: 2022-10
- Journal:
- Impact factor: 0
- Authors: Gagandeep Panwar; Muhammad Laghari; D. Bears; Yuqing Liu; Chandler Jearls; Esha Choukse; K. Cameron; A. Butt; Xun Jian
- Corresponding author: Gagandeep Panwar; Muhammad Laghari; D. Bears; Yuqing Liu; Chandler Jearls; Esha Choukse; K. Cameron; A. Butt; Xun Jian
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Redwan Ibne Seraj Khan; Ahmad Hossein Yazdani; Yuqi Fu; Arnab K. Paul; Bo Ji; Xun Jian; Yue Cheng; A. R. Butt
- Corresponding author: Redwan Ibne Seraj Khan; Ahmad Hossein Yazdani; Yuqi Fu; Arnab K. Paul; Bo Ji; Xun Jian; Yue Cheng; A. R. Butt
Other grants by Ali Butt
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
- Award number: 1919113
- Fiscal year: 2019
- Funding amount: $420,000
- Award type: Standard Grant
Workshop on Data Storage Research Vision
- Award number: 1829096
- Fiscal year: 2018
- Funding amount: $420,000
- Award type: Standard Grant
CSR: Small: Collaborative Research: Scalable Fine-Grained Cloud Monitoring for Empowering IoT
- Award number: 1615411
- Fiscal year: 2016
- Funding amount: $420,000
- Award type: Standard Grant
Student Travel Support for IEEE 23rd International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2015)
- Award number: 1541504
- Fiscal year: 2015
- Funding amount: $420,000
- Award type: Standard Grant
CSR: Medium: Pythia: An Application Analysis and Online Modeling Based Prediction Framework for Scalable Resource Management
- Award number: 1405697
- Fiscal year: 2014
- Funding amount: $420,000
- Award type: Continuing Grant
DC: Small: Collaborative Research: Exploring Energy-Reliability Trade-offs in Data Storage Systems
- Award number: 1016408
- Fiscal year: 2010
- Funding amount: $420,000
- Award type: Standard Grant
Increasing Student Participation in Cluster Computing through IEEE Cluster 2010 Attendance
- Award number: 1049858
- Fiscal year: 2010
- Funding amount: $420,000
- Award type: Standard Grant
CSR: Small: Towards Realizing Cloud HPC: An Adaptive Programming Model for Accelerator-based Clusters
- Award number: 1016793
- Fiscal year: 2010
- Funding amount: $420,000
- Award type: Standard Grant
U.S. - Pakistan International Planning Visit: Economical Computing Substrate for Developing Regions
- Award number: 0940048
- Fiscal year: 2009
- Funding amount: $420,000
- Award type: Standard Grant
CAREER: A Scalable Hierarchical Framework for High-Performance Data Storage
- Award number: 0746832
- Fiscal year: 2008
- Funding amount: $420,000
- Award type: Continuing Grant
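Similar NSFC grants
Role and mechanism of IL-17A in suppressing regulatory T cell function via STAT5-mediated effects on CNS2 region methylation in the pathogenesis of psoriasis
- Award number: 82304006
- Approval year: 2023
- Funding amount: 300,000 yuan
- Award type: Young Scientists Fund
Mechanism by which miR-20a promotes the onset of CNS inflammatory demyelinating diseases by regulating CD4+ T cell pyroptosis
- Award number:
- Approval year: 2022
- Funding amount: 300,000 yuan
- Award type: Young Scientists Fund
Bioengineered microvesicles that interfere with the MAPK pathway to reprogram the CNS microenvironment and initiate glioma response to immune checkpoint inhibitors
- Award number:
- Approval year: 2021
- Funding amount: 300,000 yuan
- Award type: Young Scientists Fund
Effects and mechanism of cystathionine-β-synthase-mediated microglial polarization in glucocorticoid-induced CNS toxicity
- Award number:
- Approval year: 2021
- Funding amount: 300,000 yuan
- Award type: Young Scientists Fund
Oligomeric phosphorylated α-synuclein in plasma CNS-derived exosomes as an indicator of PD disease course
- Award number: 82101506
- Approval year: 2021
- Funding amount: 300,000 yuan
- Award type: Young Scientists Fund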
Similar overseas grants
Collaborative Research: CNS Core: Medium: Reconfigurable Kernel Datapaths with Adaptive Optimizations
- Award number: 2345339
- Fiscal year: 2023
- Funding amount: $420,000
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)
- Award number: 2230945
- Fiscal year: 2023
- Funding amount: $420,000
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CNS Core: Small: Towards Scalable and AI-based Solutions for Beyond-5G Radio Access Networks
- Award number: 2225578
- Fiscal year: 2023
- Funding amount: $420,000
- Award type: Standard Grant
Collaborative Research: CNS Core: Medium: Movement of Computation and Data in Splitkernel-disaggregated, Data-intensive Systems
- Award number: 2406598
- Fiscal year: 2023
- Funding amount: $420,000
- Award type: Continuing Grant
Collaborative Research: CNS Core: Small: SmartSight: an AI-Based Computing Platform to Assist Blind and Visually Impaired People
- Award number: 2418188
- Fiscal year: 2023
- Funding amount: $420,000
- Award type: Standard Grant