FoMR: DeepFetch: Compact Deep Learning based Prefetcher on Configurable Hardware
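Basic Information
- Award number: 1912680
- Principal investigator:
- Amount: $200,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-10-01 to 2022-09-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract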
Fast computer processors, tensor processing units, hardware accelerators, and heterogeneous architectures have enabled large-scale speed-ups in computational power, but memory speeds have not kept pace. Memory performance has therefore become the bottleneck in many applications that rely on heavy memory access. Several emerging memory technologies, such as 3D-stacked dynamic random-access memory (3D-DRAM) and non-volatile memory, attempt to address memory bottlenecks from a hardware perspective, but with a tradeoff among bandwidth, power, latency, and cost. Rather than redesigning existing algorithms to suit a specific memory technology, this project will develop a machine learning-based approach that automatically learns access patterns, which can then be used to prefetch data optimally. Specifically, highly compact long short-term memory (LSTM) models will serve as the centerpiece of the prefetcher for predicting memory accesses. Through novel model compression techniques, hierarchical memory modeling, and dedicated hardware, this project will overcome barriers to fully exploiting machine learning and emerging hardware to improve prefetching. Successful completion of this project will lead to improved memory performance for applications including signal processing, computer vision, and language processing.
A practical LSTM-based prefetcher implementation on hardware must deal with several challenges that this project will address: (i) training a small model (to enable fast inference) on large traces so that it predicts memory accesses with high accuracy across multiple applications; (ii) compressing the model to ensure real-time inference; (iii) retraining the model online, on demand, to learn application-specific models, which requires fast learning from a small amount of data; (iv) making prefetching decisions in real time, based on the model's predictions and uncertainty, about what, when, and where to prefetch, which also requires careful modeling of the target memory hierarchy; (v) deciding in real time, based on the predictions, whether reordering data (dynamic data layout) can improve latency and make future prefetches more effective; (vi) mapping the prediction and decision-making framework onto the limited configurable hardware available, ensuring low-latency training and high-throughput prefetching with small area and power.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
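The abstract describes predicting upcoming memory accesses and prefetching only when the model is sufficiently certain. The following is a minimal sketch of that decision loop, not the project's implementation: a simple frequency table over address deltas stands in for the compact LSTM, and the names `to_deltas`, `DeltaPredictor`, `prefetch_candidate`, and the `threshold` value are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def to_deltas(addresses):
    """Convert an address trace into successive address differences."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


class DeltaPredictor:
    """Frequency table mapping the previous delta to observed next deltas.

    Assumption: this table is only a stand-in for the learned model
    (the abstract names a compact LSTM, which is not reproduced here).
    """

    def __init__(self):
        self.table = defaultdict(Counter)

    def train(self, deltas):
        # Count how often each delta follows the one before it.
        for prev, nxt in zip(deltas, deltas[1:]):
            self.table[prev][nxt] += 1

    def predict(self, prev_delta):
        """Return (most frequent next delta, confidence), or (None, 0.0)."""
        counts = self.table.get(prev_delta)
        if not counts:
            return None, 0.0
        delta, hits = counts.most_common(1)[0]
        return delta, hits / sum(counts.values())


def prefetch_candidate(predictor, last_addr, prev_delta, threshold=0.6):
    """Return an address to prefetch, or None if the prediction is too uncertain."""
    delta, confidence = predictor.predict(prev_delta)
    if delta is None or confidence < threshold:
        return None
    return last_addr + delta


# Usage: a strided trace with a 64-byte step.
trace = [0, 64, 128, 192, 256]
predictor = DeltaPredictor()
predictor.train(to_deltas(trace))
candidate = prefetch_candidate(predictor, last_addr=trace[-1], prev_delta=64)
```

Working on deltas rather than raw addresses keeps the prediction vocabulary small, and the confidence threshold is one simple way to express the "uncertainty" aspect of the decision; a real prefetcher would also have to decide when to issue the request and which level of the memory hierarchy to target.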
Project Outcomes
Journal papers (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
RAOP: Recurrent Neural Network Augmented Offset Prefetcher
- DOI: 10.1145/3422575.3422807
- Publication date: 2020-09
- Journal:
- Impact factor: 0
- Authors: Pengmiao Zhang;Ajitesh Srivastava;Benjamin Brooks;R. Kannan;V. Prasanna
- Corresponding authors: Pengmiao Zhang;Ajitesh Srivastava;Benjamin Brooks;R. Kannan;V. Prasanna
SHARP: Software Hint-Assisted Memory Access Prediction for Graph Analytics
- DOI: 10.1109/hpec55821.2022.9926307
- Publication date: 2022-09
- Journal:
- Impact factor: 0
- Authors: Pengmiao Zhang;R. Kannan;Xiangzhi Tong;Anant V. Nori;V. Prasanna
- Corresponding authors: Pengmiao Zhang;R. Kannan;Xiangzhi Tong;Anant V. Nori;V. Prasanna
ReSemble: reinforced ensemble framework for data prefetching
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Zhang, Pengmiao;Kannan, Rajgopal;Srivastava, Ajitesh;Nori, Anant V.;Prasanna, Viktor K.
- Corresponding author: Prasanna, Viktor K.
MemMAP: Compact and Generalizable Meta-LSTM Models for Memory Access Prediction
- DOI: 10.1007/978-3-030-47436-2_5
- Publication date: 2020-04-17
- Journal:
- Impact factor: 0
- Authors: Srivastava A;Wang TY;Zhang P;De Rose CA;Kannan R;Prasanna VK
- Corresponding author: Prasanna VK
TransforMAP: Transformer for Memory Access Prediction
- DOI:
- Publication date: 2021
- Journal:
- Impact factor: 0
- Authors: Zhang, Pengmiao;Srivastava, Ajitesh;Kannan, Rajgopal;Nori, Anant V.;Prasanna, Viktor K.
- Corresponding author: Prasanna, Viktor K.
Other publications by Viktor Prasanna
Accelerating Deep Neural Network guided MCTS using Adaptive Parallelism
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Yuan Meng;Qian Wang;Tianxin Zu;Viktor Prasanna
- Corresponding author: Viktor Prasanna
Accelerating GNN Training on CPU+Multi-FPGA Heterogeneous Platform
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Yi-Chien Lin;Bingyi Zhang;Viktor Prasanna
- Corresponding author: Viktor Prasanna
PEARL: Enabling Portable, Productive, and High-Performance Deep Reinforcement Learning using Heterogeneous Platforms
- DOI: 10.1145/3649153.3649193
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Yuan Meng;Michael Kinsner;Deshanand Singh;Mahesh Iyer;Viktor Prasanna
- Corresponding author: Viktor Prasanna
Other grants of Viktor Prasanna
IUCRC Phase I University of Southern California: Center for Intelligent Distributed Embedded Applications and Systems (IDEAS)
- Award number: 2231662
- Fiscal year: 2023
- Amount: $200,000
- Project type: Continuing Grant
Elements: Portable Library for Homomorphic Encrypted Machine Learning on FPGA Accelerated Cloud Cyberinfrastructure
- Award number: 2311870
- Fiscal year: 2023
- Amount: $200,000
- Project type: Standard Grant
OAC Core: Scalable Graph ML on Distributed Heterogeneous Systems
- Award number: 2209563
- Fiscal year: 2022
- Amount: $200,000
- Project type: Standard Grant
SaTC: CORE: Small: Accelerating Privacy Preserving Deep Learning for Real-time Secure Applications
- Award number: 2104264
- Fiscal year: 2021
- Amount: $200,000
- Project type: Standard Grant
Collaborative Research:PPoSS:Planning: Streamware - A Scalable Framework for Accelerating Streaming Data Science
- Award number: 2119816
- Fiscal year: 2021
- Amount: $200,000
- Project type: Standard Grant
RAPID: ReCOVER: Accurate Predictions and Resource Allocation for COVID-19 Epidemic Response
- Award number: 2027007
- Fiscal year: 2020
- Amount: $200,000
- Project type: Standard Grant
CNS Core: Small: AccelRITE: Accelerating ReInforcemenT Learning based AI at the Edge Using FPGAs
- Award number: 2009057
- Fiscal year: 2020
- Amount: $200,000
- Project type: Standard Grant
OAC Core: Small: Scalable Graph Analytics on Emerging Cloud Infrastructure
- Award number: 1911229
- Fiscal year: 2019
- Amount: $200,000
- Project type: Standard Grant
CNS: CSR: Small: Exploiting 3D Memory for Energy-Efficient Memory-Driven Computing
- Award number: 1643351
- Fiscal year: 2016
- Amount: $200,000
- Project type: Standard Grant
EAGER: Safer Connected Communities Through Integrated Data-driven Modeling, Learning, and Optimization
- Award number: 1637372
- Fiscal year: 2016
- Amount: $200,000
- Project type: Standard Grant
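Similar grants from the National Natural Science Foundation of China (NSFC)
Data- and knowledge-driven deep feature extraction and constitutive relation modeling for turbulence
- Award number: 12372288
- Approval year: 2023
- Amount: CNY 530,000
- Project type: General Program
Precise extraction of agricultural plastic greenhouses through deep fusion of multimodal remote sensing data
- Award number: 42371400
- Approval year: 2023
- Amount: CNY 480,000
- Project type: General Program
Deep incremental learning for ground feature extraction from heterogeneous remote sensing data
- Award number: 62371373
- Approval year: 2023
- Amount: CNY 500,000
- Project type: General Program
Deep learning-based in situ extraction of membrane protein particles
- Award number: 22307115
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Robust feature extraction from digital pathology images based on deep domain adaptation
- Award number: 82272071
- Approval year: 2022
- Amount: CNY 520,000
- Project type: General Program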