Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
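Basic information
- Award number: 2328975
- Principal investigator:
- Amount: $99,700
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-01-01 to 2026-12-31
- Status: Ongoing
- Source:
- Keywords:
Project abstract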
In traditional Von Neumann computing systems, a significant bottleneck arises because the data transfer speed to and from the computing units has considerably fallen behind capacity, processing speed, and efficiency. To mitigate this bottleneck by bridging the gap between storage and computation, many innovative storage technologies have been introduced, along with near- and in-memory processing solutions designed for both emerging and traditional memory systems. Nonetheless, a considerable challenge remains: the prototyping and characterization of actual fabricated systems, especially those encompassing both mature technologies and cutting-edge technologies. To overcome this challenge, this project develops a cutting-edge Retunable and Reconfigurable Acceleration Platform (R3AP) based on emerging racetrack memory, leveraging a device-architecture-application co-design approach. The standout features of R3AP include its ability to function as a reconfigurable logic, a processing-in-memory (PIM) accelerator, and a high-density memory storage. It is retunable, meaning it can operate with bit-wise, integer, and floating-point precision, and can simulate analog-like storage and processing. R3AP effectively mitigates data movement inefficiencies while offering domain-specific acceleration and adaptability. With its dense, reliable, energy-efficient, and ultra-low latency computational capability, R3AP has the potential to revolutionize the storage and processing capabilities of future computing systems, such as those in Internet of Things (IoT) and Cyber-Physical Systems (CPS). It can also be applied to high-performance and cloud computing systems. The project's findings are shared through publications, workshops, design contests, tutorials, industrial courses, and technology transfer activities. 
Educational resources and outreach activity plans are made available on the project website, and software artifacts are released on GitHub.
To realize R3AP, the project comprises a series of interrelated research tasks spanning multiple system layers. At the device level, the project integrates the voltage-controlled skyrmion motion mechanism with the industrial-grade 8-inch wafer magnetic tunneling junction stack and demonstrates a fully functional Skyrmion racetrack memory (SRTM), including the formation, shifting, and detection of the skyrmion stream. Additionally, it evaluates the performance of SRTM, focusing on aspects such as write-error-rate, shift-error-rate, read-error-rate, operation speed, and energy consumption. It also addresses and mitigates non-idealities, such as the pinning effect, and goes on to develop and demonstrate CMOS-integrated SRTM. On the architecture and circuit layers, the project involves the creation of a mutable lookup table, compute, and memory unit. This unit performs like multi-context Field-Programmable Gate Array (FPGA) logic, parallel PIM logic, massively parallel accumulators, and analog-like storage and compute structures, leveraging the unique properties of SRTM. This layer ensures high-speed memory access from a hierarchy consisting of banks, subarrays, tiles, etc., and further adds links via configurable switch boxes and a mesh-based network-on-chip to enable data movement operations for PIM that would otherwise be challenging. At the application layer, the project develops novel modeling, analysis, design space exploration, and runtime adjustment techniques to exploit the high degree of reconfigurability provided by R3AP. The goal is to adapt future IoT and CPS applications to changing environments and requirements, optimize resource usage, withstand external disturbances, and enhance overall system performance, resilience, and sustainability.
Across all these layers, the project develops a scalable computer-aided design (CAD) flow. This involves a multi-level intermediate representation-based compilation flow, which can compile high-level description languages such as PyTorch and C/C++ into binaries for the R3AP device. This flow uses a multi-level hierarchy including front-end, middle-end, and back-end compilation of the designs, and abstracts various optimization and management problems to a suitable level for efficient resolution.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Mimi Xie
M2M-Routing: Environmental Adaptive Multi-agent Reinforcement Learning based Multi-hop Routing Policy for Self-Powered IoT Systems
- DOI: 10.23919/date54114.2022.9774779
- Published: 2022-03-14
- Journal:
- Impact factor: 0
- Authors: Wen Zhang; J. Zhang; Mimi Xie; Tao Liu; Wenlu Wang; Chen Pan
- Corresponding author: Chen Pan
Nonvolatile main memory aware garbage collection in high-level language virtual machine
- DOI: 10.1109/emsoft.2015.7318275
- Published: 2015-10-01
- Journal:
- Impact factor: 0
- Authors: Chen Pan; Mimi Xie; Chengmo Yang; Z. Shao; J. Hu
- Corresponding author: J. Hu
Deep Learning Tackles Temporal Predictions on Charging Loads of Electric Vehicles
- DOI: 10.1109/ecce50734.2022.9947901
- Published: 2022-10-09
- Journal:
- Impact factor: 0
- Authors: Eugenia Cadete; Raul Alva; Albert Zhang; Caiwen Ding; Mimi Xie; Sara Ahmed; Yufang Jin
- Corresponding author: Yufang Jin
Maximize energy utilization for ultra-low energy harvesting powered embedded systems
- DOI: 10.1109/rtcsa.2017.8046325
- Published: 2017-08-01
- Journal:
- Impact factor: 0
- Authors: Chen Pan; Mimi Xie; J. Hu
- Corresponding author: J. Hu
Autotile: Autonomous Task-tiling for Deep Inference on Battery-less Embedded System
- DOI: 10.1145/3649476.3658798
- Published: 2024-06-12
- Journal:
- Impact factor: 0
- Authors: Jishnu Banerjee; Sahidul Islam; Wei Wei; Chen Pan; Mimi Xie
- Corresponding author: Mimi Xie
Other grants held by Mimi Xie
SCC-PG: Bridge: An AI-Enabled Platform to Support Coordinated Care for Children with Autism
- Award number: 2306596
- Fiscal year: 2023
- Amount: $99,700
- Award type: Standard Grant
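Similar NSFC (National Natural Science Foundation of China) grants
Mechanism by which IGF-1R regulates HIF-1α to promote Th17 cell differentiation in the pathogenesis of thyroid eye disease
- Award number: 82301258
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Mechanism by which CTCFL regulates IL-10 to suppress bystander activation of CD4+ CTLs and promote resistance to neoadjuvant immunotherapy in oral squamous cell carcinoma
- Award number: 82373325
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program
Mechanism by which mutations in the RNA splicing factor PRPF31 cause human retinitis pigmentosa
- Award number: 82301216
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Role and mechanism of vascular endothelial cells regulating macrophage activation via the E2F1/NF-kB/IL-6 axis in orbital venous malformation
- Award number: 82301257
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Cluster regulation and strengthening mechanisms in aluminum alloy matrices based on multi-element interatomic interactions
- Award number: 52371115
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program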
Similar overseas grants
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328972
- Fiscal year: 2024
- Amount: $99,700
- Award type: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328974
- Fiscal year: 2024
- Amount: $99,700
- Award type: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328973
- Fiscal year: 2024
- Amount: $99,700
- Award type: Continuing Grant