SHF: Small: Collaborative Research: Accelerated Data Transformation: A Software-Hardware Stack for Transducers
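Basic information
- Award number: 1907863
- Principal investigator:
- Amount: $258,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-10-01 to 2024-09-30
- Project status: Completed
- Source:
- Keywords: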
Project abstract
Recent years have seen an explosive rise of "big data" and data-intensive computing. Many scientific and data analytics applications that operate on large data sets perform data transformation at their core. For example, many genomics applications translate DNA sequences into protein sequences and must perform this transformation on large volumes of data (petabytes) generated by DNA sequencers. Recent studies have shown that popular data analytics systems spend a significant amount of time performing data transformation operations such as data compression, decompression, serialization, deserialization and error correction. While application-specific hardware accelerators can be useful, their narrow applicability can significantly limit their impact. On the other hand, accelerating a common computation at the core of many applications can have a broader impact and benefit not only existing but also future applications. This research targets the problem of general acceleration of data transformation. More specifically, to allow breadth of utility, the project aims to provide a software-hardware stack to accelerate the computational abstraction at the core of data transformation, namely, finite-state transducers. Given the societal importance of big data computing, a significant broader impact of this work is the uptake of research ideas and technology into the scientific base, and their resulting impact on a wide range of "big data" applications for science, industry, and society. In addition, this project allows students to experience first-hand how abstract concepts such as finite-state transducers can be applied to practical problems, connecting elements of theory of computation, algorithm design and optimization, applications and systems architecture.
The research investigates the transducer computational model and its efficient implementation, with the goal of providing performance and energy-efficiency gains in data analytics systems, all of which rely on data transformation. In particular, this work aims to reduce transducer theory to practical use by mapping transducer programs onto emerging data processing accelerators. To this end, this work targets the following issues. First, design a software stack to map transducers onto novel hardware accelerators. In particular, the investigators build on their previous work on the design and implementation of the Unstructured Data Processor, a novel hardware accelerator for data transformation that has been shown to give high performance but at present lacks a high-level programming model. Accomplishing this goal requires investigating a set of platform-independent and platform-specific optimizations that aim to minimize code size, minimize memory utilization, and leverage the coarse- and fine-grained parallelism inherent in the computation. Second, improve and extend the underlying hardware accelerator based on the insights acquired in the design of the software stack. Third, extend the transducer model to express the full range of data transformations in popular data analytics systems.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
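As a rough illustration of the abstraction the abstract names, the DNA-to-protein example can be written as a deterministic finite-state transducer: the state records the nucleotides read so far in the current codon, and every third symbol emits one amino-acid letter. This is a minimal software sketch under stated assumptions, not the project's software stack or the Unstructured Data Processor's programming model; the names `build_codon_transducer` and `transduce` are hypothetical, and the table is the standard genetic code enumerated in T, C, A, G order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def build_codon_transducer():
    """Build a deterministic finite-state transducer (hypothetical helper).

    States are the codon prefixes seen so far ("", one base, or two bases).
    delta maps (state, input symbol) -> (next state, output string).
    """
    prefixes = [""] + list(BASES) + ["".join(p) for p in product(BASES, repeat=2)]
    delta = {}
    for state in prefixes:
        for base in BASES:
            seen = state + base
            if len(seen) < 3:
                # Codon still incomplete: remember the prefix, emit nothing.
                delta[(state, base)] = (seen, "")
            else:
                # Codon complete: emit its amino acid and return to the start state.
                delta[(state, base)] = ("", CODON_TABLE[seen])
    return "", delta


def transduce(start, delta, text):
    """Run the transducer over text and concatenate the emitted outputs."""
    state, out = start, []
    for symbol in text:
        state, emitted = delta[(state, symbol)]
        out.append(emitted)
    return "".join(out)


start, delta = build_codon_transducer()
protein = transduce(start, delta, "ATGTTTAAATAA")
```

The transducer has 21 states and 84 transitions, and a trailing incomplete codon simply produces no output. Because each step is a single table lookup on (state, symbol), this is the kind of computation that a table-driven accelerator or a compiler targeting one could, in principle, map to hardware.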
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A GPU-accelerated Data Transformation Framework Rooted in Pushdown Transducers
- DOI: 10.1109/hipc56025.2022.00038
- Published: 2022-12
- Journal:
- Impact factor: 0
- Authors: Nguyen, Tri; Becchi, Michela
- Corresponding author: Becchi, Michela
Data Transformation Acceleration using Deterministic Finite-State Transducers
- DOI: 10.1109/bigdata55660.2022.10020756
- Published: 2022-12
- Journal:
- Impact factor: 0
- Authors: Nourian, Marziyeh; Nguyen, Tri; Chien, Andrew A.; Becchi, Michela
- Corresponding author: Becchi, Michela
Other publications by Michela Becchi
FuseIM: Fusing Probabilistic Traversals for Influence Maximization on Exascale Systems
- DOI: 10.1145/3650200.3656621
- Published: 2024-05-30
- Journal:
- Impact factor: 0
- Authors: Reece Neff; Mostafa Eghbali Zarch; Marco Minutoli; M. Halappanavar; Antonino Tumeo; Anantharaman Kalyanaraman; Michela Becchi
- Corresponding author: Michela Becchi
Other grants by Michela Becchi
CSR: Small: Middleware Technologies for Multi-Accelerator Clusters
- Award number: 1812727
- Fiscal year: 2018
- Funding amount: $258,000
- Project type: Standard Grant
NeTS: Small: A Language-Based Approach to Deep Packet Inspection: from Theory to Practice
- Award number: 1724934
- Fiscal year: 2017
- Funding amount: $258,000
- Project type: Standard Grant
SHF:Medium:Collaborative Research:A comprehensive methodology to pursue reproducible accuracy in ensemble scientific simulations on multi- and many-core platforms
- Award number: 1728850
- Fiscal year: 2017
- Funding amount: $258,000
- Project type: Standard Grant
SHF: Small: Collaborative Research: The Automata Programming Paradigm for Genomic Analysis
- Award number: 1740583
- Fiscal year: 2017
- Funding amount: $258,000
- Project type: Standard Grant
CAREER: Compiler and Runtime Support for Irregular Applications on Many-core Processors
- Award number: 1741683
- Fiscal year: 2017
- Funding amount: $258,000
- Project type: Continuing Grant
SHF:Medium:Collaborative Research:A comprehensive methodology to pursue reproducible accuracy in ensemble scientific simulations on multi- and many-core platforms
- Award number: 1513603
- Fiscal year: 2015
- Funding amount: $258,000
- Project type: Standard Grant
CAREER: Compiler and Runtime Support for Irregular Applications on Many-core Processors
- Award number: 1452454
- Fiscal year: 2015
- Funding amount: $258,000
- Project type: Continuing Grant
SHF: Small: Collaborative Research: The Automata Programming Paradigm for Genomic Analysis
- Award number: 1421765
- Fiscal year: 2014
- Funding amount: $258,000
- Project type: Standard Grant
NeTS: Small: A Language-Based Approach to Deep Packet Inspection: from Theory to Practice
- Award number: 1319748
- Fiscal year: 2013
- Funding amount: $258,000
- Project type: Standard Grant
CSR: Small: Scheduling and Virtualization Technologies for Heterogeneous Clusters with Many-core Devices
- Award number: 1216756
- Fiscal year: 2012
- Funding amount: $258,000
- Project type: Standard Grant
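Similar NSFC (National Natural Science Foundation of China) grants
Mechanism by which the small-molecule metabolite catechin interacts with TRPV1 to activate peripheral sensory neurons and mediate uremic pruritus
- Award number: 82371229
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Mechanism by which DHEA alleviates POCD by inhibiting Fis1 lactylation in microglia
- Award number: 82301369
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Immunological mechanism by which aberrantly activated microglia promote type 1 diabetic retinopathy by upregulating CTSS to suppress expression of the microvascular-specific factor MFSD2A
- Award number: 82370827
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
SETDB1 regulation of microglial function and its role in the pathogenesis of Alzheimer's disease
- Award number: 82371419
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Mechanism by which a PTBP1-driven H4K12la/BRD4/HIF1α complex-PKM2 positive feedback loop promotes glucose metabolic reprogramming in non-small cell lung cancer, and exploration of therapeutic strategies
- Award number: 82303616
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund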
Similar overseas grants
Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
- Award number: 2331301
- Fiscal year: 2024
- Funding amount: $258,000
- Project type: Standard Grant
Collaborative Research: SHF: Small: Efficient and Scalable Privacy-Preserving Neural Network Inference based on Ciphertext-Ciphertext Fully Homomorphic Encryption
- Award number: 2412357
- Fiscal year: 2024
- Funding amount: $258,000
- Project type: Standard Grant
Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
- Award number: 2331302
- Fiscal year: 2024
- Funding amount: $258,000
- Project type: Standard Grant
Collaborative Research: SHF: Small: Technical Debt Management in Dynamic and Distributed Systems
- Award number: 2232720
- Fiscal year: 2023
- Funding amount: $258,000
- Project type: Standard Grant
Collaborative Research: SHF: Small: Quasi Weightless Neural Networks for Energy-Efficient Machine Learning on the Edge
- Award number: 2326895
- Fiscal year: 2023
- Funding amount: $258,000
- Project type: Standard Grant