SHF: Small: Sparsity-Aware Hardware Accelerators for Natural Language Processing with Transformers
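Basic information
- Award number: 2007362
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Duration: 2020-10-01 to 2024-09-30
- Project status: Completed
- Source:
- Keywords:
Project abstract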
Natural Language Processing (NLP) enables people to interact with machines in the same manner as with each other. More importantly, it provides machines with the ability to access the information and knowledge that are readily available in books, articles, and various unstructured documents. Because the quality and usability of NLP-powered services depend primarily on the quantity of text the system is able to process, the computational demands of advanced NLP applications far exceed the capabilities of general-purpose computers and continue to grow. This project aims to greatly improve the performance of NLP applications based on transformers, a class of neural networks used in most state-of-the-art NLP technology. This project will significantly improve performance and efficiency for NLP applications, enabling their widespread deployment in emerging datacenters and thus enhancing the quality of human interactions with machines and each other.
This project advances the state of the art of accelerators (hardware and compilers) for natural language processing, focusing primarily on sparsity-aware inference in large multi-layered self-attention based models, which have so far received limited attention from the architecture community. The project also advances NLP knowledge of sparse attention functions, studies design techniques that allow for repurposing pre-trained models to run faster, and improves the effectiveness of models in applications that diverge from their training setting. The investigation focuses on the key observation that the massive growth in computational complexity can be mitigated by dynamically identifying inherent sparsity and ineffectual computation in models, and by refitting the model to induce sparsity, with the goal of either approximating or entirely avoiding parts of the computation that have limited impact on the model results. This investigation will demonstrate the performance improvement obtained by these techniques, leveraging sparsity and dynamic predictions within a novel sparsity-aware hardware acceleration framework, implemented on a field-programmable gate array (FPGA).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
IrEne: Interpretable Energy Prediction for Transformers
- DOI: 10.18653/v1/2021.acl-long.167
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Qingqing Cao; Yash Kumar Lal; H. Trivedi; A. Balasubramanian; Niranjan Balasubramanian
- Corresponding author: Qingqing Cao; Yash Kumar Lal; H. Trivedi; A. Balasubramanian; Niranjan Balasubramanian
On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
- DOI: 10.18653/v1/2021.findings-acl.363
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Tianchu Ji; Shraddhan Jain; M. Ferdman; Peter Milder; H. A. Schwartz; Niranjan Balasubramanian
- Corresponding author: Tianchu Ji; Shraddhan Jain; M. Ferdman; Peter Milder; H. A. Schwartz; Niranjan Balasubramanian
Other publications by Peter Milder
"Smart" design space sampling to predict Pareto-optimal solutions
- DOI: 10.1145/2248418.2248436
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: M. Zuluaga; Andreas Krause; Peter Milder; Markus Püschel
- Corresponding author: Markus Püschel
Domain-specific library generation for parallel software and hardware platforms
- DOI:
- Published: 2008
- Journal:
- Impact factor: 0
- Authors: F. Franchetti; Y. Voronenko; Peter Milder; S. Chellappa; Marek R. Telgarsky; Hao Shen; P. D'Alberto; Frédéric de Mesmay; J. Hoe; José M. F. Moura; Markus Püschel
- Corresponding author: Markus Püschel
Wireless Multicast Rate Control Adaptive to Application Goodput and Loss Requirements
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Mohammed Elbadry; Fan Ye; Peter Milder
- Corresponding author: Peter Milder
Generation and transmission of 85.4 Gb/s real-time 16QAM coherent optical OFDM signals over 400 km SSMF with preamble-less reception.
- DOI:
- Published: 2012
- Journal:
- Impact factor: 3.8
- Authors: R. Bouziane; R. Schmogrow; D. Hillerkuss; Peter Milder; C. Koos; W. Freude; J. Leuthold; P. Bayvel; R. Killey
- Corresponding author: R. Killey
Quantifying Energy and Latency Improvements of FPGA-Based Sensors for Low-Cost Spectrum Monitoring
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: A. Bhattacharya; Han Chen; Peter Milder; Samir R Das
- Corresponding author: Samir R Das
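Similar NSFC grants
Constructing a sparse Bayesian linear regression model from eye-brain cross-modal imaging to predict the severity of cerebral small vessel disease
- Award number:
- Approval year: 2022
- Funding amount: CNY 520,000
- Project type: General Program
Constructing a sparse Bayesian linear regression model from eye-brain cross-modal imaging to predict the severity of cerebral small vessel disease
- Award number: 82272072
- Approval year: 2022
- Funding amount: CNY 520,000
- Project type: General Program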
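Sparse representation of GPR data based on the Q*-Ricker wavelet basis
- Award number:
- Approval year: 2020
- Funding amount: CNY 240,000
- Project type: Young Scientists Fund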
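Detection of low-altitude non-cooperative small targets in haze scenes by combining adversarial learning and sparse mutual learning
- Award number:
- Approval year: 2020
- Funding amount: CNY 240,000
- Project type: Young Scientists Fund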
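Three-dimensional forward modeling and inversion of time-domain marine electromagnetics based on the Rational Krylov method and wavelet-domain sparsity constraints
- Award number: 41804098
- Approval year: 2018
- Funding amount: CNY 250,000
- Project type: Young Scientists Fund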
Similar overseas grants
SHF: Small: Domain-Specific FPGAs to Accelerate Unrolled DNNs with Fine-Grained Unstructured Sparsity and Mixed Precision
- Award number: 2303626
- Fiscal year: 2023
- Funding amount: $500,000
- Project type: Standard Grant
Addressing Sparsity in Metabolomics Data Analysis
- Award number: 10396831
- Fiscal year: 2021
- Funding amount: $500,000
- Project type:
AF: Small: Sparsity in Local Computation
- Award number: 2006664
- Fiscal year: 2020
- Funding amount: $500,000
- Project type: Standard Grant
CIF: Small: A Systematic Approach to Adversarial Machine Learning: Sparsity-based Defenses and Locally Linear Attacks
- Award number: 1909320
- Fiscal year: 2019
- Funding amount: $500,000
- Project type: Standard Grant
Addressing Sparsity in Metabolomics Data Analysis
- Award number: 10007593
- Fiscal year: 2018
- Funding amount: $500,000
- Project type: