CAREER: Scalable and Adaptable Sparsity-driven Methods for more Efficient AI Systems
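Basic information
- Award number: 2238291
- Principal investigator:
- Amount: $550,300
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-03-01 to 2028-02-29
- Project status: Ongoing
- Source:
- Keywords:
Project abstract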
Artificial Intelligence (AI) and, in particular, Deep Neural Networks (DNNs) have achieved better-than-human accuracy on many cognitive tasks involving images, natural language processing, and protein structure, among others. Unfortunately, due to high data processing demands, AI systems are typically run on power-hungry specialized computing hardware. Quantization, or approximation to smaller numerical values, has been used to reduce computing requirements. However, fixed low bit-width DNNs may suffer losses in accuracy due to quantization errors. Many existing software solutions for quantization are also fixed or limited in bit-width choices. To address this trade-off and leverage data sparsity, the research team will investigate state-of-the-art methods and develop novel data quantization, encoding, and compression algorithms to integrate with existing AI systems. The methods developed have the potential not only to improve performance but also to reduce power requirements and boost the energy efficiency of AI systems. They will enable AI applications such as DNN inference on small devices, thus reducing the load on cloud infrastructure, improving user experience, providing data privacy, and avoiding security risks. The work proposed in this project has the potential to push the boundaries in many AI applications that run on energy storage-constrained devices, such as smart sensing, wearable devices, and autonomous driving. The research and educational tools will facilitate and increase student and research community participation in advancing AI research. The work will be conducted at a minority-serving institution, and the funding will support students from underrepresented groups.
The research goal of this project is to investigate quantization and compression methods that can leverage sparsity and improve efficiency in AI systems. The principal investigator (PI) plans to study adaptable quantization and compression methods that leverage sparsity in AI systems while minimizing both the overhead in non-sparse situations and the loss of accuracy. The trade-off between accuracy and performance under the proposed methods will be studied and defined so that accuracy, performance, or energy efficiency can be prioritized through automated tuning. The PI plans to develop a prototype that executes the proposed methods in parallel, making them effective for data centers and advanced hardware architectures. The proposed methods will be packaged into an AI vector primitives library that will be integrated with several popular Deep Learning frameworks as a proof of concept, primarily targeting GPU and CPU systems. An integration API will be developed for frameworks such as PyTorch or TensorFlow to allow easy integration with other vector primitives. The software libraries will be integrated with a web-based learning platform that offers automated feedback and a motivating environment to encourage student participation in solving AI challenges.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
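The trade-off the abstract describes, where lower bit-widths cut computing requirements but increase quantization error, can be illustrated with a minimal sketch. This is not the project's method: it assumes plain symmetric uniform quantization, a common textbook scheme, applied to synthetic Gaussian "weights". The function names (`quantize`, `dequantize`, `mean_abs_error`, `sparsity`) are hypothetical and chosen only for this illustration.

```python
import random

def quantize(values, bits):
    """Symmetric uniform quantization to signed `bits`-bit integer codes (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1                  # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]

def mean_abs_error(original, restored):
    """Average absolute difference between original and restored values."""
    return sum(abs(x - y) for x, y in zip(original, restored)) / len(original)

def sparsity(codes):
    """Fraction of codes that are exactly zero."""
    return sum(1 for c in codes if c == 0) / len(codes)

# Synthetic stand-in for DNN weights (illustrative data, not from the project).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

for bits in (8, 4, 2):
    codes, scale = quantize(weights, bits)
    err = mean_abs_error(weights, dequantize(codes, scale))
    print(f"{bits}-bit: mean abs error={err:.4f}, zero fraction={sparsity(codes):.2%}")
```

In this setup, shrinking the bit-width raises the reconstruction error and also maps more values to zero. That is one reason a tunable bit-width and sparsity-aware compression may be worth studying together, as the abstract proposes.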
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Gheorghi Guzun
Faster Multidimensional Data Queries on Infrastructure Monitoring Systems
- DOI: 10.1016/j.bdr.2021.100288
- Published: 2021-11-01
- Journal:
- Impact factor: 0
- Authors: Yinghua Qin; Gheorghi Guzun
- Corresponding author: Gheorghi Guzun
Scalable preference queries for high-dimensional data using map-reduce
- DOI: 10.1109/bigdata.2015.7364013
- Published: 2015-10-29
- Journal:
- Impact factor: 0
- Authors: Gheorghi Guzun; Joel E. Tosado; G. Canahuate
- Corresponding author: G. Canahuate
A tunable compression framework for bitmap indices
- DOI: 10.1109/icde.2014.6816675
- Published: 2014-05-19
- Journal:
- Impact factor: 0
- Authors: Gheorghi Guzun; G. Canahuate; David Chiu; Jason Sawin
- Corresponding author: Jason Sawin
Slicing the Dimensionality: Top-k Query Processing for High-Dimensional Spaces
- DOI: 10.1007/978-3-662-45714-6_2
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Gheorghi Guzun; Joel E. Tosado; G. Canahuate
- Corresponding author: G. Canahuate
Multidimensional Preference Query Optimization on Infrastructure Monitoring Systems
- DOI: 10.1109/bigdata47090.2019.9005666
- Published: 2019-12-01
- Journal:
- Impact factor: 0
- Authors: Yinghua Qin; Gheorghi Guzun
- Corresponding author: Gheorghi Guzun
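Similar National Natural Science Foundation of China (NSFC) grants
Research on large-scale, low-latency, highly reliable communication based on a scalable cell-free architecture
- Award number: 62371039
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Scalable and privacy-preserving data-driven distributed optimization methods and their application to demand response
- Award number: 72301008
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Research on scalable multi-agent cooperation strategies based on reinforcement learning in autonomous driving scenarios
- Award number: 62306062
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Research on scalable integration methods for single-cell multi-omics data based on unsupervised continual learning
- Award number: 62303488
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Research on the scalability of business-oriented hybrid state verification mechanisms in blockchain systems
- Award number: 62302202
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund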
Similar overseas grants
Adaptable and scalable electroporation for cellular therapy
- Award number: 10545845
- Fiscal year: 2022
- Funding amount: $550,300
- Project type:
FMRG: Adaptable and Scalable Robot Teleoperation for Human-in-the-Loop Assembly
- Award number: 2037101
- Fiscal year: 2021
- Funding amount: $550,300
- Project type: Standard Grant
ePACE: automation platforms for adaptable and scalable continuous evolution of biomolecules with therapeutic potential
- Award number: 10734591
- Fiscal year: 2019
- Funding amount: $550,300
- Project type:
CAREER: Scalable and Adaptable Cross-Domain Autonomous Health Assessment
- Award number: 1750936
- Fiscal year: 2018
- Funding amount: $550,300
- Project type: Continuing Grant
EAGER/Collaborative Research: Web-architectures for Extensible, Adaptable and Scalable Manufacturing
- Award number: 1322748
- Fiscal year: 2013
- Funding amount: $550,300
- Project type: Standard Grant