ASCENT: Collaborative Research: Scaling Distributed AI Systems based on Universal Optical I/O
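Basic information
- Award number: 2023751
- Principal investigator:
- Award amount: $325,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-08-15 to 2023-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract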
Our society is rapidly becoming reliant on neural-network-based artificial intelligence computation. New algorithms are invented daily, increasing the memory and computational requirements for both inference and training. This explosive growth has created an enormous demand for distributed machine learning (ML) training and inference. Estimates by OpenAI show computational requirements growing steadily by 100x every two years since 2012, which is 50x faster than the rate of computational improvement that Moore's Law has delivered to the semiconductor industry over the last half-century. This new computation demand has been partly met by the rapid development of hardware accelerators and software stacks to support these specialized computations. Hardware accelerators have provided significant speed-up, but today's training tasks can still take days or even weeks. The reason is that as the number of workers (e.g., compute nodes) increases, the computation time per worker decreases but the communication requirements between the nodes increase, creating a bottleneck in the interconnect between the compute nodes. Future distributed ML systems will require 1-2 orders of magnitude higher interconnect bandwidth per node, creating a pressing need for entirely new ways to build interconnects for distributed ML systems. This proposal aims to create a new paradigm for scaling distributed ML computation by developing a scalable interconnect solution based on advancing the integrated electronics and photonics technology that enables direct node-to-node optical fiber connectivity. The proposed cross-stack, collaborative, multi-disciplinary work will enable the education and training of a unique crop of engineers and scientists who cross the boundaries of machine learning, networking, and electronic-photonic systems and devices, and who are in severe demand.
The principal investigators have an established track record of direct engagement with high-school students, providing summer internships at the Berkeley Wireless Research Center and MIT's Women's Technology Program, as well as exemplary undergraduate research activities at Boston University. The educational and outreach activities the PIs have put in place will ensure early exposure and continued training of a new generation of leaders in this field, from K-12, through undergraduate and graduate studies, to continuing workforce education, with a special focus on underrepresented students.

The interconnect has emerged as the key bottleneck in enabling the full potential of distributed ML. Future ML workloads are likely to require tens of Tbps of bandwidth per device. Ubiquitous deployment of logically connected, physically distributed computation across shelf, rack, and row scale can only be enabled by a new universal I/O that provides socket-to-socket communication at the energy, latency, and bandwidth density of in-package interconnects. No such technology currently exists. Silicon-photonics-based optical I/O has the potential to address this critical challenge, but fundamental advances, from chip manufacturing to routing algorithms, are still needed to ensure the scalability of these interconnect systems. To enable high bandwidth density and energy efficiency, dense wavelength division multiplexing must be used. High-efficiency ring-resonator-based modulators and comb laser sources are needed to enable Tbps rates over each fiber connection and socket bandwidth scaling from 10s to 100s of Tbps. New link architectures, like the proposed laser-forwarded coherent link, are needed to enable high-efficiency external centralized comb laser sources with modest (sub-mW) power per wavelength per fiber port.
The proposed work will also develop new scheduling algorithms, network architectures, and workload parallelism strategies that leverage the bandwidth density and low latency of the universal optical I/O to map large AI workloads with massive datasets onto a scalable distributed compute system.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
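The headline figures in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the award record. It assumes Moore's Law means 2x every two years, which is the reading consistent with the abstract's "50x faster" figure, and it assumes a hypothetical 1 Tbps per fiber to estimate how many fibers a socket would need. The function names are invented for this example.

```python
import math


def growth_per_year(factor: float, period_years: float) -> float:
    """Convert 'factor x every period_years years' to a per-year multiplier."""
    return factor ** (1.0 / period_years)


def doubling_time_months(factor: float, period_years: float) -> float:
    """Months needed to double, given 'factor x every period_years years'."""
    return 12.0 * period_years * math.log(2.0) / math.log(factor)


def fibers_per_socket(socket_tbps: float, per_fiber_tbps: float = 1.0) -> int:
    """Fibers needed to reach a socket bandwidth (per-fiber rate is an assumption)."""
    return math.ceil(socket_tbps / per_fiber_tbps)


# Quoted in the abstract: compute demand grows 100x every two years.
DEMAND_FACTOR, DEMAND_PERIOD = 100.0, 2.0
# Assumption, not stated in the abstract: Moore's Law taken as 2x every two years.
MOORE_FACTOR, MOORE_PERIOD = 2.0, 2.0

# Ratio of the two growth factors over the same two-year window.
ratio_over_two_years = DEMAND_FACTOR / MOORE_FACTOR

if __name__ == "__main__":
    print(f"Demand vs. Moore's Law over two years: {ratio_over_two_years:.0f}x")
    print(f"Demand growth per year: {growth_per_year(DEMAND_FACTOR, DEMAND_PERIOD):.1f}x")
    print(f"Demand doubling time: {doubling_time_months(DEMAND_FACTOR, DEMAND_PERIOD):.1f} months")
    for socket_tbps in (10.0, 100.0):
        print(f"{socket_tbps:.0f} Tbps socket: {fibers_per_socket(socket_tbps)} fibers at 1 Tbps each")
```

Under these assumptions, 100x every two years corresponds to about 10x per year, and a socket in the 10 to 100 Tbps range would need on the order of 10 to 100 fiber connections.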
Project outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Photonic crystal modulator in a CMOS foundry platform
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Al Qubaisi, Kenaish; Onural, Deniz; Gevorgyan, Hayk; Popović, Miloš A.
- Corresponding author: Popović, Miloš A.
Reflectionless standing-wave operation in microring resonators
- DOI: 10.1364/ofc.2022.m3e.3
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Al Qubaisi, Kenaish; Gluhović, Đorđe; Onural, Deniz; Popović, Miloš A.
- Corresponding author: Popović, Miloš A.
Microring Modulators in a New Silicon Photonics-Optimized 45 nm Monolithic Electronics-Photonics SOI CMOS Platform
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Al Qubaisi, Kenaish; Khilo, Anatol; Gevorgyan, Hayk; Popović, Miloš A.
- Corresponding author: Popović, Miloš A.
Photonic resonators with microring-like behavior based on standing wave cavity pairs with opposite-symmetry modes
- DOI: 10.1364/fio.2020.ftu8e.2
- Published: 2020-09
- Journal:
- Impact factor: 0
- Authors: K. A. Qubaisi; M. Popović
- Corresponding author: K. A. Qubaisi; M. Popović
Miniature, highly sensitive MOSCAP ring modulators in co-optimized electronic-photonic CMOS
- DOI: 10.1364/prj.438047
- Published: 2022-01-01
- Journal:
- Impact factor: 7.6
- Authors: Gevorgyan, Hayk; Khilo, Anatol; Popović, Miloš A.
- Corresponding author: Popović, Miloš A.
Other publications by Milos Popovic
The landscape of high-affinity human antibodies against intratumoral antigens
- DOI: 10.1101/2021.02.06.430058
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: G. Rakocevic; I. Glotova; I. de Santiago; B. Ç. Toptas; Milena Popovic; Milos Popovic; D. Leone; A. Stachyra; R. Rozenfeld; Deniz Kural; D. Biasci
- Corresponding author: D. Biasci
Keeping Friends Close, and Their Oil Closer: Rethinking the Role of the Shanghai Cooperation Organization in China's Strive for Energy Security in Kazakhstan
- DOI:
- Published: 2010
- Journal:
- Impact factor: 0
- Authors: Milos Popovic
- Corresponding author: Milos Popovic
Fragile Proxies: Explaining Rebel Defection Against Their State Sponsors
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Milos Popovic
- Corresponding author: Milos Popovic
Poster 42 Impact of increasing intensity of occupational therapy on functional outcomes in sub-acute SCI
- DOI: 10.1016/j.apmr.2013.08.247
- Published: 2013-10-01
- Journal:
- Impact factor:
- Authors: Milos Popovic
- Corresponding author: Milos Popovic
Managing Internationalized Civil Wars
- DOI: 10.1093/acrefore/9780190228637.013.573
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Erin K. Jenne; Milos Popovic
- Corresponding author: Milos Popovic
Other grants by Milos Popovic
Collaborative Research: FuSe: Collaborative Optically Disaggregated Arrays of Extreme-MIMO Radio Units (CODAeMIMO)
- Award number: 2328946
- Fiscal year: 2023
- Award amount: $325,000
- Project type: Continuing Grant
RAISE-EQuIP: Single-Chip, Wall-Plug Photon Pair Source and CMOS Quantum Systems on Chip
- Award number: 1842692
- Fiscal year: 2018
- Award amount: $325,000
- Project type: Standard Grant
OP: Collaborative Research: Coherent Integrated Si-Photonic Links
- Award number: 1611086
- Fiscal year: 2016
- Award amount: $325,000
- Project type: Standard Grant
OP: Collaborative Research: Coherent Integrated Si-Photonic Links
- Award number: 1701596
- Fiscal year: 2016
- Award amount: $325,000
- Project type: Standard Grant
Molding Optical Field Patterns for Highly Efficient Design of Strong-Confinement Photonic Devices
- Award number: 1128709
- Fiscal year: 2011
- Award amount: $325,000
- Project type: Standard Grant
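Similar NSFC (National Natural Science Foundation of China) grants
The effect of temporary-team collaboration history on collaborative proactive behavior: a social network perspective
- Award number: 72302101
- Approval year: 2023
- Award amount: 300,000 CNY
- Project type: Young Scientists Fund
Collaboration models and performance-improvement strategies for online medical teams
- Award number: 72371111
- Approval year: 2023
- Award amount: 410,000 CNY
- Project type: General Program
Relationships among team human-capital hierarchy types, team collaboration processes, and team effectiveness outcomes in the digital-intelligence context
- Award number: 72372084
- Approval year: 2023
- Award amount: 400,000 CNY
- Project type: General Program
Mechanism by which A-type crystalline resistant starch regulates cooperative butyrate production by gut bacteria
- Award number: 32302064
- Approval year: 2023
- Award amount: 300,000 CNY
- Project type: Young Scientists Fund
Interaction control methods for collaborative robots in human-robot contact-based cooperative tasks
- Award number: 62373044
- Approval year: 2023
- Award amount: 500,000 CNY
- Project type: General Program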
Similar overseas grants
Collaborative Research: How faithfully are melt embayments wedded to magma ascent?
- Award number: 2221896
- Fiscal year: 2022
- Award amount: $325,000
- Project type: Standard Grant
Collaborative Research: SWIFT: Context-aware Spectrum Coexistence dEsign aNd implemenTation in satellite bands (ASCENT)
- Award number: 2245910
- Fiscal year: 2022
- Award amount: $325,000
- Project type: Standard Grant
Achieving Equity through SocioCulturally-informed, Digitally-Enabled Cancer Pain managemeNT (ASCENT) Clinical Trial
- Award number: 10539159
- Fiscal year: 2022
- Award amount: $325,000
- Project type:
Collaborative Research: SWIFT: Context-aware Spectrum Coexistence dEsign aNd implemenTation in satellite bands (ASCENT)
- Award number: 2128540
- Fiscal year: 2021
- Award amount: $325,000
- Project type: Standard Grant
Collaborative Research: SWIFT: Context-aware Spectrum Coexistence dEsign aNd implemenTation in satellite bands (ASCENT)
- Award number: 2128584
- Fiscal year: 2021
- Award amount: $325,000
- Project type: Standard Grant