Artificial intelligence (AI) hardware is positioned to unlock revolutionary computational abilities across diverse fields ranging from fundamental science [1] to medicine [2] and environmental science [3] by leveraging advanced semiconductor chips interconnected in vast distributed networks. However, AI chip development has far outpaced that of the networks that connect them: over the last two decades, chip computation speeds have improved a thousandfold faster than communication bandwidth [4, 5]. This gap is the largest barrier to scaling AI performance [6, 7] and results from the disproportionately high energy expended to transmit data [8], which is two orders of magnitude more energy-intensive than computation [9]. Here, we show a leveling of this long-standing discrepancy and achieve the lowest-energy optical data link to date through dense 3D integration of photonic and electronic chips. At 120 fJ of consumed energy per communicated bit and 5.3 Tb/s of bandwidth per square millimeter of chip area, our platform simultaneously achieves a twofold improvement in both energy consumption and bandwidth density relative to prior demonstrations [10, 11]. These improvements are realized by employing massively parallel 80-channel microresonator-based transmitter and receiver arrays operating at 10 Gb/s per channel, occupying a combined chip footprint of only 0.32 mm². Furthermore, commercial complementary metal-oxide-semiconductor (CMOS) foundries fabricate both the electronic and photonic chips on 300 mm wafers, providing a clear avenue to volume scaling. Through these demonstrated ultra-energy-efficient, high-bandwidth data communication links, this work eliminates the bandwidth bottleneck between spatially distanced compute nodes and will enable a fundamentally new scale of future AI computing hardware without constraints on data locality.
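As a rough illustration of the figures quoted above, the following minimal sketch computes the aggregate data rate implied by 80 channels at 10 Gb/s and the link power that 120 fJ/bit would imply at that rate. It assumes that all channels run concurrently at full rate and that the quoted energy per bit applies uniformly to every channel; both are simplifying assumptions, and the function names are hypothetical, not taken from the work itself.

```python
# Back-of-the-envelope arithmetic using only numbers stated in the abstract.
# Assumption (not from the source): all channels run concurrently at full
# rate and the quoted energy per bit applies uniformly to every channel.

CHANNELS = 80                 # channels per transmitter/receiver array
RATE_PER_CHANNEL_GBPS = 10    # Gb/s per channel
ENERGY_PER_BIT_FJ = 120       # fJ consumed per communicated bit


def aggregate_bandwidth_gbps(channels: int, rate_gbps: float) -> float:
    """Total data rate in Gb/s when every channel runs in parallel."""
    return channels * rate_gbps


def link_power_mw(energy_fj_per_bit: float, bandwidth_gbps: float) -> float:
    """Implied link power in mW.

    fJ/bit times Gb/s gives 1e-15 J * 1e9 1/s = 1e-6 W (microwatts),
    so divide by 1000 to convert to milliwatts.
    """
    return energy_fj_per_bit * bandwidth_gbps / 1000.0


if __name__ == "__main__":
    bandwidth = aggregate_bandwidth_gbps(CHANNELS, RATE_PER_CHANNEL_GBPS)
    power = link_power_mw(ENERGY_PER_BIT_FJ, bandwidth)
    print(f"Aggregate bandwidth: {bandwidth} Gb/s")
    print(f"Implied link power:  {power} mW")
```

Under these assumptions the link carries 800 Gb/s in aggregate at an implied power on the order of 0.1 W. The sketch does not attempt to reproduce the quoted bandwidth density, since the abstract does not state which area that figure is normalized to.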