Modern high-performance computing (HPC) systems are adding extra layers to the memory and storage hierarchy, forming what is called the deep memory and storage hierarchy (DMSH), to increase I/O performance. New hardware technologies, such as NVMe and SSDs, have been introduced in burst buffer installations to reduce the pressure on external storage and to better absorb the bursty I/O of modern systems. The DMSH has demonstrated its strength and potential in practice. However, each layer of the DMSH is an independent heterogeneous system, and data movement across more layers is significantly more complex, even without considering heterogeneity. How to utilize the DMSH efficiently is a research problem facing the HPC community. In this paper, we present the design and implementation of Hermes: a new, heterogeneous-aware, multi-tiered, dynamic, and distributed I/O buffering system. Hermes enables, manages, supervises, and, in some sense, extends I/O buffering to fully integrate it into the DMSH. We introduce three novel data placement policies to efficiently utilize all layers, and we present three novel techniques for memory, metadata, and communication management in hierarchical buffering systems. Our evaluation shows that, in addition to moving data automatically through the hierarchy, Hermes can significantly accelerate I/O and outperforms state-of-the-art buffering platforms by more than 2x.