With Moore’s law supplying billions of transistors on-chip, embedded systems are transitioning from single-core to multi-core architectures to turn this transistor density into higher performance. However, laying out these cores together with the memory subsystem (caches and main memory) so that the design satisfies power, area, and stringent real-time constraints is a challenging design problem. The short time-to-market of embedded systems exacerbates this challenge and calls for architectural modeling that expedites the mapping of target applications to devices and architectures. In this paper, we present a queueing theoretic approach for modeling multi-core embedded systems. The approach provides a performance evaluation that is quick and inexpensive, in both time and resources, compared with developing multi-core simulators and running benchmarks on them. We verify the approach by running SPLASH-2 benchmarks on the SuperESCalar simulator (SESC). The results reveal that our queueing theoretic model qualitatively evaluates multi-core architectures accurately, with an average difference of 5.6% from the evaluations obtained with SESC. Our modeling approach can be used to characterize the performance per watt and performance per unit area of multi-core embedded architectures with varying numbers of processor cores and cache configurations, providing a comparative analysis.