Thanks to the massively parallel processing power and programmability of general-purpose graphics processing units (GPGPUs), many supercomputing centers, as well as servers and high-end mobile devices, increasingly use GPUs for both graphics and general-purpose computation. However, communication costs between host CPUs and GPUs have been a performance bottleneck. The recent industry trend toward accelerated processing units (APUs), which integrate CPUs and GPUs on a single die, can significantly lower these communication costs and allow more seamless use of both processing components. As the number of applications for APUs increases, the reliability of the APU becomes paramount. In this paper, we describe an architectural vulnerability factor (AVF) modeling framework that we developed for APUs, and we present AVF results for several workloads. Our results include an AVF characterization of key hardware structures in the GPU component of the APU and the variation in AVF over time as the workload executes on both the CPU and GPU sides of the APU. We also examine the impact of APU sizing on the AVF of the workloads.
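As a rough illustration of the quantity being modeled, AVF is commonly defined as the fraction of a structure's bit-cycles that hold state required for architecturally correct execution (ACE). The sketch below assumes a per-cycle count of ACE bits is already available for one structure; the function names `avf` and `windowed_avf`, the input format, and the example numbers are hypothetical and are not taken from the framework described here.

```python
from typing import List


def avf(ace_bits_per_cycle: List[int], total_bits: int) -> float:
    """AVF of one structure: ACE bit-cycles divided by total bit-cycles.

    ace_bits_per_cycle[i] is the (assumed) number of ACE bits resident
    in the structure during cycle i; total_bits is the structure size.
    """
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    cycles = len(ace_bits_per_cycle)
    if cycles == 0:
        return 0.0
    return sum(ace_bits_per_cycle) / (total_bits * cycles)


def windowed_avf(ace_bits_per_cycle: List[int], total_bits: int,
                 window: int) -> List[float]:
    """AVF per consecutive window of cycles, to expose variation over time."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        avf(ace_bits_per_cycle[start:start + window], total_bits)
        for start in range(0, len(ace_bits_per_cycle), window)
    ]


if __name__ == "__main__":
    # Hypothetical 128-bit structure observed for four cycles.
    trace = [0, 64, 128, 64]
    print(avf(trace, total_bits=128))
    print(windowed_avf(trace, total_bits=128, window=2))
```

Computing the ratio per window rather than over the whole run is one simple way to observe how AVF shifts as execution moves between program phases, for example between CPU-side and GPU-side work.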