Recently, composite-order bilinear pairing has been shown to be useful in many cryptographic constructions. However, it is time-costly to evaluate. This is because the composite order should be at least 1024bit and, hence, the elliptic curve group order n and base field become too large, rendering the bilinear pairing algorithm itself too slow to be practical (e.g., the Miller loop is Ω(n)). Thus, composite-order computation easily becomes the bottleneck of a cryptographic construction, especially, in the case where many pairings need to be evaluated at the same time. The existing solution to this problem that converts composite-order pairings to prime-order ones is only valid for certain constructions. In this paper, we leverage the huge number of threads available on Graphics Processing Units (GPUs) to speed up composite-order pairing computation. We investigate suitable SIMD algorithms for base field, extension field, elliptic curve and bilinear pairing computation as well as mapping these algorithms into GPUs with careful considerations. Experimental results show that our method achieves a record of 8.7ms per pairing on a 1024bit security level, which is a 20-fold speedup compared to state-of-the-art CPU implementation. This result also opens the road to adopting higher security levels and using rich-resource parallel platforms, which for example are available in cloud computing. In fact, we can achieve more than 24 times speedup on a 2048bit security level and a record of 7× 10−6 USD per pairing on the Amazon cloud computing environment.
最近,复合阶双线性对已被证明在许多密码学构造中是有用的。然而,它的计算是耗时的。这是因为复合阶至少应为1024位,因此椭圆曲线群阶n和基域变得太大,使得双线性对算法本身太慢而不实用(例如,米勒循环是Ω(n))。因此,复合阶计算很容易成为密码学构造的瓶颈,特别是在需要同时计算许多对的情况下。将复合阶对转换为素数阶对的现有解决方案仅对某些构造有效。在本文中,我们利用图形处理单元(GPU)上大量可用的线程来加速复合阶双线性对计算。我们研究了适用于基域、扩域、椭圆曲线和双线性对计算的单指令多数据(SIMD)算法,并在仔细考虑的情况下将这些算法映射到GPU上。实验结果表明,我们的方法在1024位安全级别上实现了每次对计算8.7毫秒的记录,与最先进的CPU实现相比,速度提高了20倍。这一结果也为采用更高的安全级别和使用资源丰富的并行平台(例如云计算中可用的平台)开辟了道路。事实上,我们在2048位安全级别上可以实现超过24倍的加速,并且在亚马逊云计算环境中每次对计算的成本达到7×10⁻⁶美元的记录。