Developing a bug-free compiler is difficult; modern optimizing compilers are among the most complex software systems humans build. Fuzzing is one way to identify subtle compiler bugs that are hard to find with human-constructed tests. Grammar-based fuzzing, however, requires a grammar for the compiler’s input language, and can miss bugs induced by code that does not actually satisfy the grammar the compiler should accept. Grammar-based fuzzing also seldom uses advanced modern fuzzing techniques based on coverage feedback. Modern mutation-based fuzzers, on the other hand, are often ineffective for testing compilers because most of the inputs they generate do not even come close to getting past the parsing stage of compilation. This paper introduces a technique for taking a modern mutation-based fuzzer (AFL in our case, but the method is general) and augmenting it with operators taken from mutation testing and with program splicing. We conduct a controlled study to show that our hybrid approaches significantly improve fuzzing effectiveness both qualitatively (consistently finding unique bugs that baseline approaches do not) and quantitatively (typically finding more unique bugs in the same time span, despite fewer program executions). Our easy-to-apply approach has allowed us to report more than 100 confirmed and fixed bugs in production compilers, and it found a bug in the Solidity compiler that earned a security bounty.
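To make the idea of combining mutation-testing operators with program splicing more concrete, the following is a minimal Python sketch of what such source-level mutators might look like. It is not the implementation described in this paper and does not use AFL's actual interfaces; the operator table, the function names (`swap_operator`, `delete_statement`, `splice`, `mutate`), and the line-based notion of a "statement" are all assumptions chosen for illustration.

```python
import random
import re

# Hypothetical operator-replacement table in the spirit of mutation testing.
# The specific pairs are an assumption, not the set used by any real tool.
OPERATOR_SWAPS = {
    "==": "!=", "!=": "==",
    "&&": "||", "||": "&&",
    "<": ">", ">": "<",
    "+": "-", "-": "+",
}

# Try longer tokens first so "==" is matched as a unit.
_TOKEN_RE = re.compile(
    "|".join(re.escape(t) for t in sorted(OPERATOR_SWAPS, key=len, reverse=True))
)


def swap_operator(source: str, rng: random.Random) -> str:
    """Replace one randomly chosen operator occurrence with its counterpart."""
    matches = list(_TOKEN_RE.finditer(source))
    if not matches:
        return source
    m = rng.choice(matches)
    return source[:m.start()] + OPERATOR_SWAPS[m.group()] + source[m.end():]


def delete_statement(source: str, rng: random.Random) -> str:
    """Delete one line that looks like a statement (assumed to end in ';')."""
    lines = source.splitlines(keepends=True)
    candidates = [i for i, line in enumerate(lines) if line.rstrip().endswith(";")]
    if not candidates:
        return source
    del lines[rng.choice(candidates)]
    return "".join(lines)


def splice(source: str, donor: str, rng: random.Random) -> str:
    """Insert a random contiguous run of lines from `donor` into `source`."""
    src_lines = source.splitlines(keepends=True)
    donor_lines = donor.splitlines(keepends=True)
    if not donor_lines:
        return source
    start = rng.randrange(len(donor_lines))
    end = rng.randrange(start, len(donor_lines)) + 1
    pos = rng.randrange(len(src_lines) + 1)
    return "".join(src_lines[:pos] + donor_lines[start:end] + src_lines[pos:])


def mutate(source: str, corpus: list[str], rng: random.Random) -> str:
    """Apply one randomly selected mutator; splicing needs a non-empty corpus."""
    mutators = [swap_operator, delete_statement]
    if corpus:
        mutators.append(lambda s, r: splice(s, r.choice(corpus), r))
    return rng.choice(mutators)(source, rng)
```

Unlike byte-level bit flips, mutators of this kind tend to keep most of the input syntactically plausible, which is the intuition behind using them to get more inputs past the parser. In a real setup, a coverage-guided fuzzer would decide which mutated programs to keep; here `mutate` only produces a candidate.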