Software maintenance constitutes a large portion of the software development lifecycle. To carry out maintenance tasks, developers often need to understand and reproduce bug reports. Consequently, a growing body of research aims to automate activities related to bug reporting, and a sizable portion of it focuses on mobile apps. As this research progresses, however, there is a clear need for a manually vetted, reproducible set of real-world bug reports that can serve as a benchmark for future work. This paper presents AndroR2: a dataset of 90 manually reproduced bug reports for Android apps listed on Google Play and hosted on GitHub, systematically collected via an in-depth analysis of 459 reports extracted from the GitHub issue tracker. For each reproduced report, AndroR2 includes the original bug report, an APK file for the buggy version of the app, an executable reproduction script, and metadata regarding the quality of the reproduction steps associated with the original report. We believe that the AndroR2 dataset can facilitate research on automatically analyzing, understanding, reproducing, localizing, and fixing bugs in mobile applications, as well as on software maintenance activities more broadly.