Although advanced methods have significantly improved the fluency of automatically generated abstractive summaries, inconsistency with the source document remains an issue to be addressed. In this study, we propose a methodology for localizing inconsistency errors in summarization. By applying sentence fusion, compression, and paraphrasing operations, we create a synthetic dataset containing a variety of factual errors that a common summarizer is likely to produce. In creating the dataset, we automatically label erroneous phrases and the dependency relations between them as “inconsistent,” which allows errors to be detected more adequately than with existing models that rely only on dependency arc-level labels. We then use this synthetic dataset as weak supervision to train a model called SumPhrase, which jointly localizes errors in a summary and their corresponding sentences in the source document. Empirical results show that, owing to the phrase-level labeling, SumPhrase detects factual errors in summarization more effectively than existing weakly supervised methods. Moreover, jointly identifying the source sentences that correspond to each error is shown to improve error detection accuracy.