Provenance is an increasing concern due to the ongoing revolution in sharing and processing scientific data on the Web and in other computer systems. It has been proposed that many computer systems will need to become provenance-aware in order to provide satisfactory accountability, reproducibility, and trust for scientific or other high-value data. To date, there is no consensus concerning appropriate formal models or security properties for provenance. In previous work, we introduced a formal framework for provenance security and proposed formal definitions of properties called disclosure and obfuscation. In this article, we study refined notions of positive and negative disclosure and obfuscation in a concrete setting, that of a general-purpose programming language. Previous models of provenance have focused on special-purpose languages such as workflows and database queries. We consider a higher-order, functional language with sums, products, and recursive types and functions, and equip it with a tracing semantics in which traces themselves can be replayed as computations. We present an annotation-propagation framework that supports many provenance views over traces, including standard forms of provenance studied previously. We investigate some relationships among provenance views and develop some partial solutions to the disclosure and obfuscation problems, including correct algorithms for disclosure and positive obfuscation based on trace slicing.
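To make the idea of a trace that can be replayed as a computation more concrete, the following is a minimal Python sketch. It is not the article's calculus: the toy first-order language (`Num`, `Var`, `Add`, `If`), the tuple encoding of traces, and the names `evaluate`, `replay`, and `ReplayMismatch` are all assumptions introduced for illustration. The sketch records which branch each conditional took and, on replay under a changed input, recomputes the result along the recorded path, failing if the control flow would differ.

```python
from dataclasses import dataclass


# Hypothetical toy language; the article's language is higher-order and
# also has sums, products, and recursive types and functions.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class If:
    cond: object    # nonzero means "true" in this sketch
    then: object
    orelse: object


class ReplayMismatch(Exception):
    """Raised when a replayed conditional would take a different branch."""


def evaluate(env, expr):
    """Evaluate expr in env, returning (value, trace).

    Traces are nested tuples (an encoding assumed for this sketch) that
    record the steps actually taken, including the chosen branch.
    """
    if isinstance(expr, Num):
        return expr.value, ("num", expr.value)
    if isinstance(expr, Var):
        return env[expr.name], ("var", expr.name)
    if isinstance(expr, Add):
        v1, t1 = evaluate(env, expr.left)
        v2, t2 = evaluate(env, expr.right)
        return v1 + v2, ("add", t1, t2)
    if isinstance(expr, If):
        c, tc = evaluate(env, expr.cond)
        taken = c != 0
        v, tb = evaluate(env, expr.then if taken else expr.orelse)
        return v, ("if", taken, tc, tb)
    raise TypeError(f"unknown expression: {expr!r}")


def replay(env, trace):
    """Re-run a recorded trace against a (possibly changed) environment."""
    tag = trace[0]
    if tag == "num":
        return trace[1]
    if tag == "var":
        return env[trace[1]]
    if tag == "add":
        return replay(env, trace[1]) + replay(env, trace[2])
    if tag == "if":
        _, taken, tc, tb = trace
        if (replay(env, tc) != 0) != taken:
            raise ReplayMismatch("conditional took a different branch")
        return replay(env, tb)
    raise ValueError(f"unknown trace node: {trace!r}")


# Example program: if x then y + 1 else 0
program = If(Var("x"), Add(Var("y"), Num(1)), Num(0))
value, trace = evaluate({"x": 1, "y": 2}, program)
replayed = replay({"x": 5, "y": 10}, trace)
```

In this sketch the trace omits the untaken branch entirely, so replay only succeeds on inputs that follow the same control-flow path. That is one simple design choice; the article's tracing semantics and its slicing-based algorithms are defined formally and may treat such cases differently.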