Cloud platforms now play an essential role in large-scale big data analytics, and in particular in running scientific workflows. In contrast to traditional on-premise computing environments, where the number of resources is bounded, cloud computing can provide practically unlimited resources to a workflow application under a pay-as-you-go pricing model. One challenge of using cloud computing is protecting the privacy of confidential workflow tasks, whose proprietary algorithm implementations are the intellectual property of their respective stakeholders. Another is minimizing the monetary cost of executing workflows in the cloud while satisfying a user-defined deadline. In this paper, we use the Intel Software Guard eXtensions (SGX) as a Trusted Execution Environment (TEE) to support the confidentiality of individual workflow tasks. Building on this, we propose a deadline-constrained and SGX-aware workflow scheduling algorithm, called SEED (SGX, Efficient, Effective, Deadline Constrained), to address these two challenges. SEED features several heuristics, including exploiting the longest critical paths and reusing spare time in existing virtual machine instances. Our experiments show that SEED outperforms the representative algorithm IC-PCP in monetary cost in most cases while satisfying the given user-defined deadline. To the best of our knowledge, this is the first workflow scheduling algorithm that considers protecting the confidentiality of workflow tasks in a public cloud computing environment.