ProvTemp: Provenance templates as a method for facilitating provenance capture and simulating provenance data
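Basic information
- Grant number: EP/N027426/1
- Principal investigator:
- Amount: $128,600
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2016
- Funding country: United Kingdom
- Duration: 2016 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract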
Our world is increasingly driven by data. Medical, economic and political decisions are made based on the results of automatically analysing ever-growing volumes of data. Whether these are patient treatment decisions or stock trading recommendations, if we are to trust the decisions being made, we need to have insight into the workings of these systems and achieve understanding of their outputs - referred to as their provenance. Related to the issue of trust is the concept of reproducibility in scientific discovery, as the ultimate test of findings' validity. Science is now all but impossible without data-intensive infrastructures, but these changes make research harder to verify and follow using traditional "pen-and-paper" methods, and new techniques are required to ensure correctness. A number of recent studies looked into published research in certain areas, only to find that just a minority could be reproduced using the information provided. Understanding the provenance of the data and processes that we are relying on has never been more critical.
Data provenance is a research field dedicated to the standardised, uniform representation of the network of data products, the tasks that create and use those data, and the human and software actors who perform these tasks - typically represented as provenance graphs. Popular in "computational" disciplines that have long relied on scientific software, provenance is now becoming relevant and necessary to areas which have only recently become data-driven and which operate using multiple disjointed software tools. In order to facilitate the adoption of provenance in these disciplines, the ProvTemp project is modelling provenance templates - the provenance graph fragments that multiple software tools can compose into a unified, meaningful trace of conducted research. A set of templates is defined by the scientists, describing the research details that need to be captured, and these are then translated into concrete provenance data.
This theoretical work has two immediate applications. The first is a method for introducing provenance into scientific environments by integrating with existing software tools, minimising the effort needed for the developers of those tools to start capturing provenance. The second is a mechanism for using the templates to simulate realistic provenance data that would be produced from those templates, allowing them to be tested to ensure they are sufficiently informative for the intended purpose, e.g. publishing details of a research task or providing a legally required audit trail.
The ProvTemp approach shall be evaluated on the example of modelling a clinical trial. The medical research community is a typical example of a non-computational discipline becoming increasingly data-driven, and it is currently moving towards big-data-enabled, intelligent infrastructures through the use of data routinely captured in Electronic Health Record (EHR) systems. The trend in medical research is towards Learning Health Systems, which seek to maximise and optimise the use and benefit of EHR data in clinical research and practice. The EU TRANSFoRm project implemented a prototype software infrastructure for the Learning Health System and conducted an international clinical trial driven by EHR data. The ProvTemp approach will replicate the trial execution using provenance templates, and examine the produced provenance data to ensure our method is valid and applicable to future clinical trials.
In addition to the clinical trial work, we shall work closely with the UK's Software Sustainability Institute (SSI), which promotes sustainable software technologies. The SSI shall assist in ensuring that ProvTemp is generalisable and relevant to other scientific disciplines. We shall also engage the public in defining the wider questions around reproducibility and quality of research. Finally, ProvTemp will produce a roadmap for further research, taking stock of the work done and identifying future opportunities.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
LabelFlow Framework for Annotating Workflow Provenance
- DOI: 10.3390/informatics5010011
- Published: 2018-02
- Journal:
- Impact factor: 0
- Authors: Pinar Alper; Khalid Belhajjame; V. Curcin; C. Goble
- Corresponding authors: Pinar Alper; Khalid Belhajjame; V. Curcin; C. Goble
Requirements and validation of a prototype learning health system for clinical diagnosis.
- DOI: 10.1002/lrh2.10026
- Published: 2017
- Journal:
- Impact factor: 3.1
- Authors: Corrigan D
- Corresponding author: Corrigan D
Provenance and Annotation of Data and Processes - 7th International Provenance and Annotation Workshop, IPAW 2018, London, UK, July 9-10, 2018, Proceedings
- DOI: 10.1007/978-3-319-98379-0_18
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Corsar D
- Corresponding author: Corsar D
Provenance and Annotation of Data and Processes - 8th and 9th International Provenance and Annotation Workshop, IPAW 2020 + IPAW 2021, Virtual Event, July 19-22, 2021, Proceedings
- DOI: 10.1007/978-3-030-80960-7_22
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Chapman M
- Corresponding author: Chapman M
Using Microservices to Design Patient-facing Research Software
- DOI: 10.1109/escience55777.2022.00019
- Published: 2022-01-01
- Journal:
- Impact factor: 0
- Authors: Chapman, Martin; G-Medhin, Abigail; Curcin, Vasa
- Corresponding author: Curcin, Vasa
Other publications by Vasa Curcin
Machine learning to optimise statin therapy using real-world primary care outcomes: can statin doses be reduced in some patients?
- DOI: 10.1016/j.atherosclerosis.2024.117927
- Published: 2024-08-01
- Journal:
- Impact factor:
- Authors: Andrew Krentz; Lisa Fournier; Thomas Castiglione; Vasa Curcin; Camil Haldane; Tianyi Liu; Andre Jaun
- Corresponding author: Andre Jaun
457 – On Demand Vs Continuous Use of Proton Pump Inhibitors (PPI) on Symptom Burden and Quality of Life: Results of a Real-World RCT in Primary Care Patients with Gastroesophageal Reflux Disease (GORD)
- DOI: 10.1016/s0016-5085(19)37030-1
- Published: 2019-05-01
- Journal:
- Impact factor:
- Authors: Anna Andreasson; Lars Agréus; Robert Verheij; Ellen Wright; Vasa Curcin; Brendan D. Delaney
- Corresponding author: Brendan D. Delaney
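Similar NSFC grants
A re-examination of the provenance of proto-porcelain
- Grant number: 10875169
- Year approved: 2008
- Funding amount: 380,000 CNY
- Project category: General Program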
Similar overseas grants
Postdoctoral Fellowship: EAR-PF: Petrochronometers as provenance proxies: implications for the spatio-temporal evolution of continental collision to escape
- Grant number: 2305217
- Fiscal year: 2024
- Funding amount: $128,600
- Project category: Fellowship Award
CAREER: Foundational Principles for Harnessing Provenance Analytics for Advanced Enterprise Security
- Grant number: 2339483
- Fiscal year: 2024
- Funding amount: $128,600
- Project category: Continuing Grant
Provenance Analytics Model for Research Software (PARS)
- Grant number: EP/X036383/1
- Fiscal year: 2024
- Funding amount: $128,600
- Project category: Research Grant
CAMO: Counterfeit Attestation MOdule for Electronics Supply Chain Tracking and Provenance
- Grant number: 2341895
- Fiscal year: 2024
- Funding amount: $128,600
- Project category: Standard Grant
I-Corps: Translation potential of using provenance-based threat detection for improving cybersecurity
- Grant number: 2424261
- Fiscal year: 2024
- Funding amount: $128,600
- Project category: Standard Grant