Pooling INference and COmbining Distributions Exactly: A Bayesian approach (PINCODE)

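Basic information

  • Grant number:
    EP/X028119/1
  • Principal investigator:
  • Amount:
    $659,500
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2023
  • Funding country:
    United Kingdom
  • Duration:
    2023 to (no data)
  • Project status:
    Not yet concluded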

Project abstract

While likelihood-based statistics, and in particular the Bayesian paradigm, represent the gold standard for theoretical underpinnings and uncertainty quantification in many contexts, implementing Bayesian methods can often be challenging, particularly where data is stored on a collection of disjoint servers and cannot be combined. This situation is becoming increasingly common, for instance in the big data context where storing all data on a single server is not technically feasible, or where privacy constraints preclude the sharing of individual shards of data. One application which particularly motivates our work involves inference for rare diseases present in different European countries which cannot share data for confidentiality reasons, but where there is a desire to carry out inference based on the entire data set across all countries.
It is reasonable to assume that we can obtain the posterior distribution based just on the data from any one individual server, and that samples from this distribution (which we term a sub-posterior distribution) can be readily obtained, for example through some kind of MCMC approach. Thus the problem reduces to that of generating samples from the product of densities, each of which we can sample from individually.
This project will develop a novel approach to this problem. Unlike existing techniques, it is not based on approximation or asymptotic justification. In two proof-of-concept papers (Dai, Pollock and Roberts, 2019, 2021), we have developed two closely related approaches: Monte Carlo Fusion (MCF) and Bayesian Fusion (BF). These methods have many promising properties in terms of accuracy and robustness to inconsistency between sub-posteriors. However, these methods are not scalable to very large problems and are restricted to the setting where all sub-posteriors describe distributions on the same data parameter sets. In addition, neither MCF nor BF can be applied directly to data subject to privacy constraints.
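To make the target of the fusion problem concrete, the following is a minimal sketch of the one special case where the product of densities is available in closed form: one-dimensional Gaussian sub-posteriors. It is not MCF or BF, which are designed for the general case where no such closed form exists. The function names and the three example sub-posteriors are hypothetical and chosen only for illustration.

```python
import random


def fuse_gaussians(means, variances):
    """Mean and variance of the normalised product of 1-D Gaussian densities.

    The product of Gaussians is Gaussian: precisions (1 / variance) add,
    and the fused mean is the precision-weighted average of the means.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return fused_mean, 1.0 / total_precision


def sample_fused(means, variances, n, seed=0):
    """Draw n samples from the fused (product) distribution."""
    rng = random.Random(seed)
    mean, var = fuse_gaussians(means, variances)
    return [rng.gauss(mean, var ** 0.5) for _ in range(n)]


# Hypothetical sub-posteriors held on three separate servers.
sub_means = [0.0, 1.0, 2.0]
sub_vars = [1.0, 1.0, 1.0]

fused_mean, fused_var = fuse_gaussians(sub_means, sub_vars)
print(fused_mean, fused_var)
```

With equal variances the fused mean is the plain average of the sub-posterior means, and the fused variance shrinks with the number of servers. For non-Gaussian sub-posteriors this shortcut is unavailable, which is the gap the fusion methods aim to fill.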
The aim of this project is to develop a robust, scalable and accessible fusion methodology suitable for distributed inference in the variety of contexts described above.
Within the privacy context, we shall incorporate the use of homomorphic secret sharing (HSS), a simple, information-theoretically secure multi-party encryption technique that can be used to securely compute arithmetic combinations of summaries from multiple parties, none of which wishes to reveal its own summary to the others. Whilst we cannot use HSS directly to solve the fusion problem under confidentiality constraints (the so-called ConFusion problem), we intend to use the technique in novel ways within the MCF and BF algorithms to solve the problem.
PINCODE will first develop a general framework for fusion methods based on the stochastic simulation of coalescing Markov processes with the property that their common coalesced value comes from the combined posterior distribution. Considerable effort will go into the computational efficiency of these constructions, with an eye to optimising scalability in data size, the number of distributed servers and the dimensionality of the parameter set, as well as robustness to sub-posterior heterogeneity. We shall also consider principled approximations of these algorithms and provide provable accuracy guarantees for these methods. The methodology we consider will be widely applicable. One targeted application will involve close collaboration with the FAIRVASC initiative (a Horizon 2020 project to borrow strength between distributed data sets for rare diseases in different European countries). A further direction will involve the incorporation of fusion methodology to provide exact ABC methods. There will be a strong emphasis within our project on software development for wide-ranging and effective dissemination.
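As an illustration of securely combining summaries, the following is a minimal sketch of additive secret sharing, one simple scheme with the homomorphic property described above. That it matches the HSS construction the project intends to use is an assumption, as are the modulus, the function names and the example summaries. It assumes each summary is a non-negative integer smaller than the modulus.

```python
import secrets

# Hypothetical modulus; all arithmetic on shares is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n_parties additive shares modulo MODULUS.

    Any n_parties - 1 of the shares are uniformly random and reveal
    nothing about the value; all shares together sum to it.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Sum the parties' private values without any party seeing another's."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published partial sums add up to the sum of the private values.
    return sum(partial_sums) % MODULUS


# Hypothetical integer summaries held by three parties.
summaries = [12, 7, 30]
total = secure_sum(summaries)
print(total)
```

Only sums are computed here. As the abstract notes, a primitive of this kind does not solve the fusion problem by itself.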

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Bayesian fusion: scalable unification of distributed statistical analyses
  • DOI:
    10.1093/jrsssb/qkac007
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Dai H
  • Corresponding author:
    Dai H
Methods and applications of PDMP samplers with boundary conditions
  • DOI:
    10.48550/arxiv.2303.08023
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Bierkens Joris
  • Corresponding author:
    Bierkens Joris
Optimal Scaling Results for a Wide Class of Proximal MALA Algorithms
  • DOI:
    10.48550/arxiv.2301.02446
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Crucinio Francesca R.
  • Corresponding author:
    Crucinio Francesca R.
Scaling of Piecewise Deterministic Monte Carlo for Anisotropic Targets
  • DOI:
    10.48550/arxiv.2305.00694
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Bierkens Joris
  • Corresponding author:
    Bierkens Joris
Bayesian inference for high-dimensional discrete-time epidemic models: spatial dynamics of the UK COVID-19 outbreak
  • DOI:
    10.48550/arxiv.2306.07987
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jewell Chris P
  • Corresponding author:
    Jewell Chris P

Other publications by Gareth Roberts

On Bayesian Nonparametrics
  • DOI:
  • Publication year:
    2009
  • Journal:
  • Impact factor:
    0
  • Authors:
    Isadora Antoniano Villalobos; Julyan Arbel; R. Argiento; Eric Barat; Federico Bassetti; Abhishek Bhattacharya; Anirban Bhattacharya; Pier Giovanni Bissiri; N. Bochkina; Eunice Campirán; François Caron; Alessandro Carta; Ismael Castillo; A. Cerquetti; J. Ciera; Enkeleda Cuko; P. Blasi; Maria De Iorio; José C.S. de Miranda; D. Dey; Emanuele Dolera; Chang Dorea; Arnaud Doucet; D. Dunson; O. Dakkak; Michael Escobar; Stefano Favaro; Marian Farah; Giorgio Ferrari; Emily B. Fox; Kassandra M. Fronczyk; Mauro Gasparini; Alan Gelfand; Z. Ghahramani; S. Ghosal; D. Giannikis; Peter Green; Jim Griffin; A. Guglielmi; M. Guindani; G. Hadjicharalambous; Timothy Hanson; Spyridon J. Hatjispyros; Daniel Heinz; Ricardo Henao; G. Hermansen; Amy H. Herring; Nils Lid Hjort; Peter Hoff; Chris C. Holmes; Susan Holmes; Silvano Holzer; Zhaowei Hua; Sam Hui; Rosalba Ignaccolo; D. Imparato; Lancelot F. James; Alejandro Jara; Michael I. Jordan; Arbel Julyan; M. Kalli; G. Karabatsos; Dohyun Kim; Gwangsu Kim; Yong; B. Kleijn; B. Knapik; M. Kolossiatis; W. Kruijer; L. Ladelli; Heng Lian; A. Lijoi; A. Lo; Claudio Macci; S. MacEachern; Andrea Martinelli; Takashi Matsumoto; Karla Medina; Silvia Montagna; Pietro Muliere; Peter Müller; Consuelo Nava; L. Nieto; Mexico Itam; Bernardo Nipoti; Andriy Norets; A. Ongaro; Peter Orbanz; Antonio A. Ortiz Barranon; Kosuke Ota; O. Papaspiliopoulos; G. Peccati; Sonia Petrone; Giovanni Pistone; M. J. Polidoro; Cecilia Prosdocimi; Igor Prünster; Anthony P. Quinn; Fernando A. Quintana; Sandra Ramos; E. Regazzini; Eva Riccomagno; Gareth Roberts; Abel Rodriguez; Carlos E. Rodriguez; Alex Rojas; J. Rousseau; Daniel M. Roy; Matteo Ruggiero; B. Scarpa; B. Shahbaba; Dario Spanò; Mark Steel; Erik B. Sudderth; Matthew A. Taddy; Y. W. Teh; Aleksey Tetenov Collegio; Italy Carlo Alberto; L. Trippa; Stephen G. Walker; A. Wedlin; Sinead Williamson; Fei Xiang; Hao Wu; Oliver Zobay
  • Corresponding author:
    Oliver Zobay
Supporting Women’s Livelihoods at Scale: RCT Evidence from a Nationwide Graduation Program
  • DOI:
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    I. Botea; Markus Goldstein; Corinne Low; Gareth Roberts
  • Corresponding author:
    Gareth Roberts
(In)sensitivity to incoherence in human communication
  • DOI:
  • Publication year:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gareth Roberts; Benjamin Langstein; Bruno Galantucci
  • Corresponding author:
    Bruno Galantucci
Language and the Free-Rider Problem: An Experimental Paradigm
  • DOI:
  • Publication year:
    2008
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gareth Roberts
  • Corresponding author:
    Gareth Roberts

Other grants held by Gareth Roberts

On intelligenCE And Networks - Synergistic research in Bayesian Statistics, Microeconomics and Computer Sciences - OCEAN
  • Grant number:
    EP/Y014650/1
  • Fiscal year:
    2023
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Key factors in the emergence of combinatorial structure: An experimental and computational approach
  • Grant number:
    1946882
  • Fiscal year:
    2020
  • Funding amount:
    $659,500
  • Project type:
    Standard Grant
CoSInES (COmputational Statistical INference for Engineering and Security)
  • Grant number:
    EP/R034710/1
  • Fiscal year:
    2018
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
The FIREsIdE International Collaboration: FIre Radiative powEr validation, Intercomparison & fire emissions Estimation
  • Grant number:
    NE/M017958/1
  • Fiscal year:
    2015
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Intractable Likelihood: New Challenges from Modern Applications (ILike)
  • Grant number:
    EP/K014463/1
  • Fiscal year:
    2013
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
RUI: Investigating Central Configurations in the N-Body and N-Vortex Problems
  • Grant number:
    1211675
  • Fiscal year:
    2012
  • Funding amount:
    $659,500
  • Project type:
    Standard Grant
A longitudinal model for the spread of bovine tuberculosis
  • Grant number:
    BB/I013482/1
  • Fiscal year:
    2011
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
InFER: Likelihood-based Inference for Epidemic Risk
  • Grant number:
    BB/H00811X/1
  • Fiscal year:
    2010
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Inference for Diffusions and Related Processes
  • Grant number:
    EP/G026521/1
  • Fiscal year:
    2009
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
RUI: Questions on Finiteness and Stability in Celestial Mechanics
  • Grant number:
    0708741
  • Fiscal year:
    2007
  • Funding amount:
    $659,500
  • Project type:
    Standard Grant

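Similar grants from the National Natural Science Foundation of China (NSFC)

Safety assessment of long-span structures combining dynamic nested Bayesian networks and heuristic reasoning
  • Grant number:
    52178276
  • Approval year:
    2021
  • Funding amount:
    580,000 CNY
  • Project type:
    General Program
A method for constructing and analysing genome-scale kinetic models combining parallel dynamic isotope labelling experiments and Bayesian inference
  • Grant number:
  • Approval year:
    2020
  • Funding amount:
    580,000 CNY
  • Project type:
    General Program
Research on new, deeply integrable methods for Bayesian network learning and inference on massive high-dimensional data
  • Grant number:
    61502198
  • Approval year:
    2015
  • Funding amount:
    200,000 CNY
  • Project type:
    Young Scientists Fund
Research on combining fuzzy reasoning with the theory of decision-making under uncertainty
  • Grant number:
    11171308
  • Approval year:
    2011
  • Funding amount:
    460,000 CNY
  • Project type:
    General Program
Rule reduction models in fuzzy reasoning and related research
  • Grant number:
    61165014
  • Approval year:
    2011
  • Funding amount:
    500,000 CNY
  • Project type:
    Regional Science Fund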

Similar overseas grants

Pooling INference and COmbining Distributions Exactly: A Bayesian approach (PINCODE)
  • Grant number:
    EP/X027872/1
  • Fiscal year:
    2023
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Pooling INference and COmbining Distributions Exactly: A Bayesian Approach (PINCODE)
  • Grant number:
    EP/X028100/1
  • Fiscal year:
    2023
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Pooling INference and COmbining Distributions Exactly: A Bayesian Approach PINCODE
  • Grant number:
    EP/X028712/1
  • Fiscal year:
    2023
  • Funding amount:
    $659,500
  • Project type:
    Research Grant
Small area estimation, combining data from multiple sources, and inference from non-probability samples
  • Grant number:
    RGPIN-2019-06181
  • Fiscal year:
    2021
  • Funding amount:
    $659,500
  • Project type:
    Discovery Grants Program - Individual
Combining statistical causal inference and structural estimation
  • Grant number:
    20K01597
  • Fiscal year:
    2020
  • Funding amount:
    $659,500
  • Project type:
    Grant-in-Aid for Scientific Research (C)