Exploring Efficient Automated Design Choices for Robust Machine Learning Algorithms

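Basic information

  • Grant number:
    2748823
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Start and end dates:
    2022 to (no data)
  • Project status:
    Not yet concluded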

Project abstract

Are you familiar with machine learning? Do you have an aptitude for analysing and disseminating information from a variety of outcomes? Would you like to help GCHQ develop and design new algorithms that deliver both time-efficient and energy-efficient solutions? Are you keen on developing novel approaches and using modern computing architectures that make it easy to apply Deep Learning and Gaussian Processes to real problems?

Applying Machine Learning (ML) currently requires the data scientist to make design choices. These choices might relate, for example, to choosing the number of layers and the number of neurons in each layer of a Deep Neural Network, or to choosing which kernel family to use in a Gaussian Process. Since ML algorithms often involve time-consuming training regimes, data scientists often find it laborious to iterate between (re-)identifying candidate design choices and (re-)training the ML algorithms. Furthermore, different design choices can alter both how many hyper-parameters (e.g. neuron weights or kernel widths and cross-covariance terms) need to be considered and how challenging it is to optimise the hyper-parameters of the ML algorithm. Since practitioners have limited time to perform sensitivity analyses with respect to these parameters, design choices are typically based on estimated performance (calculated as an average over the test set) with very limited, if any, consideration for the variance in this estimate. It is important that this variance is considered, since it determines how likely it is that performance on the test set will accurately predict empirical performance when the algorithm is deployed operationally. Indeed, robust performance requires that we do not optimise the hyper-parameters (e.g. using stochastic gradient descent) but instead generate a set of samples for the hyper-parameters that are consistent with the data and then average across these sampled values.

Numerical Bayesian algorithms exist that can explore the design choices and the possible hyper-parameter values associated with each design choice. Mature variants of these algorithms involve the use of Markov Chain Monte Carlo (MCMC), with Reversible Jump MCMC (RJMCMC) being a variant applicable in contexts where the design choice alters the number of hyper-parameters that need to be considered. In general, and particularly in the case of RJMCMC, these mature algorithms are sufficiently slow and computationally demanding that they are widely assumed to be impractical in real-world scenarios.

Recent advances at the University of Liverpool have identified that Sequential Monte Carlo (SMC) samplers are an alternative family of numerical Bayesian algorithms that offer the potential to improve on both the time-efficiency and energy-efficiency of MCMC algorithms. In this context, SMC samplers can be considered to comprise a team of sub-algorithms that collaborate to explore the space of design choices and associated hyper-parameters. By distributing the sub-algorithms across parallel computational resources, SMC samplers can improve time-efficiency. Since the sub-algorithms only need to avoid all failing at once, they can each be more adventurous in their exploration than the single MCMC algorithm: this can lead to energy-efficiency gains. Perhaps surprisingly, the potential for SMC samplers to automate design choices, while also exploring the associated hyper-parameter values, is largely unexplored. This PhD will investigate the significant potential to apply SMC samplers in this context.
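
As a rough illustration of the idea of sampling hyper-parameters and averaging over the samples, the following is a minimal sketch of a likelihood-tempered SMC sampler on a toy problem. It is not the project's method: the single hyper-parameter `theta`, the standard normal prior, the Gaussian likelihood, and the function names `log_prior`, `log_likelihood` and `smc_sampler` are all assumptions chosen for illustration, and the toy problem has a fixed number of hyper-parameters, so it does not cover design choices that change the dimension. The particles play the role of the "team of sub-algorithms"; here they are updated in one vectorised NumPy process rather than across parallel hardware.

```python
import numpy as np


def log_prior(theta):
    # Assumed prior: standard normal on one hyper-parameter (up to a constant).
    return -0.5 * theta**2


def log_likelihood(theta, data, noise_std=1.0):
    # Assumed model: data ~ Normal(theta, noise_std^2), evaluated per particle.
    resid = data[None, :] - theta[:, None]
    return -0.5 * np.sum((resid / noise_std) ** 2, axis=1)


def smc_sampler(data, n_particles=2000, n_steps=20, step_size=0.5, seed=0):
    """Move particles from the prior to the posterior via tempered targets."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_particles)   # initial particles from the prior
    log_w = np.zeros(n_particles)              # unnormalised log-weights
    temps = np.linspace(0.0, 1.0, n_steps + 1)
    for t_prev, t_next in zip(temps[:-1], temps[1:]):
        # Reweight: incremental weight is likelihood^(t_next - t_prev).
        log_w += (t_next - t_prev) * log_likelihood(theta, data)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Resample when the effective sample size drops below half.
        if 1.0 / np.sum(w**2) < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=w)
            theta = theta[idx]
            log_w = np.zeros(n_particles)
        # Move: one random-walk Metropolis step per particle, targeting
        # prior * likelihood^t_next, so the weights stay valid.
        proposal = theta + step_size * rng.standard_normal(n_particles)
        log_accept = (log_prior(proposal) + t_next * log_likelihood(proposal, data)
                      - log_prior(theta) - t_next * log_likelihood(theta, data))
        accept = np.log(rng.uniform(size=n_particles)) < log_accept
        theta = np.where(accept, proposal, theta)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return theta, w


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(1.5, 1.0, size=20)
    theta, w = smc_sampler(data)
    # Average over the weighted samples instead of optimising a point estimate.
    print("SMC posterior mean:  ", np.sum(w * theta))
    print("Exact posterior mean:", data.sum() / (len(data) + 1))
```

For this conjugate toy model the exact posterior mean is the sum of the data divided by (n + 1), which gives a simple check on the weighted average of the particles. The per-particle reweight and move steps are independent of one another, which is what would make them candidates for distribution across parallel resources.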

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available

Data last updated: 2024-06-01

Other literature

Metal nanoparticles entrapped in metal matrices.
Gedächtnis und Wissenserwerb [Memory and knowledge acquisition]
  • DOI:
    10.1007/978-3-662-55754-9_2
  • Publication date:
    2019-01-01
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
A Holistic Evaluation of CO2 Equivalent Greenhouse Gas Emissions from Compost Reactors with Aeration and Calcium Superphosphate Addition
  • DOI:
    10.3969/j.issn.1674-764x.2010.02.010
  • Publication date:
    2010-06
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:

Other grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project category:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship

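Similar National Natural Science Foundation of China (NSFC) grants

Construction and application of an efficient individualised drug-screening platform for use after third-generation EGFR-TKI resistance, based on multi-model automated microfluidic chip technology
  • Grant number:
    82304435
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund project
Research on automated system-architecture design methods for high-efficiency photonic DNN accelerators
  • Grant number:
    62202159
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund project
Research on automated system-architecture design methods for high-efficiency photonic DNN accelerators
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund project
Research on efficient automated graph neural network architecture design methods and key technologies
  • Grant number:
    62102177
  • Approval year:
    2021
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund project
Research on efficient automated graph neural network architecture design methods and key technologies
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund project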

Similar overseas grants

Safe and efficient eco-driving using connected and automated vehicles
  • Grant number:
    DP240102189
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Discovery Projects
Promoting Mental Health of Teachers and Caregivers using a Personalized mHealth Toolkit in Uganda
  • Grant number:
    10739227
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Automated High-purity Exosome isolation-based AD diagnostics system (AHEADx)
  • Grant number:
    10738697
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Collaborative Research: SHF: Medium: Automated energy-efficient sensor data winnowing using native analog processing
  • Grant number:
    2212346
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Histology, Biochemistry and Molecular Imaging (HBMI) Core
  • Grant number:
    10232835
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category: