Exploring Efficient Automated Design Choices for Robust Machine Learning Algorithms

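Basic information

  • Grant number:
    2748823
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Start and end dates:
    2022 to no data
  • Project status:
    Not concluded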

Project abstract

Are you familiar with machine learning? Do you have an aptitude for analysing and disseminating information from a variety of outcomes? Would you like to help GCHQ develop and design new algorithms that are both time-efficient and energy-efficient? Are you keen on developing novel approaches and using modern computing architectures that make it easy to apply Deep Learning and Gaussian Processes to real problems?

Applying Machine Learning (ML) currently requires the data scientist to make design choices. These choices might relate, for example, to choosing the number of layers and the number of neurons in each layer of a Deep Neural Network, or to choosing which kernel family to use in a Gaussian Process. Since ML algorithms often involve time-consuming training regimes, data scientists often find it laborious to iterate between (re)-identifying candidate design choices and (re)-training the ML algorithms. Furthermore, different design choices can alter both how many hyper-parameters (e.g. neuron weights or kernel widths and cross-covariance terms) need to be considered and how challenging it is to optimise the hyper-parameters of the ML algorithm. Since practitioners have limited time to perform sensitivity analyses with respect to these parameters, design choices are typically based on estimated performance (calculated as an average over the test set) with very limited, if any, consideration for the variance in this estimate. It is important that this variance is considered, since it determines how likely it is that performance on the test set will accurately predict empirical performance when the algorithm is deployed operationally. Indeed, robust performance requires that we do not optimise the hyper-parameters (e.g. using stochastic gradient descent) but instead generate a set of samples for the hyper-parameters that are consistent with the data and then average across these sampled values.

Numerical Bayesian algorithms exist that can explore the design choices and the possible hyper-parameter values associated with each design choice. Mature variants of these algorithms involve the use of Markov-Chain Monte Carlo (MCMC), with Reversible Jump MCMC (RJMCMC) being a variant applicable in contexts where the design choice alters the number of hyper-parameters that need to be considered. In general, and particularly in the case of RJMCMC, these mature algorithms are sufficiently slow and computationally demanding that they are widely assumed to be impractical in real-world scenarios.

Recent advances at the University of Liverpool have identified that Sequential Monte Carlo (SMC) samplers are an alternative family of numerical Bayesian algorithms that offer the potential to improve on both the time-efficiency and energy-efficiency of MCMC algorithms. In this context, SMC samplers can be considered to comprise a team of sub-algorithms that collaborate to explore the space of design choices and associated hyper-parameters. By distributing the sub-algorithms across parallel computational resources, SMC samplers can improve time-efficiency. Since the sub-algorithms only need to avoid all failing at once, they can each be more adventurous in their exploration than the single MCMC algorithm: this can lead to energy-efficiency gains. Perhaps surprisingly, the potential for SMC samplers to automate design choices, while also exploring the associated hyper-parameter values, is largely unexplored. This PhD will investigate the significant potential to apply SMC samplers in this context.
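
The idea of a population of samplers that is reweighted, resampled and moved, and whose samples are then averaged, can be illustrated with a minimal sketch. This is not the project's algorithm: the toy model (Gaussian data with unknown mean `theta`, unit noise variance and a Gaussian prior), the linear tempering schedule, the single random-walk Metropolis move per step and all names such as `smc_sampler`, `n_particles` and `step_sd` are assumptions chosen purely for illustration. It does not attempt the trans-dimensional case in which the design choice changes the number of hyper-parameters.

```python
import numpy as np

def log_prior(theta, prior_sd):
    # Assumed Gaussian prior N(0, prior_sd^2), up to an additive constant.
    return -0.5 * (theta / prior_sd) ** 2

def log_lik(theta, y):
    # Assumed Gaussian likelihood with unit noise variance, up to a constant.
    # theta has shape (n_particles,), y has shape (n_data,).
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

def smc_sampler(y, n_particles=2000, n_steps=20, step_sd=0.5,
                prior_sd=10.0, seed=0):
    """Tempered SMC sampler: targets prior * likelihood**beta, beta from 0 to 1."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    theta = rng.normal(0.0, prior_sd, size=n_particles)  # particles drawn from the prior
    log_w = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Reweight: ratio of the new tempered target to the previous one.
        log_w += (b - b_prev) * log_lik(theta, y)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Resample when the effective sample size drops below half the population.
        if 1.0 / np.sum(w ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=w)
            theta = theta[idx]
            log_w = np.zeros(n_particles)
        # Move: one random-walk Metropolis step per particle, leaving the
        # current tempered target invariant. Particles move independently,
        # so this step could be distributed across parallel workers.
        prop = theta + step_sd * rng.normal(size=n_particles)
        log_acc = (log_prior(prop, prior_sd) + b * log_lik(prop, y)
                   - log_prior(theta, prior_sd) - b * log_lik(theta, y))
        accept = np.log(rng.uniform(size=n_particles)) < log_acc
        theta = np.where(accept, prop, theta)
    w = np.exp(log_w - log_w.max())
    return theta, w / w.sum()

if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    y = 2.0 + data_rng.normal(size=20)   # synthetic data, true mean 2.0
    theta, w = smc_sampler(y)
    # Average over the weighted samples instead of optimising a point estimate.
    print("posterior mean estimate:", np.sum(w * theta))
```

For this conjugate toy model the exact posterior mean is `y.sum() / (len(y) + 1 / prior_sd**2)`, which gives a simple check on the weighted average returned by the sketch.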

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other literature

Acute sleep deprivation increases inflammation and aggravates heart failure after myocardial infarction.
Ionic Liquids-Polymer of Intrinsic Microporosity (PIMs) Blend Membranes for CO(2) Separation.
  • DOI:
    10.3390/membranes12121262
  • Publication date:
    2022-12-13
  • Journal:
  • Impact factor:
    4.2
  • Authors:
  • Corresponding author:

Other grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project category:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship

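Similar grants from the National Natural Science Foundation of China

Construction and application of an efficient individualized drug screening platform after third-generation EGFR-TKI resistance, based on multi-model automated microfluidic chip technology
  • Grant number:
    82304435
  • Approval year:
    2023
  • Funding amount:
    300,000 CNY
  • Project category:
    Young Scientists Fund
Research on automated system-architecture design methods for high-energy-efficiency photonic DNN accelerators
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    300,000 CNY
  • Project category:
    Young Scientists Fund
Research on efficient automated graph neural network architecture design methods and key technologies
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    300,000 CNY
  • Project category:
    Young Scientists Fund
Research on efficient network resource management and control for 5G industrial automation systems based on data detection and state prediction
  • Grant number:
    62073039
  • Approval year:
    2020
  • Funding amount:
    590,000 CNY
  • Project category:
    General Program
Research on efficient parallel routing methods for ultra-large-scale FPGAs
  • Grant number:
    61802446
  • Approval year:
    2018
  • Funding amount:
    250,000 CNY
  • Project category:
    Young Scientists Fund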

Similar overseas grants

Safe and efficient eco-driving using connected and automated vehicles
  • Grant number:
    DP240102189
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Discovery Projects
Promoting Mental Health of Teachers and Caregivers using a Personalized mHealth Toolkit in Uganda
  • Grant number:
    10739227
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Automated High-purity Exosome isolation-based AD diagnostics system (AHEADx)
  • Grant number:
    10738697
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Collaborative Research: SHF: Medium: Automated energy-efficient sensor data winnowing using native analog processing
  • Grant number:
    2212346
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Histology, Biochemistry and Molecular Imaging (HBMI) Core
  • Grant number:
    10232835
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category: