Exploring Efficient Automated Design Choices for Robust Machine Learning Algorithms

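Basic information

Grant number:
2748823
Project lead:
Amount:
--
Host institution:
University of Liverpool
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2022
Funding country:
United Kingdom
Start and end dates:
2022 to (no data)
Project status:
Not yet concluded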

Source:
https://gtr.ukri.org/projects?ref=studentship-2748823
Keywords:
Exploring Efficient Automated Design Choices

Project abstract

Are you familiar with machine learning? Do you have an aptitude for analysing and disseminating information from a variety of outcomes? Would you like to assist GCHQ in developing and designing new algorithms that deliver both time-efficient and energy-efficient solutions? Are you keen on developing novel approaches and using modern computing architectures that make it easy to apply Deep Learning and Gaussian Processes to real problems?

Applying Machine Learning (ML) currently requires the data scientist to make design choices. These choices might relate, for example, to choosing the number of layers and the number of neurons in each layer of a Deep Neural Network, or to choosing which kernel family to use in a Gaussian Process. Since ML algorithms often involve time-consuming training regimes, data scientists often find it laborious to iterate between (re-)identifying candidate design choices and (re-)training the ML algorithms. Furthermore, different design choices can alter both how many hyper-parameters (e.g. neuron weights, or kernel widths and cross-covariance terms) need to be considered and how challenging it is to optimise the hyper-parameters of the ML algorithm. Since practitioners have limited time to perform sensitivity analyses with respect to these parameters, design choices are typically based on estimated performance (calculated as an average over the test set) with very limited, if any, consideration of the variance in this estimate. It is important that this variance is considered, since it determines how likely it is that performance on the test set will accurately predict empirical performance when the algorithm is deployed operationally. Indeed, robust performance requires that we do not optimise the hyper-parameters (e.g. using stochastic gradient descent) but instead generate a set of samples for the hyper-parameters that are consistent with the data and then average across these sampled values.

Numerical Bayesian algorithms exist that can explore the design choices and the possible hyper-parameter values associated with each design choice. Mature variants of these algorithms involve the use of Markov-Chain Monte Carlo (MCMC), with Reversible Jump MCMC (RJMCMC) being a variant applicable in contexts where the design choice alters the number of hyper-parameters that need to be considered. In general, and particularly in the case of RJMCMC, these mature algorithms are sufficiently slow and computationally demanding that they are widely assumed to be impractical in real-world scenarios.

Recent advances at the University of Liverpool have identified that Sequential Monte Carlo (SMC) samplers are an alternative family of numerical Bayesian algorithms that offer the potential to improve on both the time-efficiency and the energy-efficiency of MCMC algorithms. In this context, SMC samplers can be considered to comprise a team of sub-algorithms that collaborate to explore the space of design choices and associated hyper-parameters. By distributing the sub-algorithms across parallel computational resources, SMC samplers can improve time-efficiency. Since the sub-algorithms only need to avoid all failing at once, they can each be more adventurous in their exploration than a single MCMC algorithm: this can lead to energy-efficiency gains. Perhaps surprisingly, the potential for SMC samplers to automate design choices, while also exploring the associated hyper-parameter values, is largely unexplored. This PhD will investigate the significant potential to apply SMC samplers in this context.
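
To make the idea of a "team of sub-algorithms" concrete, the following is a minimal illustrative sketch of a likelihood-tempered SMC sampler that explores a discrete design choice together with one continuous hyper-parameter. It is not the project's method. Everything specific in it is an assumption made for illustration: the three candidate feature maps in `FEATURES`, the toy regression data, the fixed noise level, the Gaussian prior, the tempering schedule, and the function names `log_lik`, `smc_sampler` and `predict`. To keep it short, every design choice has the same number of hyper-parameters, so no trans-dimensional (RJMCMC-style) move is needed.

```python
import numpy as np

# Hypothetical design choices: three one-parameter feature maps (illustrative only).
FEATURES = (lambda x: x, lambda x: x ** 2, lambda x: np.sin(3.0 * x))

def log_lik(m, theta, phi, y, noise_sd):
    """Gaussian log-likelihood (up to a constant) of y = theta * phi_m(x), per particle."""
    resid = y[None, :] - theta[:, None] * phi[m]   # shape (n_particles, n_data)
    return -0.5 * np.sum((resid / noise_sd) ** 2, axis=1)

def smc_sampler(x, y, n_particles=1000, n_temps=40, noise_sd=0.3, prior_sd=3.0, seed=0):
    rng = np.random.default_rng(seed)
    phi = np.stack([f(x) for f in FEATURES])       # shape (n_models, n_data)
    n_models = len(FEATURES)
    # Initialise particles from the prior: uniform over design choices, Gaussian over theta.
    m = rng.integers(0, n_models, size=n_particles)
    theta = rng.normal(0.0, prior_sd, size=n_particles)
    ll = log_lik(m, theta, phi, y, noise_sd)
    # Tempering schedule from prior (beta = 0) to posterior (beta = 1), finer near 0.
    betas = np.linspace(0.0, 1.0, n_temps + 1) ** 4
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # 1. Reweight each particle by the incremental tempered likelihood.
        logw = (b - b_prev) * ll
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # 2. Resample (done at every step here for simplicity).
        idx = rng.choice(n_particles, size=n_particles, p=w)
        m, theta, ll = m[idx], theta[idx], ll[idx]
        # 3a. Metropolis-Hastings random walk on theta, scaled by the particle spread.
        step = 2.0 * theta.std() + 1e-6
        prop = theta + step * rng.normal(size=n_particles)
        ll_prop = log_lik(m, prop, phi, y, noise_sd)
        log_ratio = b * (ll_prop - ll) - 0.5 * (prop ** 2 - theta ** 2) / prior_sd ** 2
        accept = np.log(rng.uniform(size=n_particles)) < log_ratio
        theta = np.where(accept, prop, theta)
        ll = np.where(accept, ll_prop, ll)
        # 3b. Metropolis-Hastings move on the design choice (uniform, symmetric proposal).
        m_prop = rng.integers(0, n_models, size=n_particles)
        ll_prop = log_lik(m_prop, theta, phi, y, noise_sd)
        accept = np.log(rng.uniform(size=n_particles)) < b * (ll_prop - ll)
        m = np.where(accept, m_prop, m)
        ll = np.where(accept, ll_prop, ll)
    return m, theta

def predict(x_new, m, theta):
    """Average predictions over all sampled (design choice, hyper-parameter) pairs."""
    phi_new = np.stack([f(x_new) for f in FEATURES])
    return np.mean(theta[:, None] * phi_new[m], axis=0)

# Toy data (assumed): a quadratic signal with Gaussian noise.
data_rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 40)
y = 1.5 * x ** 2 + 0.3 * data_rng.normal(size=x.size)

m, theta = smc_sampler(x, y)
model_probs = np.bincount(m, minlength=len(FEATURES)) / m.size
print("estimated posterior probability of each design choice:", model_probs)
print("averaged prediction at x = 0.5:", predict(np.array([0.5]), m, theta))
```

In this sketch the particles play the role of the collaborating sub-algorithms: the reweight and move steps act on each particle independently and could be distributed across parallel resources, while resampling is the point at which the particles share information. The final prediction averages over sampled design choices and hyper-parameter values rather than relying on a single optimised value.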
