Sparse representations for reinforcement learning

Basic information

Grant number:
RGPIN-2018-05721
Principal investigator:
White, Martha
Amount:
$28,400
Host institution:
University of Alberta
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Start and end dates:
2022-01-01 to 2023-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=750527
Keywords:
Sparse representations reinforcement learning

Project abstract

A key component of an artificial intelligence system is the ability to process and learn from a high-dimensional, high-volume sensory stream of information. For example, an agent controlling the pumps in an industrial plant continually receives sensory information about temperatures and energy consumption, and must continually adjust the motor speed in real time to optimize performance. To make such decisions, the agent needs to be able to predict the long-term outcomes of its behaviour. For example, if the industrial agent can predict the long-term temperature of the motor, given the current state of the system, it can use these predictions to improve its decisions and ensure motors are not damaged. Such predictions, however, can be difficult to learn accurately from raw sensory information. Predictions are typically learned as functions of the input sensory information. For example, the prediction of motor temperature in five minutes could be approximated as a polynomial function of the last ten recorded temperatures and motor speeds. Polynomials, however, are only one possible functional form, and not necessarily the best one. Further, to obtain general learning agents, the functional forms should be effective across multiple settings or tasks. This is the goal of representation learning in reinforcement learning: identifying a general mapping from a sequence of raw sensory information to a set of features that facilitates accurate predictions. The goal in my research is to understand, both theoretically and empirically, the properties of effective representations for a reinforcement learning agent learning on a continual stream of sensory information. A part of this challenge is to identify simpler representations for which we can provide optimization guarantees, but that are nonetheless sufficiently powerful to facilitate learning.
Continuing preliminary research, I will explore prototype-based (kernel) representations and a sparse supervised auto-encoder representation. We have already found that, within this class of simpler representations, we can find computational models that provide highly accurate predictions, but are more amenable to theoretical analysis. A core component of this research direction will be to investigate sparsity as a generally useful property of representations, and how we can encode that property into our representation learning algorithms. If successful, this research will have important scientific and societal benefits. This research will contribute to a core endeavour in artificial intelligence: understanding how to develop intelligent agents that can learn in complex environments. This understanding, in turn, will contribute to improving the robustness of automated decision-making systems, which are becoming ubiquitous in our world, including in industrial systems and factories, in self-driving vehicles and even in our homes.
