RI: Small: Stochastic Planning and Probabilistic Inference for Factored State and Action Spaces

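Basic information

  • Award number:
    2002393
  • Principal investigator:
  • Amount:
    $177,700
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Period:
    2019-11-01 to 2022-05-31
  • Status:
    Completed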

Project abstract

Many important problems require control of multiple actuators, or agents, in parallel, to achieve a common coordinated goal in a stochastic environment. Examples include scheduling in a building with multiple elevators, managing a team for fire and rescue operations, managing the inventory of a large company, controlling a robotic soccer team, and controlling a robotic team to manage shelving and orders in a warehouse environment. These problems are naturally formulated as discrete-time central-control problems, in which we design an algorithm that decides what action each agent takes at each time step in order to optimize the common objective. The corresponding computational problem, known as stochastic planning, is challenging due to its sheer size. In particular, the number of possible states (for example, possible positions of robots, shelves and merchandise in a warehouse) and the number of possible joint actions (combinations of actions of individual robots) are huge in any problem instance of interest. State-of-the-art approaches typically fail because they require too much time to search properly for a good policy or too much memory to store intermediate values. By viewing stochastic planning through the lens of probabilistic inference, this project proposes several novel domain-independent algorithmic approaches that take advantage of problem structure to calculate approximate solutions effectively under time constraints. The project funds are largely devoted to supporting the training and research of PhD students, and therefore directly support human development in an important, high-impact area for the nation. More concretely, we propose three competing approaches to solving such problems, all taking insight from formulating the finite-horizon control problem as probabilistic inference in a corresponding graphical model, also known as a dynamic Bayesian network.
The first approach uses the idea of Monte Carlo search, but adds a strong symbolic component by introducing aggregate trajectories. Aggregate trajectories are obtained by simulating a compositional symbolic model under independence assumptions over the random variables. Each aggregate trajectory provides a value estimate that is approximate but can replace numerous individual trajectories. In this way we get fast approximation of values and effective control under time constraints. The second approach uses problem structure to translate the inference problem into an integer linear program, where the objective and quality of the solution can be traded off for speed through problem decomposition. A novel construction shows how to sidestep the exponential complexity of the problem and obtain a sequence of integer programs that are both small and decomposable so as to yield effective control under time constraints. The third approach, or more accurately framework, builds on the tight connection between stochastic planning and probabilistic inference in the corresponding dynamic Bayesian network. We show that variants of the first two approaches can be viewed in this light, and through this we propose new inference algorithms for solving the stochastic planning problem. In addition, based on this analysis, we propose new algorithms for probabilistic inference, and new generalized inference questions that go beyond current research on marginal MAP in graphical models.
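
The idea of an aggregate trajectory described above can be illustrated with a minimal sketch. It is not the project's algorithm: the machine-repair domain, the function names `aggregate_rollout` and `best_first_action`, and the failure probability are all assumptions made for illustration. Instead of sampling many concrete states, the sketch propagates the marginal probability of each binary state variable through the transition expression, treating the variables as independent, so that a single pass yields a value estimate for a whole plan.

```python
from typing import List, Sequence


def aggregate_rollout(
    p_running: Sequence[float],
    plan: Sequence[Sequence[float]],
    fail_prob: float = 0.1,
) -> float:
    """Estimate the expected total reward of `plan` with one aggregate trajectory.

    p_running[i] is P(machine i is running) at the start.
    plan[t][i] is 1 if machine i is repaired at step t, else 0.
    """
    marginals: List[float] = list(p_running)
    total_reward = 0.0
    for repair in plan:
        # Hypothetical dynamics: x_i' = repair_i OR (x_i AND NOT fail_i).
        # Under the independence assumption, AND becomes a product and
        # OR becomes 1 - (1 - a) * (1 - b).
        marginals = [
            1.0 - (1.0 - r) * (1.0 - m * (1.0 - fail_prob))
            for r, m in zip(repair, marginals)
        ]
        # Reward is the number of running machines, so its expectation
        # is the sum of the marginals.
        total_reward += sum(marginals)
    return total_reward


def best_first_action(p_running, candidates, tail):
    """Pick the candidate first action with the highest aggregate value."""
    return max(candidates, key=lambda a: aggregate_rollout(p_running, [a, *tail]))


start = [1.0, 0.0]  # machine 0 is running, machine 1 is broken
print(round(aggregate_rollout(start, [[0, 1], [0, 0]]), 2))  # 3.61
print(best_first_action(start, [[1, 0], [0, 1]], tail=[[0, 0]]))  # [0, 1]
```

In this toy domain the machines really are independent, so the aggregate estimate coincides with the true expected reward. In models where state variables interact, the same propagation only approximates the value, which is the trade-off the abstract describes.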

Project outcomes

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Sampling Networks and Aggregate Simulation for Online POMDP Planning
Approximate Inference for Stochastic Planning in Factored Spaces
  • DOI:
    10.48550/arxiv.2203.12139
  • Published:
    2022-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zhennan Wu; R. Khardon
  • Corresponding authors:
    Zhennan Wu; R. Khardon
From Stochastic Planning to Marginal MAP
Stochastic Planning with Lifted Symbolic Trajectory Optimization
  • DOI:
    10.1609/icaps.v29i1.3467
  • Published:
    2019-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hao Cui; Thomas Keller; R. Khardon
  • Corresponding authors:
    Hao Cui; Thomas Keller; R. Khardon
Stochastic Planning and Lifted Inference
  • DOI:
  • Published:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    R. Khardon; S. Sanner
  • Corresponding author:
    S. Sanner

Other grants by Roni Khardon

RI: Small: Approximate Inference for Planning and Reinforcement Learning
  • Award number:
    2246261
  • Fiscal year:
    2023
  • Amount:
    $177,700
  • Award type:
    Standard Grant
III: Small: Algorithms and Theoretical Foundations for Approximate Bayesian Inference in Machine Learning
  • Award number:
    1906694
  • Fiscal year:
    2018
  • Amount:
    $177,700
  • Award type:
    Continuing Grant
III: Small: Algorithms and Theoretical Foundations for Approximate Bayesian Inference in Machine Learning
  • Award number:
    1714440
  • Fiscal year:
    2017
  • Amount:
    $177,700
  • Award type:
    Continuing Grant
RI: Small: Stochastic Planning and Probabilistic Inference for Factored State and Action Spaces
  • Award number:
    1616280
  • Fiscal year:
    2016
  • Amount:
    $177,700
  • Award type:
    Standard Grant
RI: Medium: Collaborative Research: Optimizing Policies for Service Organizations in Complex Structured Domains
  • Award number:
    0964457
  • Fiscal year:
    2010
  • Amount:
    $177,700
  • Award type:
    Continuing Grant
EAGER: First Order Decision Diagrams for Relational Markov Decision Processes
  • Award number:
    0936687
  • Fiscal year:
    2009
  • Amount:
    $177,700
  • Award type:
    Standard Grant
Learning and Reasoning with Relational Structures
  • Award number:
    0099446
  • Fiscal year:
    2001
  • Amount:
    $177,700
  • Award type:
    Continuing Grant

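Similar NSFC grants

Large deviation principles for numerical methods for stochastic differential equations with small noise
  • Award number:
    12201228
  • Approval year:
    2022
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund project
Large deviation principles for numerical methods for stochastic differential equations with small noise
  • Award number:
  • Approval year:
    2022
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund project
Parametric and nonparametric estimation for stochastic processes with small perturbations by α-stable OU noise
  • Award number:
  • Approval year:
    2021
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund project
Parametric and nonparametric estimation for stochastic processes with small perturbations by α-stable OU noise
  • Award number:
    12101004
  • Approval year:
    2021
  • Amount:
    CNY 240,000
  • Award type:
    Young Scientists Fund project
Data-driven stochastic small-signal stability analysis of power systems with a high share of renewable energy, based on the robust generalized short-circuit ratio
  • Award number:
  • Approval year:
    2020
  • Amount:
    CNY 240,000
  • Award type:
    Young Scientists Fund project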

Similar overseas grants

RI: Small: Incremental Sampling-Based Algorithms and Stochastic Optimal Control on Random Graphs
  • Award number:
    1617630
  • Fiscal year:
    2016
  • Amount:
    $177,700
  • Award type:
    Continuing Grant
RI: Small: Stochastic Planning and Probabilistic Inference for Factored State and Action Spaces
  • Award number:
    1616280
  • Fiscal year:
    2016
  • Amount:
    $177,700
  • Award type:
    Standard Grant
RI: Small: Collaborative Research: Stochastic Sampling for Rendering, Imaging, and Modeling
  • Award number:
    1422477
  • Fiscal year:
    2014
  • Amount:
    $177,700
  • Award type:
    Standard Grant
RI: Small: Collaborative Research: Stochastic Sampling for Rendering, Imaging, and Modeling
  • Award number:
    1423082
  • Fiscal year:
    2014
  • Amount:
    $177,700
  • Award type:
    Standard Grant
RI: Small: Efficient Bayesian Learning from Stochastic Gradients
  • Award number:
    1216045
  • Fiscal year:
    2012
  • Amount:
    $177,700
  • Award type:
    Standard Grant