— When we want to understand the sensitivity of a simulation model with respect to an input value or to optimize an objective function, the gradient usually provides a good hint. The adjoint state method is a widely used numerical method to compute the gradient of a function. It decomposes functions into a sequence of basic operations. It performs a forward sweep to evaluate the function, followed by a backward sweep to calculate the gradient using the chain rule iteratively. One limitation of the adjoint state method is that all intermediate values from the forward sweep are needed by the backward sweep. Usually, we keep only a portion of those values, called checkpoints, in the memory because of limited space. The remaining values are either stored on the hard disk or recomputed from the nearest checkpoint whenever needed. In this work, we seek to compress the intermediate values in order to better utilize limited space in the memory and to speed the I/O when checkpointing to the hard disk.
- 当我们要了解对输入值的敏感性或优化目标函数的敏感性时,梯度通常提供了良好的提示。伴随状态方法是,从向后扫描中需要的所有中间值,我们只将这些值的一部分(称为检查点)保留在存储器中,因为其余值有限,在硬盘上存储。磁盘。