Assigning bands of the wireless spectrum to users as resources is a common problem in wireless networks. Traditionally, frequency bands were assumed to be stably available. In recent scenarios, however, wireless networks may be deployed in unknown environments where spectrum competition must be considered, so it is uncertain whether a frequency band is available at all, and at what quality. To fully exploit resources with such uncertain availability, the multi-armed bandit (MAB) method, a representative online learning technique, has been applied to the design of spectrum scheduling algorithms. This article surveys such proposals. We cover three aspects: how to model spectrum scheduling problems within the MAB framework, the main line of design that prevalent algorithms follow, and how to evaluate algorithm performance and complexity. We also outline some promising directions for future research in related fields.