Many problems in natural language processing and computer vision can be framed as structured prediction problems. Structural support vector machines (SVMs) are a popular approach for training structured predictors, where learning is framed as an optimization problem. Most structural SVM solvers alternate between a model update phase and an inference phase (which predicts structures for all training examples). As structures become more complex, inference becomes a bottleneck and thus slows down learning considerably. In this paper, we propose a new learning algorithm for structural SVMs called DEMIDCD that extends the dual coordinate descent approach by decoupling the model update and inference phases into different threads. We take advantage of multicore hardware to parallelize learning with minimal synchronization between the model update and the inference phases. We prove that our algorithm not only converges but also fully utilizes all available processors to speed up learning, and we validate our approach on two real-world NLP problems: part-of-speech tagging and relation extraction. In both cases, our algorithm achieves competitive performance while fully utilizing all available processors. For example, it reaches a relative duality gap of 1% on a POS tagging problem in 192 seconds using 16 threads, while a standard multi-threaded implementation of the dual coordinate descent algorithm with the same number of threads requires more than 600 seconds to reach a solution of the same quality.
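To make the decoupling concrete, the following is a minimal Python sketch of the producer-consumer pattern the abstract describes: inference threads read a shared (possibly stale) model and emit violating structures, while a separate learning thread consumes them and updates the model without ever waiting for inference to finish. Everything here is assumed for illustration, including the toy multiclass task, all names, and the perceptron-style update that stands in for the actual dual coordinate descent step; it is a sketch of the threading pattern, not the paper's implementation.

```python
import threading
import queue
import time
import numpy as np

# Toy stand-in for a structured problem: each "structure" is a single class
# label and inference is an argmax over labels. Only the threading pattern
# mirrors the abstract; the task, names, and update rule are illustrative.
rng = np.random.default_rng(0)
NUM_CLASSES, NUM_FEATURES = 5, 20
X = rng.normal(size=(200, NUM_FEATURES))
Y = rng.integers(0, NUM_CLASSES, size=200)

W = np.zeros((NUM_CLASSES, NUM_FEATURES))  # shared model
active_set = queue.Queue()                 # structures flowing to the learner
stop = threading.Event()

def inference_loop():
    # Inference thread: score examples with the current (possibly stale)
    # model and hand violating structures to the learner. Several of these
    # could run in parallel over shards of the training data.
    while not stop.is_set():
        for x, y in zip(X, Y):
            y_hat = int(np.argmax(W @ x))
            if y_hat != y:
                active_set.put((x, y, y_hat))

def learning_loop(lr=0.01):
    # Learning thread: update the model from structures collected so far,
    # never blocking on inference. A perceptron-style primal update stands
    # in here for the paper's dual coordinate descent step.
    while not stop.is_set():
        try:
            x, y, y_hat = active_set.get(timeout=0.05)
        except queue.Empty:
            continue
        W[y] += lr * x       # strengthen the gold structure
        W[y_hat] -= lr * x   # weaken the violating one

threads = [threading.Thread(target=inference_loop),
           threading.Thread(target=learning_loop)]
for t in threads:
    t.start()
time.sleep(1.0)              # let both phases run concurrently for a moment
stop.set()
for t in threads:
    t.join()
print("training error:", np.mean(np.argmax(X @ W.T, axis=1) != Y))
```

The queue is the only synchronization point in this sketch, which is what allows the learning thread to proceed on whatever structures are available while inference threads keep the remaining cores busy.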