Land use regression (LUR) has been used extensively to capture the spatial distribution of air pollution. However, regional background and non-linear relationships can be difficult to capture with linear approaches. Machine learning approaches have recently been applied to air quality prediction. Using data from a mobile monitoring campaign of fine particulate matter and black carbon in Toronto, Canada, this study investigates the limitations of LUR approaches and the potential of two machine learning models: artificial neural networks (ANN) and gradient boosting. In addition, a moving camera was used to collect real-time traffic data. Models developed for fine particulate matter performed better than those for black carbon. For the same pollutant, machine learning outperformed LUR, indicating that LUR performance could benefit from understanding how explanatory variables are expressed in machine learning models. This study sheds light on the black-box nature of machine learning algorithms by examining how different models capture the relationship between air quality and various predictors.