Journal of Atmospheric and Environmental Optics ›› 2023, Vol. 18 ›› Issue (3): 258-268.

Improving the accuracy of NO2 concentrations derived from remote sensing using localized factors based on random forest algorithm

FU Miao   

  1. School of Economics and Trade, Guangdong University of Foreign Studies
  • Received: 2022-01-11 Revised: 2022-02-28 Online: 2023-05-28 Published: 2023-05-28
  • Contact: Miao Fu E-mail: cnfm@163.com
  • Supported by:
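    Humanities and Social Sciences Research Planning Fund of the Ministry of Education (17YJA790021), Natural Science Foundation of Guangdong Province, General Program (2017A030313439), Special Fund of the Guangzhou International Commerce and Trade Center Research Base (JDZB202108)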

Abstract: NO2 is a major air pollutant that damages human health and the ecological environment. Starting from NASA's NO2 concentrations retrieved from Aura OMI, this work improves the prediction accuracy of NO2 concentration using three models: the random forest algorithm, geographically weighted regression (GWR), and multi-scale GWR. Localized data on economy, population, road network and slope within 8 km of each sampling point, together with point values of meteorology, vegetation and elevation, serve as correction variables in the models. The three models raise the cross-validation R2 of NASA's concentrations from the original 0.48 to 0.74, 0.71 and 0.70, respectively. The random forest algorithm is the most accurate of the three, with a root mean square error (RMSE) of 6.4 μg/m3 and a mean absolute error (MAE) of 4.98 μg/m3, and it runs much faster than multi-scale GWR. Its accuracy is also higher than that of most existing studies of similar extent. For the NO2 concentration correction, the localized factors of economy, population and road network contribute at least 11.24%. A map of estimated NO2 concentrations for county-level cities in China, based on the random forest algorithm, is also presented.
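
For readers unfamiliar with the three accuracy measures quoted in the abstract, the following is a minimal sketch of how R2, RMSE and MAE are commonly computed from paired observed and predicted concentrations. It assumes the usual coefficient-of-determination definition of R2; the paper may use a different convention (for example, the squared correlation). The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

def mae(observed, predicted):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Made-up illustrative concentrations in ug/m3, NOT data from the paper.
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 43.0, 47.0]

print("R2  :", r2_score(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("MAE :", mae(observed, predicted))
```

In a cross-validation setting, the predicted values would come from models fitted on folds that exclude the corresponding observations, so the scores reflect out-of-sample accuracy.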

Key words: NO2 concentrations, localized factors, random forest algorithm, geographically weighted regression, multi-scale geographically weighted regression
