Research Article 2026-04-21 posted v1

Prediction and modeling of air quality factor concentrations based on machine learning in mining cities in China

Xiaolong Li Anhui University of Science and Technology
Zhihui Huang Anhui University of Science and Technology
Jinxiang Yang Anhui University of Science and Technology
Yongsheng Chen Anhui University of Science and Technology
Yu Yun Anhui University of Science and Technology
Jian Lang Anhui University of Science and Technology
Xiang Liu Anhui University of Science and Technology
Shiwen Zhang Anhui University of Science and Technology
Huijun Wu Anhui University of Science and Technology

Abstract

Modeling the concentrations of air quality factors plays a crucial role in predicting and mitigating airborne pollutant levels. This study compiled daily average air quality and meteorological data for Huainan City for 2019. Correlation and redundancy analyses were used to examine the relationships between air quality factors and meteorological variables, and machine learning models implemented in R were used to simulate and predict air quality factor concentrations. The results show that air quality in Huainan City varies temporally, with poorer conditions in winter and better conditions in summer, and spatially, with worse conditions in urban areas than in suburban ones. Spearman correlation and redundancy analyses revealed that PM2.5, PM10, and NO2 are strongly negatively correlated with temperature and positively correlated with air pressure. O3 is strongly positively correlated with temperature, CO is strongly negatively correlated with visibility, and SO2 shows minimal correlation with meteorological factors. A comparison of two machine learning models, random forest (RF) and support vector machine (SVM), showed that the SVM model outperformed the RF model: it achieved a higher R value (R > 0.8) and better prediction accuracy, with predicted values closely matching the actual values. In addition, optimizing the sample size showed that using 70–80% of the data for modeling yields high prediction accuracy, while further increasing the sample size provides minimal improvement. The approach can greatly reduce workload and financial investment, and it offers valuable insights for improving meteorological services and air quality management.
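
The Spearman correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. The study itself reports using R; the version below is a standard-library Python sketch for illustration only and is not the authors' code. The function names (`average_ranks`, `pearson`, `spearman`) are hypothetical, and the temperature and PM2.5 values are made up to show a negative rank correlation; they are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values, NOT measurements from the study.
temperature_c = [2.0, 5.5, 11.0, 18.5, 24.0, 29.5]
pm25_ug_m3 = [95.0, 88.0, 60.0, 47.0, 30.0, 33.0]

rho = spearman(temperature_c, pm25_ug_m3)
print(f"Spearman rho (temperature vs PM2.5): {rho:.3f}")
```

Because Spearman's coefficient works on ranks rather than raw values, it captures monotonic relationships without assuming linearity, which is one plausible reason to prefer it for pollutant and meteorological series.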

Citation Information

@article{xiaolongli2026,
  title={Prediction and modeling of air quality factor concentrations based on machine learning in mining cities in China},
  author={Xiaolong Li and Zhihui Huang and Jinxiang Yang and Yongsheng Chen and Yu Yun and Jian Lang and Xiang Liu and Shiwen Zhang and Huijun Wu},
  journal={Research Square},
  year={2026},
  doi={10.21203/rs.3.rs-8839228/v1}
}