Multi-Source-Data-Oriented Ensemble Learning Based PM 2.5 Concentration Prediction in Shenyang

dc.contributor.author Qi, Tianfang
dc.contributor.author Jiang, Hongxun
dc.contributor.author Shi, Xiaowen
dc.date.accessioned 2019-01-02T23:51:09Z
dc.date.available 2019-01-02T23:51:09Z
dc.date.issued 2019-01-08
dc.description.abstract Shenyang, which is surrounded by smokestack industries and depends on coal heating in winter, is a typical city in northeastern China that has suffered from serious air pollution, especially PM2.5. Existing machine learning research based on historical air-monitoring data and meteorological data neither forecasts accurately nor identifies the key pollutants contributing to PM2.5. This paper presents a multi-source-data-oriented ensemble learning framework for predicting PM2.5 concentration. The proposed framework incorporates not only air quality data and weather data, but also industrial emission data, especially those of winter heating enterprises, in Shenyang and nearby cities; the model also takes into account the location and emission frequency of pollution sources. All these data are fed into an ensemble learning model based on Extreme Gradient Boosting (XGBoost) to predict PM2.5 concentration, which not only improves prediction accuracy but also provides a contribution analysis of different pollutants. Experimental results show that the top two factors affecting PM2.5 concentration are: (1) air pollutant emission quantities and (2) distance from pollution sources to air-monitoring stations. Based on the importance of these two factors, we refine feature selection and re-train the ensemble learning model, and find that the new model performs better on 72% of the evaluation indexes.
dc.format.extent 10 pages
dc.identifier.doi 10.24251/HICSS.2019.157
dc.identifier.isbn 978-0-9981331-2-6
dc.identifier.uri http://hdl.handle.net/10125/59569
dc.language.iso eng
dc.relation.ispartof Proceedings of the 52nd Hawaii International Conference on System Sciences
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject Decision Support for Smart Cities
dc.subject Decision Analytics, Mobile Services, and Service Science
dc.subject Ensemble Learning
dc.subject Prediction
dc.subject Multi-Source-Data
dc.title Multi-Source-Data-Oriented Ensemble Learning Based PM 2.5 Concentration Prediction in Shenyang
dc.type Conference Paper
dc.type.dcmi Text
Files
Original bundle
Name: 0129.pdf
Size: 558.29 KB
Format: Adobe Portable Document Format