In recent years, people have been paying more and more attention to air quality because it directly affects their health and daily life, and effective air quality prediction has become a hot research topic. However, the task faces many challenges, such as unstable data sources and the variation of pollutant concentrations over time. To address these problems, we propose an improved air quality prediction method based on the LightGBM model to predict the PM2.5 concentration at the 35 air quality monitoring stations in Beijing over the next 24 h. We handle the high-dimensional, large-scale data by employing the LightGBM model, and we take forecasting data as one of the data sources for predicting air quality. By exploring the features of the forecasting data and making full use of the available spatial data, we improve the prediction accuracy. Given the lack of data, we employ a sliding window mechanism to mine high-dimensional temporal features, increasing the training dimensions to millions. We compare the predicted data with the actual data collected at the 35 air quality monitoring stations in Beijing. The experimental results show that the proposed method is superior to other schemes and demonstrate the advantage of integrating the forecasting data and building up high-dimensional statistical features.
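The sliding window idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes each window is summarised by its mean, maximum and minimum; the window sizes, the function name `sliding_window_features` and the sample readings are hypothetical choices for illustration and are not taken from the paper.

```python
from statistics import mean


def sliding_window_features(series, window_sizes=(3, 6)):
    """Build one feature row per time step from trailing values of `series`.

    For each window size w, the row at index t summarises series[t-w:t],
    i.e. the w readings strictly before t, so no future value leaks in.
    """
    longest = max(window_sizes)
    rows = []
    # Start at `longest` so that every window is fully populated.
    for t in range(longest, len(series)):
        row = {"t": t, "target": series[t]}
        for w in window_sizes:
            window = series[t - w:t]
            row[f"mean_{w}"] = mean(window)
            row[f"max_{w}"] = max(window)
            row[f"min_{w}"] = min(window)
        rows.append(row)
    return rows


# Hypothetical hourly PM2.5 readings for one station (illustrative values only).
pm25 = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0, 20.0, 18.0]
features = sliding_window_features(pm25, window_sizes=(2, 4))
```

Each additional window size or statistic adds columns to every row, which is one plausible way a short history of readings can grow into a very wide training table. The resulting rows could then be passed to a gradient boosting model such as LightGBM.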
In recent years, people have begun to pay more and more attention to the impact of the environment on health, and information related to air quality has become a focus of daily life. Existing air quality monitoring instruments, stations and satellite meteorological data can provide real-time air quality monitoring information. However, this is far from sufficient: it is also necessary to predict future trends in air pollutants. Currently, forecast data on weather conditions is highly reliable and accurate.