the Creative Commons Attribution 4.0 License.
Global fields of daily accumulation-mode particle number concentrations using in situ observations, reanalysis data and machine learning
Abstract. Accurate global estimates of accumulation-mode particle number concentrations (N100) are essential for understanding aerosol–cloud interactions and their climate effects, and for improving Earth System Models. However, traditional methods relying on sparse in situ measurements lack comprehensive coverage, and indirect satellite retrievals have limited sensitivity in the relevant size range. To overcome these challenges, we apply two machine learning (ML) techniques, multiple linear regression (MLR) and eXtreme Gradient Boosting (XGB), to generate daily global N100 fields, using in situ measurements as target variables and reanalysis data from the Copernicus Atmosphere Monitoring Service (CAMS) and ERA5 as predictor variables. Cross-validation showed that the ML models captured N100 well in environments that are well represented in the training set, with over 70 % of daily estimates within a factor of 1.5 of observations. However, performance declines in underrepresented regions and conditions, such as clean and remote environments, underscoring the need for more diverse observations. The most important predictors for N100 in the ML models were aerosol-phase sulphate and gas-phase ammonia concentrations, followed by carbon monoxide and sulfur dioxide. Although black carbon and organic matter showed the highest feature importance values, their opposing signs in the MLR model coefficients suggest that their contributions to the N100 estimate largely offset each other. By directly linking estimates to in situ measurements, our ML approach provides valuable insights into the global distribution of N100 and serves as a complementary tool for evaluating Earth System Model outputs and advancing the understanding of aerosol processes and their role in the climate system.
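The abstract reports the share of daily estimates that fall within a factor of 1.5 of observations but does not spell out the calculation. One plausible reading, given here only as a minimal sketch, is that an estimate counts as a hit when the ratio of estimate to observation lies between 1/1.5 and 1.5. The function name `fraction_within_factor` and the sample values are hypothetical and are not taken from the preprint or its data set.

```python
import numpy as np


def fraction_within_factor(observed, estimated, factor=1.5):
    """Share of estimates whose ratio to the paired observation is in [1/factor, factor].

    Assumption: this is one common definition of "within a factor of X";
    the preprint may define the metric differently.
    """
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    # Keep only pairs where both values are finite and strictly positive,
    # since a ratio is not meaningful otherwise.
    valid = (
        np.isfinite(observed)
        & np.isfinite(estimated)
        & (observed > 0)
        & (estimated > 0)
    )
    ratio = estimated[valid] / observed[valid]
    within = (ratio >= 1.0 / factor) & (ratio <= factor)
    return float(within.mean())


# Made-up daily N100 values (cm^-3), for illustration only.
obs = np.array([100.0, 200.0, 400.0, 800.0])
est = np.array([120.0, 350.0, 300.0, 500.0])
share = fraction_within_factor(obs, est)
print(f"{share:.0%} of estimates within a factor of 1.5")
```

Because the criterion is symmetric in the ratio, overestimates and underestimates of the same relative size are treated alike, which suits a quantity such as N100 that spans orders of magnitude between clean and polluted sites.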
Competing interests: Some authors are members of the editorial board of journal AR.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Status: open (until 12 Aug 2025)
Data sets
Daily Averaged Accumulation Mode Particle Number Concentrations (N100) from 35 Stations (2003-2019). A. Ovaska, E. Rauth, D. Holmberg, P. Artaxo, J. Backman, B. Bergmans, D. Collins, M. A. Franco, S. Gani, R. M. Harrison, R. K. Hooda, T. Hussein, A. Hyvärinen, K. Jaars, A. Kristensson, M. Kulmala, L. Laakso, A. Laaksonen, N. Mihalopoulos, C. O'Dowd, J. Ondracek, T. Petäjä, K. Plauškaitė-Šukienė, M. Pöhlker, X. Qi, P. Tunved, V. Vakkari, A. Wiedensohler, K. Puolamäki, T. Nieminen, V.-M. Kerminen, V. A. Sinclair, and P. Paasonen. https://doi.org/10.5281/zenodo.15222674