the Creative Commons Attribution 4.0 License.
A Cross-Correlation Based Method for Determining Size-Resolved Particle Growth Rates
Abstract. The particle growth rate (GR) is a key parameter in aerosol dynamics and plays a crucial role in understanding atmospheric new particle formation and its effects. A fast, robust and reproducible calculation of GRs from aerosol number-size distribution data remains a challenge. In this study, we introduce a new method that we call the maximum correlation method for calculating particle and ion GRs from number-size distributions. We employed this novel method to calculate GRs from Hyytiälä, Finland using 14 years of ion and total particle size distribution data and compared our results against previous studies that used conventional methods for calculating the GRs. We found that our method compares well against the published data and reproduces the seasonal variability and size-dependent trends in the GRs. The maximum correlation method enables fast and repeatable GR calculations from large aerosol datasets, which facilitates the systematic incorporation of GR analysis into new particle formation studies.
Competing interests: Markku Kulmala is a member of the editorial board of Aerosol Research.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Status: open (until 04 Aug 2025)
RC1: 'Comment on ar-2025-19', Anonymous Referee #1, 12 Jul 2025
This manuscript by Lampilahti et al. presents a new method to calculate the particle growth rate (GR) during new particle formation (NPF). The particle GR is a key metric for understanding gas-to-particle conversion during NPF, and its calculation from the particle number size distribution (PNSD) is still a considerable source of uncertainty when mechanistic insights into NPF are inferred from ambient aerosol data. This has two reasons: 1) considerable instrumental uncertainties in the sub-10 nm size range, and 2) the use of GR calculation methods that depend strongly on the user or are sensitive to the measurement uncertainties. In that sense, this manuscript presents a considerable step forward for the community working on analyzing NPF, as no fully automated method to infer the GR from the PNSD has been presented so far. The introduced maximum cross-correlation method is shown to be robust compared to the other methods when comparing the results for Hyytiälä, Finland. However, the manuscript in its current form has some shortcomings, which should be addressed so that this paper encourages the community to use the new method in the future. I can therefore only recommend publication in AR after the following points have been addressed:
Major comments:
- The authors show in this manuscript that the new maximum cross-correlation method performs similarly (providing the same medians and variance) to other methods when applied to a dataset from Hyytiälä, Finland. What is missing from this manuscript (and what should be part of the introduction of any new method) is a test of its performance against data where the true GR is known! The central point of this manuscript should not only be to show that the new approach probably reproduces the same errors as the other methods, but also that it can retrieve the actual GR of an NPF event. Inclusion of a synthesized dataset (with instrumental uncertainties imposed) and the application of the new method to it should be the number 1 point of the results. I am 100% confident that this group of authors has access to such synthesized NPF event data, so including it should not be too much additional work.
- The second major point relates to the fact that the authors analyze 14 years of Hyytiälä data, but then do not make use of the already existing analyses of these datasets. As far as I know, already analyzed Hyytiälä NPF event classification and GR data should be available to that set of authors for the same 14 years they now present as being analyzed by the maximum cross-correlation method. It is unclear to me why Figure 5 only contains randomly selected days where the GR is re-analyzed with the maximum cross-correlation method, while the entire GR dataset for 14 years should be available. There are two specific things I’d like to see in a revised manuscript: 1) Correlation plots of the full 14-year dataset for all instruments (ion-GR, particle-GR NAIS, particle-GR DMPS) between all GRs calculated from the maximum cross-correlation method and GRs calculated from other methods, whenever both methods returned a value. 2) The background/signal differentiation histograms when only data from manually classified NPF events (or from the ranking method or whatever) are used. Does the background population completely vanish if we only use values from “classified” NPF events? (This should be an Appendix Figure.)
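The test against a known GR requested in the first major comment could be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that the method estimates, for two size channels, the time lag at which their concentration time series correlate best, and converts that lag into a growth rate. All function names, the Gaussian pulse shape, the noise level, and the chosen diameters are hypothetical.

```python
import numpy as np

def lag_of_max_correlation(n_small, n_large, dt_hours, max_lag_steps):
    """Lag (hours) at which the larger channel correlates best with the smaller one."""
    best_lag, best_r = 0, -np.inf
    for k in range(max_lag_steps + 1):
        # Pair n_small[t] with n_large[t + k] on the overlapping part only.
        a = n_small[: n_small.size - k]
        b = n_large[k:]
        r = np.corrcoef(a, b)[0, 1]  # Pearson r ignores absolute concentration scale
        if r > best_r:
            best_lag, best_r = k, r
    return best_lag * dt_hours

def synthetic_channels(true_gr, d_small, d_large, steps_per_hour=6,
                       hours=24, noise=0.02, seed=0):
    """Two noisy Gaussian pulses separated by the known growth time (assumed shape)."""
    rng = np.random.default_rng(seed)
    t = np.arange(hours * steps_per_hour) / steps_per_hour  # time in hours
    t_small = 9.0                                   # pulse arrival at d_small
    t_large = t_small + (d_large - d_small) / true_gr
    n_small = np.exp(-0.5 * (t - t_small) ** 2) + noise * rng.standard_normal(t.size)
    n_large = np.exp(-0.5 * (t - t_large) ** 2) + noise * rng.standard_normal(t.size)
    return n_small, n_large

true_gr = 2.0                    # nm/h, imposed on the synthetic event
d_small, d_large = 3.0, 7.0      # nm, hypothetical channel diameters
steps_per_hour = 6               # 10-minute time resolution (assumption)

n_small, n_large = synthetic_channels(true_gr, d_small, d_large, steps_per_hour)
tau = lag_of_max_correlation(n_small, n_large, 1 / steps_per_hour,
                             max_lag_steps=8 * steps_per_hour)
estimated_gr = (d_large - d_small) / tau if tau > 0 else float("nan")
print(f"true GR = {true_gr} nm/h, estimated GR = {estimated_gr:.2f} nm/h")
```

Because the lag is only resolved to one time step, the retrieved GR is quantized; in this setup a one-step error in the lag already shifts the estimate by roughly 0.15 to 0.2 nm/h, which is the kind of sensitivity a synthetic test would expose.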
Minor comments:
- Line 35-42: Stolzenburg et al., 2023, Rev. Mod. Phys. is the most recent and complete review on particle growth and also compares different methods with each other. It should be included here.
- Line 37: In the above-mentioned review, there is a long discussion about a third approach that uses the full evolution of the PNSD combined with the general dynamic equation. These methods do disentangle the different contributions to GR but suffer from other challenges. They should be mentioned here, because especially those methods would have the chance to also be fully automated (in an ideal world).
- Line 63: What is the difference between these methods and the one presented in Lehtipalo et al. (2014)? This should be clarified here.
- Line 67: I would also add that one of the advantages of the size channel based methods is that they do not require perfect knowledge on the absolute concentrations of the PNSD (i.e. the inversion correctness). In fact, they sometimes can even be run on raw data. The same applies to the new maximum cross-correlation method and should be mentioned here.
- Line 100-101: I’d prefer N_1 (with bar above) for the mean notation, as this is far more common.
- Line 109: To judge the robustness of the method, it would also be interesting how the results change when a different averaging is applied. Especially if the method is transferred to other environments, this parameter might need to change. In addition, it is known that rolling averages can skew the results of the appearance time method, so it would be interesting to see what changes when a static, time resolution reduction (i.e. block averages) is used.
- Line 123-131: It is not fully clear to me how the approach in creating more size increments is facilitated. Is the PNSD first inverted for all data and then somehow resized? I.e. how many original channels are e.g. in that window between 2-3 nm and how many increments are later used? I.e. do you obtain a higher size-resolution than the original data with that approach?
- Line 133: In my opinion the method should be called maximum cross-correlation method throughout the manuscript as maximum correlation method could be misleading/ doesn’t describe exactly what the method does.
- Line 147-149: That sentence seems to be broken. “To ensure” what?
- Line 149-151: Again, how many original size channels are in these ranges?
- Line 173-178 and 183-184: As said in the major comment: Here should be a comparison to the classical NPF event day characterization.
- Line 189-192: Again, how many original size channels are in these ranges?
- Line 192-194: Probably the right argument. But assuming these fast GRs are really there, would it make sense to then probably just tolerate e.g. one of the tau_max,i to be below zero to still capture some of these?
- Line 211-214: As the new method can calculate GR on more days, it would be very interesting to see how this corresponds to an event classification scheme (major comment 2).
- Line 216-218, Figure 5: It is important to see this comparison across different instruments and also probably different methods. Moreover, as said in major comment 2, the authors should make use of the full Hyytiälä datasets available to them.
- Line 240-241: Again Stolzenburg et al. (2023) probably provides the to-date most complete overview of GR datasets and should be referenced here.
- Line 245-250: Feels a bit off here, as this more or less is Figure 8, but Figure 6 is not yet discussed here. Should that be later in the manuscript?
- Line 253-255, Figure 6: My honest opinion: This Figure is not very interesting as it doesn’t provide any new insights (except that the new method can perform across different seasons). If the authors want to save space as they need to include the comparisons with synthesized data or the full Hyytiälä dataset, they can remove it. In addition, the caption is too short and should explain more what is in that Figure.
- Line 260-268, Figure 7: I would love to see the median days also for the “edges” of the GR distribution, i.e. the very fast and very slow growth cases, as these might be the most interesting, where deviations from the classical “banana-type” picture might appear.
- Line 296-297: Why only ion GR from Gonzalez-Carracedo? The DMA train data are especially useful in the sub 3 nm range, where other instruments often perform worse. This comparison should be shown, as this is one of the cases where the method might reach its limits (I don’t think so, but it needs to be included).
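The distinction drawn in the comment on line 109 between a rolling average and a static block average can be illustrated with a short sketch. The function names and the toy series are hypothetical; the point is only that a rolling mean keeps the original time step while overlapping windows make neighbouring values dependent, whereas block averaging reduces the time resolution and keeps the averaged values independent.

```python
import numpy as np

def rolling_mean(x, window):
    """Moving average over `window` samples; returns only fully covered positions."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def block_mean(x, window):
    """Average consecutive, non-overlapping blocks of `window` samples."""
    n_blocks = x.size // window
    return x[: n_blocks * window].reshape(n_blocks, window).mean(axis=1)

x = np.arange(12.0)            # toy concentration series with 12 time steps
rolled = rolling_mean(x, 3)    # 10 values, one per original time step
blocked = block_mean(x, 3)     # 4 values, one per 3-step block

print(rolled.size, blocked.size)
print(np.allclose(rolled[::3], blocked))
```

In this sketch the block means are simply every third rolling-mean value, so repeating a GR calculation on both versions would separate the effect of smoothing from the effect of overlapping windows.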
Citation: https://doi.org/10.5194/ar-2025-19-RC1
RC2: 'Comment on ar-2025-19', Anonymous Referee #2, 19 Jul 2025
The work of Lampilahti et al. introduces a novel, fully automatic method for calculating particle growth rates (GRs) during new particle formation (NPF) events, using a maximum correlation approach applied to particle number size distribution data (both ions and particles). The accurate determination of GRs is essential for understanding NPF mechanisms and for quantifying the influence of NPF on climate, particularly in relation to the formation of CCN. Automating this calculation is a crucial, but challenging step and I consider the current manuscript a valuable addition to the NPF community. However, in its current form, the manuscript presents a few methodological and conceptual issues that need to be addressed to ensure the robustness, transparency, and comparability of the proposed method. I therefore recommend the manuscript for publication in AR after the authors have adequately addressed the major and minor comments outlined below.
Major comment:
The GRs calculated with the maximum correlation method are compared with GRs calculated with other methods, for the corresponding time periods, in Hyytiälä. However, the various studies used for comparison have utilized an event classification algorithm (either manual or automatic) before calculating the GRs. These are usually calculated only for clear/strong (Class I) NPF events but sometimes even for Class II events (Manninen et al., 2009). The new method seems to include days not typically classified as clear events (maybe sometimes undefined but also non-events due to quiet NPF). While I agree that most “Signal Days” are indeed clear NPF events, I have concerns about the comparability of the results, because of lack of classification prior to GR calculation. The absence of a pre-classification step introduces uncertainty; for example, your method may exclude days previously identified as Class I events or include days that would not typically qualify. What was the mean clear event (Class I) frequency in Hyytiälä during the periods of interest? Is it comparable with the signal % you find in Fig. 4? How many days were previously classified as Class I but were excluded by your method (and vice versa)?
Furthermore, as pointed out in lines 167-172 (and elsewhere), the early morning/late evening concentrations leading to erroneous and small GRs (background) seem to be a small weakness of the method (especially if one wants to use it in environments with increased local sources etc.), which is of course expected for automatic methods. However, my greatest concern is the possibility of getting GRs that appear reasonable, but for the wrong reasons, for example by having similar concentration and diameter increases from different sources (and not NPF), which can “trick” the method into calculating a “fake” GR higher than β, resulting in “signal” and not in “background”. Figure 7 helps illustrate that most of these days are NPF, but because it is averaged and normalized, days that do not follow this behavior can be averaged out. Have you observed such discrepancies in Hyytiälä?
General comment:
- I suggest that the “Results” section should be reorganized into clearly defined subsections. At present, the narrative lacks coherence, and transitions between paragraphs are abrupt, making the analysis difficult to follow. For instance, the paragraph starting on line 260 appears disconnected from the preceding discussion and could logically belong to a separate subsection.
Minor comments:
- Lines 54-67: I would suggest commenting a bit more about how these existing methods compare with each other. When each method is preferred (for example type of data, available sizes etc.).
- Line 63: A related approach to what? Related to the size channel-based methods? Please clarify.
- Lines 70-72: Please elaborate a little bit more. GR calculation and especially fitting-based methods are extremely sensitive to concentration spikes (e.g. from local sources), noisy data and even meteorological conditions that can alter the number concentration for a while.
- Lines 73-74: Additionally, there is not a “universal” GR calculation method, and some (older) studies do not even mention how GR was calculated exactly.
- Lines 96-97: What was the temporal resolution of your dataset? It would help to specify it here.
- Lines 109-110: Isn’t that a long averaging time for such a dynamic phenomenon as the growth rate? I get that in Hyytiälä the GRs are generally small compared to other environments, but still, when calculating it for the size range of 2-3 nm (or even 3-7 nm) as you do later, major changes can happen even within 1 hour. Also, I’m certain that during the 14 years there should be NPF days with high GRs. Is the method/averaging/smoothing time-sensitive enough to catch these dynamic changes and calculate a trustworthy GR? What happens if you have more than one NPF event on the same day?
- Figure 1: What is the data time resolution you use in general with this method? From Fig 1a it seems it’s 1 hour. However, you have said that you use 3 hour rolling mean smoothing to run the method. Why not apply it in these graphs?
- Line 167-168: How did the local minimum, β, vary between the different diameter classes? In other words, what was the minimum growth rate above which you considered a day to have an NPF event (signal)?
- Line 171-172: I understand that, but it is not very clear to me why this happens. Please elaborate on this.
- Line 177: Please replace “of” with “or”.
- Line 181: How many days were the outliers? What was different in these days? Increased local sources?
- Fig A1 and A3 and A5 (and also lines 175-178): It is related to my major comment. You say that “the background days mostly days that would be classified as non-event or undefined days”. I agree, but also the signal days contain days that would not be necessarily classified as “Class I” days (which are the class typically used for GR calculations in most studies). In fact, from Figs A1, A3, A5 I see days that I would classify as non-events (Fig. A1 middle, and second line to the left) or undefined (Fig. A3 in the middle, and the one bottom right) (and also some are close to Class II). Even if a growth pattern exists in these cases (which is not clear in this color scale) how do these GRs compare to the actual NPF observations of the previous studies that use only Class I (and sometimes also Class II) events to calculate the GR?
- Lines 189-196: How did you decide the 0.1 log difference for the size increments? Was it a trial-error approach? With this approach, how many high-GR days do you think were discarded because of τmax=0? Could lowering the data averaging time help for these days?
- Lines 212-214: I suspect that the higher percentages of your method are mostly because of the so-called “quiet NPF” (Kulmala et al., 2022). Since you normalize the number concentrations, the algorithm detects growth patterns not detected by previous manual methods. Is that correct? Can you elaborate on that? If this is true, this can actually be a good feature about this method (meaning to calculate GR of days that would be otherwise not included manually).
- Lines 215-219: As Reviewer #1 pointed out, I would like to see a more comprehensive comparison of the new method with previous calculations of the GRs in Hyytiälä. Maybe you could utilize a GR database and include more days in Fig. 5.
- Lines 245-246 and 249-250: Why is that? Please elaborate.
- Figure 6: This Figure is not adequately described in both the main text and the figure’s caption. Furthermore, it is not connected to the previous analyses of the paper, and I believe it is of small value in the current paper (at least in the main text) since it focuses on the GR calculation method and not its general variability in the atmosphere of Hyytiälä. Maybe you could move it to the Supplement or try to connect it better to the rest of the text.
- Lines 260-268: I’m not sure what the conclusion of this paragraph is. That the days included automatically by the method indeed represent NPF days? Why don’t you first use an automatic classification method (Aliaga et al., 2023) to make sure that you have NPF events only? Because Fig. 6 is the median normalized diurnal, the days that are not typical NPF events but are included in the GR calculation are averaged out.
- Lines 267-268: Isn’t that self-evident?
- Lines 275-303: Relates to the major comment. The studies you are comparing with have used only the clear NPF events (usually classified as Class I, but some also use Class II) to calculate the GRs, so their averages may include fewer days than the ones included in the new method. I think it is important to clarify this here and elsewhere in the text. Additionally, in Figure 8, consider annotating each bar with the number of days included in the corresponding average (maybe above the bars).
- Lines 311-312: This statement can be misleading, since the method was tested and validated only for Hyytiälä and no other environments yet. Please rephrase.
- Lines 339-342: That is a good idea, to combine the method with the automatic classification. From my point of view, the classification of the events should always come before the GR calculation, which is something the authors do not do directly. Have the authors tried this combination? It could increase the efficiency and the comparability of the method, since days not typically classified as NPF will not be included in the GR calculation.
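The signal/background separation by a local minimum β that several comments above refer to can be pictured with the following sketch. How the authors actually locate β is not described in these comments, so the approach here (the lowest histogram bin between the two most populated modes of log10 GR) is an assumption, and the two synthetic GR populations and all names are invented for illustration.

```python
import numpy as np

def local_minimum_threshold(log_gr, bins):
    """Centre of the least populated bin between the two strongest histogram modes."""
    counts, edges = np.histogram(log_gr, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Interior bins that are higher than the left neighbour and not lower than the right.
    is_peak = (counts[1:-1] > counts[:-2]) & (counts[1:-1] >= counts[2:])
    peaks = np.flatnonzero(is_peak) + 1
    if peaks.size < 2:
        raise ValueError("histogram is not bimodal; no local minimum to use")
    # Keep the two highest peaks, ordered by position.
    lo, hi = np.sort(peaks[np.argsort(counts[peaks])[-2:]])
    valley = lo + np.argmin(counts[lo:hi + 1])
    return centers[valley]

rng = np.random.default_rng(1)
# Invented populations: slow "background" values and faster "signal" values (log10 nm/h).
background = rng.normal(-0.5, 0.15, 2000)
signal = rng.normal(0.5, 0.15, 2000)
log_gr = np.concatenate([background, signal])

log_beta = local_minimum_threshold(log_gr, bins=np.linspace(-1.5, 1.5, 31))
beta = 10 ** log_beta                       # threshold in nm/h
signal_fraction = np.mean(log_gr > log_beta)
print(f"beta = {beta:.2f} nm/h, signal fraction = {signal_fraction:.2f}")
```

A threshold found this way depends on the bin width and on how well the two populations separate, which is one reason why reporting β per diameter class, as requested above, would help readers judge the classification.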
References
Aliaga, D., Tuovinen, S., Zhang, T., Lampilahti, J., Li, X., Ahonen, L., Kokkonen, T., Nieminen, T., Hakala, S., Paasonen, P., Bianchi, F., Worsnop, D., Kerminen, V.-M., and Kulmala, M.: Nanoparticle ranking analysis: determining new particle formation (NPF) event occurrence and intensity based on the concentration spectrum of formed (sub-5 nm) particles, Aerosol Research, 1, 81–92, https://doi.org/10.5194/ar-1-81-2023, 2023.
Kulmala, M., Junninen, H., Dada, L., Salma, I., Weidinger, T., Thén, W., Vörösmarty, M., Komsaare, K., Stolzenburg, D., Cai, R., Yan, C., Li, X., Deng, C., Jiang, J., Petäjä, T., Nieminen, T., and Kerminen, V. M.: Quiet New Particle Formation in the Atmosphere, Frontiers in Environmental Science, 10, https://doi.org/10.3389/fenvs.2022.912385, 2022.
Manninen, H. E., Nieminen, T., Riipinen, I., Yli-Juuti, T., Gagné, S., Asmi, E., Aalto, P. P., Petäjä, T., Kerminen, V.-M., and Kulmala, M.: Charged and total particle formation and growth rates during EUCAARI 2007 campaign in Hyytiälä, Atmospheric Chemistry and Physics, 9, 4077–4089, https://doi.org/10.5194/acp-9-4077-2009, 2009.
Citation: https://doi.org/10.5194/ar-2025-19-RC2
Data sets
Dataset for "A Cross-Correlation-Based Method for Determining Size-Resolved Particle Growth Rates" Janne Lampilahti, Pauli Paasonen, Santeri Tuovinen, Katrianne Lehtipalo, Veli-Matti Kerminen, Markku Kulmala https://doi.org/10.5281/zenodo.15648698
Viewed
- HTML: 112
- PDF: 26
- XML: 17
- Total: 155
- BibTeX: 13
- EndNote: 19