Introduction

Particulate matter with an aerodynamic diameter of less than 2.5 µm (PM2.5) has been widely studied for its impacts on human health. The 2015 Global Burden of Disease Study estimates that exposure to outdoor fine particulate matter is the fifth leading risk factor of death worldwide, accounting for 4.2 million deaths and 103.1 million disability-adjusted life years (Cohen et al. 2017). In Europe, although air pollution levels have generally decreased over the last decades, especially in the large urban and industrial centers, they still exceed the World Health Organization (WHO) health guideline (annual mean PM2.5 concentrations of 5 µg/m3) (WHO 2021). Particular sub-components of PM2.5, such as carbonaceous particles (e.g., black carbon: BC, organic aerosols: OA), have received great attention in the scientific community. These are mostly emitted during the incomplete combustion of organic material, for example, from biomass burning and diesel exhaust (Raga et al. 2018). There is a growing evidence from toxicological studies that these carbonaceous particles are particularly hazardous to human health and are associated with cardiovascular and respiratory outcomes (including asthma, congestive heart disease, and lung cancer) (Bates et al. 2019). A recent epidemiological study by Pond et al. (2022) has indicated ten times greater cardiopulmonary mortality association for elemental carbon (EC) than for total PM2.5 and larger cardiopulmonary mortality risk for secondary organic aerosols and vehicle sources within the US National Health Interview Survey cohort compared to other components and sources. On the other hand, low or insignificant cardiopulmonary mortality associations were found for primary organic aerosols (Pond et al. 2022, Pye et al. 2022). These PM2.5 sub-components likely cause disproportional harm compared to fine inorganic particles. Despite these indications, carbonaceous aerosols are not the focus of policymakers due to inconsistencies of component-specific toxicities (Pond et al. 2022).

Air quality models are useful for air quality studies and air pollution health impact studies. They are potentially the most powerful tools to capture the complexity of atmospheric pollution life cycles and interactions and are able to support emission reduction strategies (Thunis et al. 2022). The Weather Research and Forecasting Model coupled with online chemistry (WRF-Chem) is a widely used air quality model, with several user-defined options, suitable for global and regional air quality modeling. Compared to offline models, WRF-Chem has the advantage of a tight coupling between meteorology and species transport, as well as the consistency between chemical and physical parameterizations used in radiation, planetary boundary layer (PBL) dynamics, mixing, convection, and photolysis processes (Ahmadov et al. 2012). The accuracy of air quality modeling is highly dependent on the input data (including emissions) and the representation of meteorology. Temperature, humidity, wind speed, and direction can influence the boundary layer stability, cloud/fog formation, and radiation, all of which affect the transport, dispersion, chemistry, and photochemistry (Gilliam et al. 2015). Several nudging techniques (e.g., spectral and grid-analysis nudging) have been introduced to regional climate models to prevent them from drifting from actual meteorological conditions. Grid analysis nudging uses a relaxation term to adjust the model predictions at individual grid points with the same strength and is applied for retrospective meteorological modeling for air quality applications (Liu et al., 2018). Nudging has been shown to improve PM2.5 simulation (Jeon et al. 2015). Meteorological grid-analysis nudging is generally more appropriate than spectral nudging, but nudging coefficients should be chosen to adjust the strength of the nudging force in the governing equations (Ma et al. 2016).

PM2.5 modeling has several challenges, especially regarding organic species, which contribute a large part of PM2.5 mass (20–90%) (Kanakidou et al. 2005). Organic aerosols have a complex chemical composition that varies depending on the emission source and chemical processes. Therefore, the accuracy of emission inventories and the representation of the chemical reactions are key aspects. WRF-Chem has been developed to reproduce many known chemical reactions effectively; however, the final output depends on several features, including the choice of chemical and physical parameterizations.

Organic aerosols are derived from multiple sources, both primary (POA) (i.e., directly emitted from traffic or combustion sources) and secondary organic aerosols (SOA) formed through gas-phase oxidation reactions of volatile organic compound (VOC) precursors (Baltensperger et al., 2005). Their complex and diverse formation pathways are not yet fully known, thus their representation in models is incomplete. Several studies report that organic aerosols are often underestimated by air quality models possibly due to missing key precursors (e.g., intermediate and semi-volatile organic carbons: IVOCs and SVOCs) and uncertainties in emissions especially for specific sources (e.g., residential combustion) (Tuccella et al., 2015; Kong et al., 2015; Berger et al., 2016). The emissions of OA precursors have been shown to be uncertain due to their spatial and temporal variability, the sectoral attribution of emission factors, and inconsistencies in measurement or reporting techniques used to determine the emission factors (Borbon et al., 2013, Kanakidou et al., 2005). Particularly, the emission factors from residential wood combustion, one of the most important sources of organic aerosols in Europe, (Raga et al., 2018; Lelieveld et al., 2020; Paunu et al., 2021) are not quantified in a harmonized way among the EU countries (Denier Van Der Gon et al., 2015). This is because of challenges in obtaining accurate activity data and fuel statistics (e.g., for wood), especially from rural regions, which leads to missing information and underestimated emissions. Another factor contributing to the uncertainty in emission inventories is the sampling conditions, such as the temperature, humidity, and dilution ratio during their test sampling which can affect the condensation of SVOCs/IVOCs. Several studies have highlighted the importance of the so-called “condensable organics” in estimated emissions which are typically not quantified in a harmonized way between countries (Kanakidou et al., 2005; Robinson et al., 2007; Denier Van Der Gon et al., 2015; et al. Simpson et al, 2020). The key role of anthropogenic VOCs to SOA formation and related health effects has been discussed in a recent study, where the mitigation of anthropogenic VOCs emissions has been found to be a more efficient strategy in reducing air pollution-associated cardiopulmonary deaths than that of nitrogen and sulfur oxides (NOx or SOx) (Pye et al. 2022).

The variability in emission estimates can lead to model uncertainties and further challenges to health impact studies and policy design on emissions reduction measures. For example, Crippa et al. (2019) have calculated that uncertainties related to emissions can contribute to 1.1–3.4 million excess death uncertainty intervals, being about 50% of the estimated total mortality per year globally. Kushta et al. (2018) suggested that WRF-Chem captures well the variation of fine particulate matter over Europe and highlighted that the uncertainties related to model resolution and performance are not as significant as the uncertainties driven by the relative risk (RR) models for estimating excess mortality. Other studies have found important differences in excess death estimates attributed to PM2.5 concentration uncertainties, different ozone precursors emissions, and meteorology (J. Liu et al. 2022; Wang et al. 2019). RRs are currently based on annual PM2.5 mass concentration, and the resulting excess mortality over a large region of interest (e.g., at the continental or global scale) is possibly not highly impacted by such model uncertainties compared to smaller-scale mortality estimates (e.g., at the country or city level). However, if the contribution of carbonaceous aerosols (CA) to the overall health burden is to be estimated, it is important to obtain representative levels of these components, which may be more hazardous to human health and can thus contribute relatively strongly to excess mortality (Bates et al. 2019; Daellenbach et al. 2020; Chowdhury et al. 2022). In this study, we run the WRF-Chem model over Europe to simulate annual PM2.5 levels and the carbonaceous sub-components BC and OA, including POA and SOA. We aim to obtain representative concentrations of these components that can be used in air pollution health impact studies by investigating the influence of several factors (e.g., anthropogenic emissions and meteorological nudging) on the modeled OA concentrations. Moreover, we investigate the implications of model uncertainties related to OA on excess mortality estimations.

Methodology

Model configuration

WRF-Chem-v.3.9.1 (Grell et al. 2005) was used to simulate the annual concentrations of PM2.5 and its carbonaceous constituents over Europe. WRF-Chem consists of a meteorological core (WRF) that enables the simulation of various meteorological processes online with a coupled chemistry core, which also enables the simulation of chemical processes, thus allowing the interaction between meteorology and chemistry.

We simulate the year 2015 (January 1st to December 31st), a representative and well-studied year with full availability of emission data. The model performance has been previously evaluated for the year 2015 using the benchmarking evaluation methodology developed in the framework of the Forum for Air Quality Modelling in Europe (FAIRMODE). The FAIRMODE methodology considers paired series of model and observed values of PM2.5 for year-long periods. It investigates the model capabilities by introducing an overall indicator, namely, the modeling quality indicator (MQI), considering the measurement uncertainty of each pollutant (Kushta et al. 2019; Thunis et al. 2012).

The model horizontal resolution is 20 × 20 km, with 33 vertical layers, and top level at 50 hPa. The meteorological initial and boundary conditions have been obtained from the NCEP FNL (Final Operational Global Analysis) data, updated every 6 h. Besides the main prognostic variables (e.g., temperature, wind components, etc.), the FNL data provide information of other parameters that would require a long spin-up period (e.g., soil moisture and temperature). The chemical boundary conditions were obtained from the global Model for Ozone and Related chemical Tracers (MOZART-4 (Emmons et al. 2010), at 1.89° × 1.89° resolution). The mineral dust emissions are calculated online, as well as the biogenic emissions which are computed using the Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1). The chemical/aerosol scheme chosen is the Regional Atmospheric Chemical Model coupled with Modal Aerosol Dynamics for Europe (MADE) with the Volatility Basis Set (VBS) (RACM/MADE-VBS). RACM includes 57 chemical species and 158 gas-phase reactions, and MADE deploys two overlapping log-normal modes to simulate aerosol size distribution (Tuccella et al. 2015; Ahmadov et al. 2012). The VBS is used to calculate SOA formation taking into account the semi-volatile nature of OA, ageing processes (homogeneous oxidation of organic carbon vapors by OH radicals) and to significantly improve the model’s ability to reproduce observed OA concentrations (Bergström et al. 2012; Kryza et al. 2020). We note that IVOCs are not included in this scheme as SOA precursors (default model set-up), while studies have shown that they have a significant contribution to SOA formation (Karydis et al. 2010; Tsimpidi et al. 2010). More detailed information about the VBS scheme used here can be found in Ahmadov et al. (2012). Parameterization schemes that are common in all simulations are indicated in Table 1.

Table 1 Other physical and chemical options that are common in all simulations 

Sensitivity tests of anthropogenic emissions

Through this analysis, we aim to improve the model representation of PM2.5 and carbonaceous aerosols in Europe for health impact estimations of long-term exposure to these particles and test the impact of important factors contributing to their modeling uncertainties. The model configurations are based on previous studies for the region of interest (Denier Van Der Gon et al. 2015; Tsimpidi et al. 2010; Zittis et al. 2018; Georgiou et al. 2020). Several modifications on anthropogenic emissions are applied to investigate the impact of (a) combining two different anthropogenic emission inventories, specifically for NMVOCs (S4-S7) and (b) the inclusion of IVOCs (S3, S6, and S7) and SVOCs (S7) precursors emissions on the final OA concentration (Table 3). IVOCs emissions are reported to be missing in current emission inventories and have been estimated to be a factor of between 0.25 and 2.8 times the primary organic carbon emissions (Robinson et al. 2007; Tsimpidi et al. 2010). In the present study, we take into account missing IVOC emissions equal to 1.5 times the primary organic carbon (POC) emissions (Table 3) (Tsimpidi et al. 2010). Anthropogenic emissions are interpolated by the “anthro-emiss” utility to generate daily emissions for 2015. The two emission inventories used in this study are the EDGARv.5 (Crippa et al. 2020) (https://edgar.jrc.ec.europa.eu) used as the primary, and the CAMS-REG-AP.v4.2 (DT) (Kuenen et al. 2022) (https://eccad.aeris-data.fr) used as complementary data for NMVOCs from residential combustion and SVOCs emissions. EDGARv.5 does not distinguish between filterable and condensable organic carbon. On the other hand, CAMS-REG-AP.4.2 (DT) accounts for condensable PM2.5 (condensed SVOCs) emissions that are usually not included in emission inventories because they fall outside the normal definitions of VOCs. SVOCs are partially in the particulate phase shortly after emission due to condensation upon dilution and cooling. The emissions of condensed SVOCs are added as PM2.5 in CAMS-REGv.4.2 (DT) that led to considerably higher PM2.5 emissions than EDGARv.5. This additional fraction of SVOCs in PM2.5 was used in this study (in simulation S7) and added to the primary OC emissions to be interpolated by the “anthro-emiss” utility. The main differences between the two emission inventories are summarized in Table 2. CAMS-REG combines emission data from several sources including the recently updated emission inventory for residential wood combustion (RWC) and the TNO-newRWC (Denier Van Der Gon et al. 2015). Residential combustion is one of the dominant sources of organic aerosols in Europe, with a strong impact on local air quality and health in many EU cities. However, as emissions from the RWC sector are often underestimated in the emission inventories (Bergström et al. 2012), their contribution is not well-identified. The TNO-newRWC was developed following a detailed analysis of the nationally reported residential emissions. It included data from wood usage, accounted for appliance type, and spatially distributed the RWC emissions based on population density, urban/rural differentiation, and access to wood fuel (Paunu et al. 2021). It also used a set of more harmonized emission factors between EU countries. This resulted in a more consistent emission dataset, independent of important country differences. CAMS-REG also has a higher resolution (7 × 7 km over Europe) than EDGAR (11 × 11 km). Most of the EU emissions are lower than in EDGAR but were found to be higher in non-EU countries (North Africa, Eastern Europe, and the Middle East), of which some are included in our model domain (Kuenen et al. 2022).

Table 2 Differences between EDGARv.5 and CAMS-REG-AP anthropogenic emission inventories

Sensitivity tests on meteorological nudging

Nudging involves the addition of a dynamic relaxation term to the prognostic equations based on the difference between the model state and a reference field (Zittis et al. 2018). Grid and spectral nudging are two techniques that can be applied in WRF-Chem. Each approach has its advantages, but the results can be sensitive to many factors (the nudging coefficients, boundary conditions, the frequency of nudging, and the selection of variables). Here, we assess whether applying these techniques can improve the OA simulation in WRF-Chem and which configuration is the most suitable for this purpose. Initial test simulations with default grid analysis performed better than spectral nudging; therefore, it was selected for this application. Analysis nudging was applied every 6 h to the zonal and meridional components of wind (U and V), temperature (T), water vapor mixing ratio (QVAPOR), and geopotential perturbation (PH) in every model layer, including the planetary boundary layer (PBL). The default nudging coefficient is 0.0003. Temperature is an important parameter that directly affects the volatility of OA precursors and their partitioning into gas or aerosol phases. Thus, we also test the impact of a stronger nudging coefficient in temperature on the final OA concentration and keep the rest of the nudging coefficients to the default option (Zittis et al. 2018). More details in the sensitivity analysis regarding meteorological nudging are presented in Table 3. The model validation in terms of meteorological parameters and the effect of nudging is presented in the Supplementary Information (SI). We note that the aerosol-radiation feedback is turned on in all simulations.

Table 3 Summary of the sensitivity tests on emission changes and meteorology

Comparison with observations

The modeled particle concentrations are compared with ground-based observations obtained from EBAS-Nilu (http://ebas.nilu.no/) and AirBase (https://www.eea.europa.eu/) databases. In total, 996 stations that cover a large part of Europe are used for PM2.5. Thirty stations for black carbon and 33 for organic carbon are used, which are located mostly over central Europe, with less coverage in the northern (e.g., Sweden, Norway, Finland) and some southern regions (e.g., Italy and Greece) (Figs. S10S12). Statistical metrics such as annual mean bias (MB), mean absolute error (MAE), mean normalized bias (MNB), mean fractional bias, and mean fractional error (MFB, MFE) are derived for each simulation to assess their performance. The sum of modeled POA and SOA is compared with organic aerosols from observations. The organic carbon to organic mass conversion factor of 2.1 (Yazdani et al. 2021) is also used to compare the modeled OC. The comparison is focused on annual averages since the hazard ratios used to estimate excess mortality from long-term exposure to fine particles are based on annual mean concentrations. Surface meteorological variables, including temperature (T), precipitation, and wind speed at 10 m (WS10m), are compared with observations from E-OBS (Cornes et al. 2018) (at 0.25° × 0.25°) and ERA-5 (Hersbach et al. 2020) (at 0.1° × 0.1°) datasets. ERA-5 is used for wind speed comparison because it has better spatial and temporal coverage than the E-OBS for this meteorological parameter (for more details, see the Supplementary Information). Ideally, independent observations should have been compared with the meteorological fields. However, this could not be achieved considering the number of station observations (part of E-OBS) that are already assimilated in the boundary conditions (FNL dataset). The model performance is assessed based on the aerosol annual mean biases. As proposed by Boylan and Russell (2006) for standard modeling applications, the mean fractional bias and errors (MFB and MFE) should be below 60 and 75%, respectively, for the model to be considered acceptable for use in air quality studies. When the MFB and MFE are below 30% and 50%, respectively, the level of accuracy is considered to be close to the best a model can be expected to achieve (Berger et al. 2016).

Excess mortality calculations

The annual excess deaths are calculated using the age-specific hazard ratios (HR) for non-communicable diseases and lower respiratory infections by Burnett et al. (2018) based on the PM2.5 annual mean concentration (Eqs. 12). The contribution of carbonaceous aerosols to excess deaths can be estimated with two approaches. The first approach is to calculate new HRs based on PM2.5 concentration without CA concentration, and the resulting excess mortality (without CA) is to be deducted from the total. With this approach, the new HRs are derived at the tail of the (non-linear) risk curve, which might not reflect the actual mortality risk and lead to lower numbers of excess deaths. To avoid this limitation, we use a different approach and attribute the excess deaths from carbonaceous aerosols to their contribution to total PM2.5 concentration as indicated in Eq. 3. BMR accounts for age-specific baseline mortality rates that are obtained from the Global Burden of Disease Results Tool (http://ghdx.healthdata.org/gbd-results-tool) and POP are age-specific population data obtained from NASA’s Socioeconomic Data and Applications Center (SEDAC) GPW-v4 dataset (https://sedac.ciesin.columbia.edu). The 95% confidence interval (CI) is calculated by estimating the upper and lower values of the HR functions by using the standard error of parameter θ in GEMM, obtained from Burnett et al. (2018) (see Supplementary Information). The lower and upper values of the baseline mortality rates are also included in the excess mortality calculations (Eqs. 2 and 3) accordingly. The results are calculated at the population’s grid resolution (0.042° × 0.042°) and then aggregated at the country and European levels. In our European domain, we include 45 countries (based on the United Nations’ official statistics), including Turkey and part of Russia.

$$HR \left(z\right)= exp\left(\theta \times log\left( \left(\frac{z}{a}\right)+1\right) /\left(1+exp\left(-\frac{z-\mu }{\nu }\right)\right)\right)$$
(1)
$$Excess {Mortality}_{PM2.5}= BMR \times POP \times \mathrm{HR}-1/\mathrm{HR}$$
(2)
$$Excess {Mortality}_{CA}= Excess {Mortality}_{PM2.5} \times \frac{CA}{\mathrm{PM}2.5}$$
(3)

where θ, α, µ, and ν are the parameters of the risk model and z = max (0, PM2.5–2.4 µg/m3).

Results

Sensitivity tests

Compared to ground-based observations, annual mean PM2.5 concentrations are simulated reasonably well in all sensitivity tests (Fig. S4, top left). The PM2.5 bias ranges between 1 and 17% and is well within the uncertainties associated with the observational dataset, which is in the range of 10–15%, except in simulation S2, which resulted in the highest mean normalized bias (17%) (Table S2). Detailed analysis of all sensitivity test results (S1S7) is included in the Supplementary Information. Here, we focus on the differences between the default (S1) and best-performing (S7) simulations (Table 4, Fig. 1). Simulation S7, which includes both IVOC and SVOC emissions, improved most substantially in annual mean PM2.5 concentrations. This led to considerably lower biases compared to the default simulation (S1). The inclusion of SVOCs in S7 has lowered the annual MB to − 1.28 µg/m3 but led to some overestimation of PM2.5 in some stations, mainly at the lower concentration ranges (below 20 µg/m3). Based on the statistical metrics obtained in simulation S7, the model performance for PM2.5 reaches beyond the acceptable levels for MFB and MFE, which should be lower than 60 and 75%, respectively (Table 4). The MFB and MFE are also well within the maximum limits for the model to be considered accurate (30% and 50%, respectively).

Table 4 Statistical metrics from the default (S1) and best-performing (S7) simulations, including mean bias (MB), mean absolute error (MAE), mean normalized bias (MNB), mean fractional bias (MFB), and mean fractional error (MFE) for PM2.5, organic carbon (OC), total organic aerosols (OA), and black carbon (BC)
Fig. 1
figure 1

Annual modeled PM2.5, black carbon (BC), and total organic aerosols (OA) compared to observations from the default (S1) and best-performing (S7) simulation

Significant differences in the simulation of meteorological fields resulted in different seasonal behaviors (Figs. S5S7). The medium nudging coefficient for temperature in S7 led to a lower annual mean temperature than S1 and less total precipitation (Figs. S5S6). The relative overestimation of precipitation in S1 has been compensated with an underestimation of precipitation in simulation S7 that partially explains the improved PM2.5.

BC is also well simulated by the model, particularly at lower concentrations (Table 4, Fig. 1, upper-right panel). Some overestimation of BC is observed in rural stations in Germany, Portugal, and the Netherlands, which can be attributed to changes in wind speed and prevailing direction in the abovementioned areas. An intermediate nudging coefficient in temperature leads to lower MNB for BC (in simulations S3, S6, and S7) than the stronger nudging and outperforms the default simulation (S1). At the higher concentration ranges (over 0.5 µg/m3), BC is underestimated in two stations but improves at both sites in the simulation S7. The persistent underestimation in one of the stations could be due to small-scale activities (e.g., urban plumes) that might be responsible for the high local BC concentrations. Such local effects might need higher spatial resolution to be captured by the model. Furthermore, the different techniques that are used for BC measurements may contribute to this uncertainty (Genberg et al. 2013). However, since BC is mostly influenced by emissions, transport, and deposition, the overall good agreement between the model and the observations in most of the simulations shown in Fig. S1 (top-right) suggests that the model is representing the interaction of these three processes reasonably well. Based on the statistical indicators obtained for the annual averages, the model is well within the acceptable and accuracy limits for BC (Table 4).

Here, we focus on the organic fine particles since they are shown to be particularly sensitive to model limitations and likely more hazardous to human health compared to inorganic fine particles. A negative annual mean bias is found with all simulations (Fig. S4, bottom-left), which ranges between − 22 and − 35%; however, there are notable improvements from the different model modifications we have applied (Table S2). The sensitivity test results for OA are analyzed in the Supplementary Information.

Comparing the default and the best-performing simulations, the annual MB for OA has dropped from − 1.03 to − 0.7 µg/m3 and the MNB from − 35 to − 22% (Table 4). The default simulation did not reach the MFB and MFE levels, under which the model can be considered accurate but was within the acceptable ranges (50% and 56% for MFB and MFE, respectively). On the other hand, the best-performing simulation approached the maximum MFB and MFE levels for model accuracy, with 31% and 40%, respectively.

Monthly comparison of OA concentration is also performed for the default and best-performing simulations. In the default simulation (S1), the modeled OA is underestimated mostly during winter and to a lesser extent in autumn and spring (Fig. 2, left panel). This is primarily the case in stations with high OA concentrations (above 5 µg/m3). This underestimation is improved in simulation S7, where we account for the missing IVOCs and SVOCs emissions, e.g., from residential heating. These are important SOA precursors contributing largely to OA in winter (Ciarelli et al. 2017). Furthermore, simulation S7 results in a cold bias and a lower temperature than simulation S1, which could favor partitioning of the semi-volatile species to the aerosol phase. On the other hand, simulation S7 leads to some overestimation at the lower concentration ranges during the cold months of the year. This can be attributed to increased primary OA emissions applied in simulation S7. A slight OA overestimation is also observed during summer in simulation S1, which is also improved in simulation S7. The overestimated OA in simulation S1 compared to S7 can be partially attributed to differences in the meteorological fields that affect SOA formation during the summer months (e.g., changes in PBL height, temperature, or wind speed). The default simulation resulted in higher temperatures during summer that may compensate for this overestimation by enhancing their transport to higher elevations. This is consistent with other studies that report an underestimation of OA during summer (Jiang et al. 2019). Furthermore, variations in primary OA can affect SOA concentration. The partitioning between condensable organic vapors and SOA used in the VBS approach depends on the total organic matter (Ahmadov et al. 2012). Thus, if POA is under-predicted, the resulting SOA could also be underestimated (Tuccella et al. 2015). Overall, the concentration of OA is better in agreement with the observations in S7 during all months, especially at the higher concentration levels.

Fig. 2
figure 2

Monthly organic aerosol (OA) concentrations compared with monthly observations for simulations S1 (left panel) and S7 (right panel). Each dot refers to a monthly value, which is colored based on the season to which each month corresponds

PM2.5, carbonaceous aerosols, and health implications of model uncertainties

Based on the highest accuracy simulation (S7), we estimate that the PM2.5 concentration levels over Europe range between 5 and 25 µg/m3. Eastern European countries exhibit up to four times higher PM2.5 levels than the recommended WHO safety thresholds (Fig. 3). Iceland and the northern Scandinavian countries fall within the WHO safety limits (below 5 µg/m3). Carbonaceous aerosol concentrations are higher in winter (up to 10 µg/m3) than during summer (up to 7 µg/m3) (Fig. S13). The contribution of BC to the total PM2.5 concentration is relatively low and ranges between 1 and 3% (Fig. 4 bottom-right panel). On the other hand, the OA component contributes to up to 25% of total PM2.5, with POA and SOA contribution to PM2.5 varying spatially (Fig. 4, top panels). Higher ratios of POA are observed in Eastern Europe (up to 25%), which could be attributed to solid biofuel combustion emissions used for residential heating in Eastern Europe (Jiang et al. 2019). The SOA contribution is shown to be less (up to 10%) with a clear increasing gradient going from northwest to southeast Europe.

Fig. 3
figure 3

Annual mean concentration of PM2.5 for the year 2015 based on simulation S7

Fig. 4
figure 4

Ratio of primary organic aerosols (POA, top-left), secondary organic aerosols (SOA, top-right), total organic aerosols (OA, bottom-left), and black carbon (BC, bottom-right) to total PM2.5 concentration

The best-performing model configuration (S7) adequately captures the total PM2.5 concentration to support health impact studies. The model results for PM2.5 agree well with the observations, with a bias equal to − 1.28 µg/m3 in our best simulation. This bias is well within the uncertainties associated with the observational dataset, which is in the range of 10–15% (Allan et al. 2022). To investigate to what extent our modeling improvements affect the health impact estimations, we calculate the excess mortality based on PM2.5 levels obtained from the default (S1) and best-performing simulation (S7). We find a ∼15% difference in the total annual excess mortality for the European domain that is attributed to different model configurations (S7-S1). This would account for a difference of about 105,000 excess deaths per year between the two simulations.

Simulation S1 results in 680 (CI: 558–802), and S7 results in 785 (CI: 645–925) thousand annual excess deaths over the wider European domain. In the view of spatial distribution, this difference is shown to be small (Fig. 5). To assess the role of carbonaceous aerosols (as sub-components of PM2.5), on health impact estimates, we calculate their contribution to total excess mortality (from PM2.5) and investigate whether our improvements in OA modeling affect the results.

Fig. 5
figure 5

Differences in carbonaceous aerosols concentration (left panel) and in their attributable excess mortality (right panel) in each grid cell between simulations S1 and S7

When we calculate the fraction of excess mortality from carbonaceous aerosol concentrations from the default and best-performing simulations (S1 and S7) for the whole domain, we find that S7 yields almost two times higher total excess deaths compared to S1. This is equal to 79 (CI: 65–93) and 131 (CI: 108–156) thousand excess deaths per year from simulations S1 and S7, respectively. Since PM2.5 levels and total excess mortality differ between S1 and S7, we also calculate the percentage contribution of carbonaceous aerosols to the total excess mortality being 11 and 17% (on average), respectively, for the larger European domain. When the population per grid cell is taken into account, this is interpreted as 15 and 23 deaths per 100,000 individuals (on average) per year in Europe from simulations S1 and S7, respectively (Fig. 6).

Fig. 6
figure 6

Annual excess deaths due to PM2.5 (top panels) and carbonaceous aerosols (bottom panels) exposure in Europe (normalized to the population of each grid cell) with the default (left) and best-performing (right) simulations

Furthermore, the relationship between the GEMM (used for hazard ratios (HRs) calculation) and PM2.5 concentration is supra-linear over the lower exposure ranges and then near-linear at higher concentrations (Burnett et al. 2018). Therefore, the differences in excess mortality estimates can depend on whether the improvements of PM2.5 and OA appear at lower or higher PM2.5 exposure ranges. However, our results do not take into account the increased harmfulness of carbonaceous particles compared to inorganic ones, which increases their contribution to total excess mortality (Lelieveld et al. 20022015; Chowdhury et al. 2022). We expect that the uncertainties in OA modeling would have a greater impact on studies where the increased toxicity of carbonaceous aerosols is accounted for. The uncertainty related to model PM2.5 (here estimated 15%) is less than the uncertainty related to GEMM risk model (~ 18%). On the other hand, the uncertainty related to modeled carbonaceous aerosols is higher (~ 65%), highlighting the need for further modeling improvements.

Conclusions

Annual PM2.5 and carbonaceous aerosols were simulated with the WRF-Chem model over Europe for 2015. Several configurations were tested to improve the simulation of OA concentrations, which are typically underestimated by air quality models and can affect health impact studies. The model performance was evaluated by comparing the model output with available ground-based observations. This highlighted a lack of information regarding observations of organic carbon aerosols in Europe. Although there is a large number of stations that measure PM2.5, air quality monitoring of carbonaceous aerosols in Europe is not always adequate in time and space. This lack of observations, combined with other factors that can affect PM2.5 simulation, makes the model assessment a challenging task. From our simulations, the modeled PM2.5 and BC concentrations agree well with the available observations in all simulations and within the uncertainties associated with the observational dataset. However, OA show a systematic negative bias that ranges between − 22 and − 35% depending on the configuration. The application of meteorological nudging in WRF-Chem is critical for achieving accurate PM2.5 simulations. Here, we find improvements mainly in near-surface temperature and wind speed after the application of nudging. However, caution is needed in the selection of nudging coefficients. Our simulation with a stronger nudging coefficient in temperature did not improve the concentration of OA due to meteorological inconsistencies, mainly during summer, where we find cold biases for temperature. From our sensitivity analysis, we find that significant improvements in OA are possible, especially with the inclusion of anthropogenic SVOC/IVOC emissions. Thus, the observed negative bias is partially attributed to missing emissions of anthropogenic SOA precursors (e.g., SVOC/IVOC) and simplifications of SOA formation in the model. However, including these SOA precursors as primary OC emissions led to partial OA overestimation in some regions with low OA concentrations (e.g., below 2 µg/m3). At sites with higher concentrations, where SOA dominate and are usually not captured well by the model, the added SOA precursors improved the modeled total OA.

Our work focuses on the European region and provides air pollution information at the European level, complementing similar studies with a national focus. The model 20 km resolution is relatively high for the whole domain considering the computational time needed for the numerous simulations. Our study focused on annual averages for the year 2015 which are relevant for health impact estimations of long-term exposure to PM2.5 and carbonaceous aerosols. Furthermore, the comparison of OA can be made either by comparing organic carbon that is converted to organic mass or by comparing the sum of POA and SOA. The conversion factor used for organic carbon to OM (here 2.1) is a simplification commonly used to avoid discrepancies between stations when calculating SOA concentrations. In agreement with other studies, the OM/OC ratio largely depends on the level of oxidation in the organic aerosol; the more oxygen associated with carbon in the aerosol, the higher the OM/OC ratio (Brown et al. 2013). The OM/OC can vary depending on the location (e.g., rural vs urban areas) and season (winter vs summer) and has been found to range between 1.5 and 2.2. Consequently, using a fixed conversion factor might under- or overestimate OC concentrations. On the other hand, since computed SOA varies between different models and configurations of the same model, this approach removes potentially large SOA variations that can lead to inaccurate comparison. The agreement between OA and observations depends on the chosen approach and the station location. Furthermore, the impact of our sensitivity tests on inorganic aerosols is not evaluated, but they also contribute significantly to PM2.5. Biogenic sources of OA are also important, especially in rural sites. Specifically, isoprene chemistry has been shown to significantly contribute to SOA concentrations in Europe and can partly explain the underestimation of OA concentrations in southern Europe (Bessagnet et al. 2008). Here, we do not assess the role of biogenic OA, but an inadequate treatment of these particles in the model or uncertainties in their emissions may contribute to the general underestimation of OA highlighted in this study. Our results highlight the need for a better and more consistent methodology for calculating anthropogenic emissions that will take into account condensable organic precursors. There is an ongoing effort from several experts (e.g., Simpson et al. 2022) to advance the emissions reporting method, but there is still room for improvement and future work regarding OA handling in modeling studies.

We find that the model reaches adequate accuracy thresholds for PM2.5 in both the default and best-performing simulations. We note that OA were not accurately modeled in the default set-up and required modeling adjustments to obtain more accurate results. Based on our analysis, we find that our modeling adjustments can contribute to a 15% difference in the total PM2.5 attributable excess mortality over Europe, which is equal to 105,000 additional excess deaths per year. From these numbers, 52,000 additional deaths are due to carbonaceous aerosols, captured by our modeling improvements. These are twice as high as the excess deaths obtained in the default simulation. We note that the results strongly depend on the shape of the hazard ratio model used, which drives the relationship between PM2.5 concentration and mortality risk. Additionally, we find that the uncertainty related to modeled PM2.5 (15%) is less than the uncertainty related to GEMM risk model (~ 18%). However, we identify much higher mortality uncertainty related to carbonaceous aerosols (~ 65%), suggesting that more effort is needed towards improving their representation in the model for adequately capturing the representative exposure levels. As long as emission inventories do not follow a consistent methodology to account for the emissions of important organic precursors, OA modeling will continue to require adequate scaling of the emissions.

Our results indicate that the contribution of carbonaceous aerosols reaches 17% of total excess mortality with our best-performing simulation. This is a lower limit since further modeling improvements for OA are possible, and we did not consider the potentially increased toxicity of these components. Hence, we cannot neglect their significant contribution on the mortality burden, and we expect that the uncertainties in OA modeling will have a greater impact on studies in which the potentially high harmfulness of carbonaceous aerosols is accounted for.