Introduction

Microwave signals emitted by space-based navigation satellites or radio sources are inevitably influenced by the neutral atmosphere when they propagate through the atmosphere, revealed as retardation and geometric bending effects. The neutral atmospheric effect is commonly referred to as tropospheric slant delay in the fields of space geodesy since the dry gases and moisture particles in the troposphere mainly caused it. Compared to the ionosphere, the neutral atmosphere is a non-dispersive medium for microwave signals, and hence, tropospheric delays cannot be directly eliminated by the linear combinations of observations at multiple frequencies. It is necessary to accurately model them by means of external datasets such as sounding profiles or numerical weather model (NWM) operational or reanalysis data.

For modeling purposes, tropospheric slant delays are normally separated into two components: a hydrostatic part mainly caused by the dry gases and a non-hydrostatic (wet) part induced by the water vapor in the lower atmosphere (Davis et al. 1985). Both hydrostatic and non-hydrostatic constituents can be represented as the product of the zenith delay and respective mapping functions under the assumption of a spherically symmetric atmosphere (Böhm et al. 2006). As the dry gases are relatively stable in the atmosphere, accurate modeling of zenith hydrostatic delay (ZHD) can be realized based on physics-based models with measured surface pressure under the condition of atmosphere hydrostatic equilibrium (Hopfield 1971; Saastamoinen 1972). Using the in situ meteorological measurements, ZHD values at the earth’s surface can be derived by the Saastamoinen (1972) model with a typical accuracy of a few millimeters. However, in contrast to the ZHD, it is difficult to accurately estimate zenith wet delay (ZWD) due to the high dynamics of the atmospheric moisture both spatially and temporally (Nilsson et al. 2013; Dousa and Elias, 2004).

At present, the most accurate way of determining ZWD is the ray-tracing through atmospheric vertical profiles from radiosonde measurements (e.g., Davis et al. 1985; Niell 1996) or data from numerical weather models. It is based on rigorous theoretical fundamentals and is thus commonly used as the benchmark for ZWD modeling and validation. In addition to the ray-tracing approach, ZWD is usually parameterized as the function of surface meteorological variables at the sites, for example, traditional troposphere models by Saastamoinen (1972) and Hopfield (1971). However, the accuracy and reliability of those models with surface meteorological data are still limited (Yao and Hu 2018). Considering vertical humidity distribution in the troposphere and assuming that water vapor pressure decreases exponentially with increasing height, Askne and Nordius (1987) proposed a formula for ZWD with the input parameters water vapor decrease factor \(\lambda\), weighted mean temperature \(T_{\text{m}}\), and surface water vapor pressure \(e_{\text{s}}\). For convenience, the formula proposed by Askne and Nordius (1987) is referred to as the A&N formula in the following.

Compared with the approaches mentioned above, global empirical models such as the UNB3 series (Collins and Langley 1997; Leandro et al. 2008) and Global Pressure and Temperature (GPT) series (Böhm et al. 2015; Landskron and Böhm 2018) can directly estimate zenith delays as well as mapping functions with a loss of accuracy using the date and approximate station coordinates. They have a promising prospect in real-time and fast geodetic applications since no external meteorological data are required. Nevertheless, global empirical troposphere models still cannot well represent daily fluctuations and complex variations of meteorological variables in the troposphere because they are realized based on monthly NWM data or the US standard atmosphere model. It has been demonstrated that the UNB3m (Leandro et al. 2008) model typically yields ZWD values with a global mean accuracy of around 5.5 cm due to a coarse spatial resolution and the lack of sensitivity to the longitude in the modeling (Möller et al. 2014). Concerning the GPT3 (Landskron and Böhm 2018) and its former version GPT2w (Böhm et al. 2015), they can yield ZWD estimations with a global mean accuracy of approximately 3.8–4.0 cm compared with the radiosonde observations (Yao and Hu 2018).

In recent years, machine learning has attracted much attention due to the ability to model time-varying signals through a large number of historical datasets. It has been widely applied to space geodesy, meteorology, and geophysical studies, such as satellite orbit determination (Peng and Bai 2019; Mortlock and Kassas 2021), ionospheric delay modeling and forecasting (Han et al. 2021; Zhukov et al. 2021), precipitable water fusion and rainfall forecast (Khanisni et al. 2021; Zhang and Yao 2021), tropospheric zenith delay modeling (Kitpracha et al. 2019; Mohammed 2021), and global atmospheric forecast (Arcomano et al. 2020). Up to now, few studies have been devoted to global modeling of the tropospheric ZWD based on machine learning (ML) approaches. Moreover, the ZWD modeling in these studies was realized based on site-wise Vienna Mapping Function 3 (VMF3) products (Landskron and Böhm 2018) from TU Wien or troposphere products at discrete International GNSS Service (IGS) stations. Their modeling accuracy over oceans is inevitably affected due to the inhomogeneous distribution of sites. In addition, the accuracy of wet components for IGS troposphere products is affected by the modeling uncertainty of surface pressure at the sites. On the other hand, the ZWD accuracy of the VMF3 products depends on the quality of the NWM meteorological fields and ray-tracing algorithms. Therefore, the main objective of this study is to realize global modeling for tropospheric ZWD based on machine learning and surface meteorological data derived from radiosonde data and Global Positioning System radio occultation (GPS RO) measurements. The modeling accuracy over the globe is expected to be improved by considering the relationship between ZWD and surface meteorological variables and integrating both radiosonde data and GPS RO observations. Accurate a priori information for ZWD is of importance to real-time precise positioning and other geodetic applications, which can reduce the coordinate convergence time and to some extent improve positioning accuracy. In particular, this applies to cases when the estimation of the ZWD from GNSS observations is not advisable.

The next section introduces the datasets of radiosonde data and GPS RO atmospheric profiles, along with ZWD determination through the numerical integration approach. Then, we present the ZWD modeling based on the random forest and backpropagation neural network approaches. Moreover, we validate the ZWD models and formulae by comparison with the numerical integrals of radiosonde data and GPS RO measurements across the globe. Conclusions are given in the last section.

Datasets and zenith wet delay determination

Radiosondes provide meteorological profiles with a high vertical resolution at discrete sites, while GPS radio occultations gathered by low-earth-orbit satellites can retrieve atmospheric profiles with good global coverage. Thus, the integration of both kinds of atmospheric profiles will enable more reliable modeling of the tropospheric ZWD over the globe. Within the current study, the modeling datasets include ZWD values and surface meteorological data derived from 5 years (2015–2019) of radiosonde data at 596 sites and 4 years (2016–2019) of GPS RO global atmospheric profiles of the Meteorological Operational Polar Satellite (MetOp-A/-B) mission. The datasets for the validation were obtained from radiosonde data at 609 near-global sites and GPS RO atmospheric profiles of the MetOp-A/-B satellites in 2020.

Atmospheric profiles from radiosondes

Radiosonde (RS) can retrieve meteorological variables for total pressure, temperature, relative humidity, geopotential height, wind speed as well as directions at a series of isobaric levels, covering the troposphere and part of the stratosphere (Durre et al. 2018). Currently, several research institutions including the University of Wyoming and the National Centers for Environmental Information (NCEI) routinely generate and release site-wise RS files after applying a consistent strategy for data check and quality control. These RS datasets provide 6–12 hourly atmospheric vertical profiles at more than 1500 sites on land covering the period from 1938 onward to the present (Li et al. 2020).

In this study, high-quality radiosonde data at near-globally distributed sites were obtained from the Integrated Global Radiosonde Archive (IGRA) via the website (https://www.ncei.noaa.gov/pub/data/igra/). It should be noted that we carried out a data screening to remove the stations that have less than half of radiosonde observations during the period of 2015–2019. Moreover, a quality control scheme was also applied to remove the outliers of the IGRA radiosonde files. On the one hand, the sounding profiles with the uppermost pressure level larger than 300 hPa or two adjacent levels exceeding 200 hPa are removed. On the other hand, the sounding profiles with more than 8 standard pressure levels are only taken into account. After the data screening and quality control procedures, 5 years (2015–2019) of radiosonde data at 596 near-globally distributed sites were utilized for the modeling, and the sounding profiles at 609 sites in 2020 were employed for the validation.

Atmospheric profiles from GPS radio occultations

The Global Positioning System Receiver for Atmospheric Sounding (GRAS) onboard low-earth-orbit satellites provides radio occultation (RO) events with a global distribution every day. For each RO event, meteorological profiles with a high vertical resolution in the neutral atmosphere can be derived based on the Abel integral inversion under the assumption of local spherical symmetry (Kursinski et al. 2000). For a detailed introduction to atmospheric profiles inversion from GPS RO observations we refer to Kursinski et al (1997).

The Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) Data Analysis and Archive Center (CDAAC) processes and provides atmospheric data and products from a dozen GPS RO missions of different low-earth-orbit satellites. In the current study, the CDAAC post-processed ''wetPrf'' atmospheric profiles from MetOp-A and MetOp-B missions were used, which were routinely produced by a one-dimensional variational (1D-Var) data analysis using European Centre for Medium-Range Weather Forecasts (ECMWF) gridded analyses or short-term forecasts as background fields. 4 years (2016–2019) of ''wetPrf'' atmospheric profiles were used as one part of modeling data sources, and the ''wetPrf'' atmospheric profiles in 2020 were employed for the evaluation. The spatial distribution of GPS RO observations for MetOp-A and MetOp-B satellites on January 1, 2019, is displayed in Fig. 1. We can see that the GRAS for each MetOp satellite can retrieve 650–700 GPS RO atmospheric profiles with uniform distribution across the globe every day, which provides abundant atmospheric state information in marine and polar regions.

Fig. 1
figure 1

Distribution of ZWD values derived by the numerical integration of GPS RO atmospheric profiles for MetOp-A and MetOp-B satellites on January 1, 2019

ZWD determination with atmospheric profiles

Using the atmospheric profiles from radiosonde data and GPS RO observations, the ZWD values at each site can be accurately determined by numerically integrating non-hydrostatic refractivity in the zenith direction from the site height to the top of the lower atmosphere (Nilsson et al. 2013):

$${\text{ZWD}} = 10^{ - 6} \int_{h_s }^{h_{{\text{top}}} } {N_{{\text{nh}}} } {\text{d}}h$$
(1)

where \(N_{{\text{nh}}}\) is the non-hydrostatic refractivity in the atmosphere, a function of the water vapor pressure and temperature. Equation (1) can be rewritten as follows:

$${\text{ZWD}} = 10^{ - 6} \int_{h_S }^{h_{{\text{top}}} } {\left[ {\left( {k_2 - \frac{{R_{\text{d}} }}{{R_{\text{w}} }}k_1 } \right)\frac{e}{T} + k_3 \frac{e}{T^2 }} \right] \cdot Z_w^{ - 1} } {\text{d}}h$$
(2)

where \(T\) and \(e\) denote the absolute temperature in [K] and partial pressure of water vapor in [hPa] in the atmosphere, respectively; \(R_{\text{d}}\) and \(R_{\text{w}}\) represent the specific gas constant for dry gases and water vapor, respectively; \(k_i\) (i = 1, 2, 3) are the coefficients of atmospheric refractivity and the ''best-average'' values determined by Rüeger (2002) are adopted in the current study (k1 = 77.6890 K/hPa, k2 = 71.2952 K/hPa, k3 = 375,463 K2/hPa); \(Z_{\text{w}}\) is the compressibility factor for atmospheric water vapor indicating the deviation of the atmosphere from an ideal gas.

ZWD modeling based on RF and BPNN approaches

In general, ZWD modeling is based on global atmospheric profiles and two machine learning approaches: random forest (RF) and backpropagation neural network (BPNN). The basic functional formulation for the modeling is presented in Table 1. As can be seen from Table 1, the input features consist of the day of the year (DOY), latitude (lat), longitude (lon), ellipsoidal height (hgt), surface water vapor pressure (es), and surface temperature (Ts). Note that the surface meteorological data used in the modeling are retrieved from the lowest level of RS and GPS RO atmospheric profiles. The desired model outputs are time-varying ZWD values calculated from RS data and GPS RO measurements. Moreover, we can see that both RF-based and BPNN-based models are suitable for determining time-varying ZWD values at sites with access to surface meteorological data.

Table 1 ZWD modeling schemes through the RF and BPNN regression analysis

Backpropagation neural network

The backpropagation neural network (BPNN) is a multiple feed-forward neural network, which was proposed by Rumelhart et al. (1986). Generally, a BPNN system consists of an input layer, one or multiple hidden layers, and an output layer. For a fully connected neural network, each layer has a certain number of neurons and each neuron in the current layer is directly connected to the neurons of the subsequent layer with an activation function. In this study, we adopted a three-layer fully connected BPNN structure to develop global ZWD models with surface meteorological parameters, as presented in Fig. 2. Each neuron in the input layers contains station location information and surface temperature and water vapor pressure at a certain epoch; the corresponding neuron in the output layers contains ZWD values obtained from the RS and GPS RO measurements. The number of neurons in the hidden layer needs to be optimized.

Fig. 2
figure 2

Structure of the BPNN for the tropospheric ZWD modeling

The main task of the BPNN algorithm is to determine the network parameters of the weights and offsets that optimally approximate the relationship between input datasets and the desired outputs through a training process. It mainly comprises two phases: the forward propagation of input signals and the backpropagation of the ''loss'' or error signals between model outputs and desired results. For the former, information contained in the input datasets enters the input layer, propagates through the hidden layer, and finally transfers to the output layer. The functional formulation that represents the forward propagation of a three-layer BPNN system can be written as follows:

$$Y = f_{{\text{HO}}} (W_{{\text{HO}}} f_{{\text{IH}}} (W_{{\text{IH}}} X + B_{{\text{IH}}} ) + B_{{\text{HO}}} )$$
(3)

where \(X\) and \(Y\) are the input vectors and the respective output vectors, respectively; \(f_{{\text{IH}}}\) and \(f_{{\text{HO}}}\) are the activation function that connects each neuron in the current layer and the neurons in the subsequent layer; \(W_{{\text{IH}}}\) and \(B_{{\text{IH}}}\) denote weight matrices and offset vectors between the neurons in the input and hidden layers; while \(W_{{\text{HO}}}\) and \(B_{{\text{HO}}}\) represent the weight matrices and offset vectors between the neurons in the hidden and output layers. In the modeling, the Nguyen–Widrow algorithm was applied to obtain initial weights and offsets for the three-layer neural network model (Nguyen and Widrow 1990). Moreover, we adopted two activation functions to connect the neurons between adjacent two layers, and their formulae are expressed as follows (Yonaba et al. 2010):

$$f_{1} (u) = \frac{2}{{1 + e^{ - u} }} - 1$$
(4)
$$f_{2} (u) = u$$
(5)

where \(f_{1} (u)\) is a hyperbolic tangent function that connects the neurons between the input and hidden layers. While \(f_2 (u)\) is a linear activation function that connects the neurons in the hidden and output layers.

After the forward propagation of input signals, the error or ''loss'' signals between model outputs and the desired results are derived. If the ''loss'' signals cannot achieve the predefined requirement for precision, then the error backpropagation process would be performed layer-by-layer to adjust the network parameters. We adopted the Levenberg–Marquardt (L–M) algorithm (Levenberg 1944) to iteratively update the weights and offsets for each neuron until the minimum mean square error is guaranteed. In this work, the maximum number of iterations was set to 2000 and the minimum performance gradient was set to 1 × 10–7. The cost function of error signals is presented as follows:

$$J(W,B) = E_{X,Y} \left\{ {L\left[ {f(X;W,B),Y} \right]} \right\}$$
(6)

where the defined cost function \(J(W,B)\) is the mean loss of all training samples, which is affected by the weights and offset values for each neuron; \(E_{X,Y}\) is the mathematical expectation; \(L[ \cdot ]\) denotes the loss function of the model estimations and desired results. Based on the gradient descent method, the optimal network parameters of the weights and offsets can be determined:

$$W_{i + 1} = W_i - \alpha J_W^{\prime} (W_i ,B_i ) = W_i - \alpha \nabla L_W (W_i ,B_i )$$
(7)
$$B_{i + 1} = B_i - \alpha J_B^{\prime} (W_i ,B_i ) = W_i - \alpha \nabla L_B (W_i ,B_i )$$
(8)

where \(W_i\) and \(W_{i + 1}\) denote the initial and adjusted weights; the corresponding initial and adjusted offset values are represented by \(B_i\) and \(B_{i + 1}\), respectively; \(J_W^{\prime}\) and \(J_B^{\prime}\) denote the partial derivatives of the cost function with respect to W and B, respectively.

Random forest

Random forest (RF) is an ensemble learning approach for classification and regression by integrating multiple decision trees. Based on the concept of bagging, it generates a set of decision trees for training and yields the mode of classes for classification or average estimation for regression (Breiman 2001). Due to the ensemble learning and data-driven mechanism, the RF regression approach is suitable for modeling the nonlinear functional relationship and reproducing time-varying signals.

The RF adopts the bootstrap algorithm to construct multiple homogeneous subsets from the training samples and constructs multivariable regression trees for each subset based on the decision tree algorithm. For each subset, approximately two-thirds of the training samples are selected, including input vector X (\(x_i\), i = 1, 2, …, N) and desired outputs Y (\(y_i\), i = 1, 2, …, N). This subset constructs an input space R with dimensions of N × P, where N is the number of input samples and P is the number of input features. The decision tree aims at constructing a functional formulation from input vector X to the desired output vector Y:

$$Y = f(X,P)$$
(9)

Generally, the functional formulation \(Y = f(X)\) in the current study is realized based on the classification and regression tree (CART) algorithm (Lewis 2000). The CART algorithm constructs each tree by recursively partitioning the input space R into two non-overlapping subspaces (R1, R2) based on randomly selected p (p < P) features at current node s, and two corresponding splitting subspaces can be defined as follows (Xue et al. 2021):

$$R_1 (p,s) = \left\{ {X\left| {x^{(p)} \le s} \right.} \right\},R_2 (p,s) = \left\{ {X\left| {x^{(p)} > s} \right.} \right\}$$
(10)

where \(x^{(p)}\) is a splitting variable with the feature p and s the split node. The optimal feature p and splitting node s are determined based on the principle of minimum regression error:

$${\mathop {\min }\limits_{p,s}} \left[ {{\mathop {\min }\limits_{c_1 }} \sum_{x_i \in R_1 (p,s)} {(y_i - c_1 )^2 + {\mathop {\min }\limits_{c_2 }} \sum_{x_i \in R_2 (p,s)} {(y_i - c_2 )^2 } } } \right]$$
(11)
$$\hat{c}_1 = \frac{1}{{N_{R_1 } }}\sum_{x_i \in R_1 (p,s)} {y_i } ,\hat{c}_2 = \frac{1}{{N_{R_2 } }}\sum_{x_i \in R_2 (p,s)} {y_i }$$
(12)

where \(c_1\) and \(c_{2}\) are the output values of subspace R1 and R2 for the binary split node, respectively:

The binary partitioning process continues until the stopping criterion is satisfied, that is a node has fewer than a minimum number of samples needed for splitting or the partition reaches the maximum depth of the individual trees. Finally, the input space R is partitioned into a set of subspaces {R1, R2, …, RM}, and the final output results for the constructed CART tree can be calculated by the regression model \(f(X)\):

$$Y = f(X,P) = \sum_{m = 1}^M {c_m I(X \in R_m )}$$
(13)

where the I is an indicator function and its value is 1 when \(X \in R_m\) or 0 when \(X \notin R_m\).

Hyperparameter determination

The number of decision trees for the RF regression and the number of neurons in the hidden layer for the BPNN regression are regarded as the hyperparameter that affects modeling accuracy and generalization ability. In this study, we devised a set of decision tree numbers and neuron numbers ranging from 5 to 60 with a step of 5 to test the modeling accuracy of the RF and BPNN regression models, respectively. The optimal hyperparameters for those ML-based models were rigorously determined through the tenfold cross-validation (CV) approach. The tested accuracy of the ML-based models is represented as the root-mean-square (RMS) of ZWD residuals between model outputs and the expected results (references), as illustrated in Fig. 3.

Fig. 3
figure 3

Generalization ability of the RF-based and BPNN-based ZWD regression models

As can be seen from Fig. 3, the ZWD modeling accuracy improves with the increasing number of decision trees and with the increasing number of neurons in the hidden layer, indicating ensemble learning can combine multiple learner bases to achieve better generalization ability. Moreover, we find the ZWD modeling based on BPNN is more sensitive to the hyperparameter than that of the RF regression. When the number of decision trees varies from 5 to 60, the ZWD accuracy has only improved by 0.1 cm for the RF models, while it improved by 0.4 cm for the BPNN models when the number of neurons in the hidden layer changes from 5 to 60. Furthermore, it becomes obvious that the ZWD modeling accuracy for the RF and BPNN regression tends to be stable when the number of decision trees is larger than 25 and the number of neurons in the hidden layer exceeds 40, respectively. Therefore, we fixed the number of decision trees to 25 and set the number of neurons in the hidden layer to 40 to develop ZWD models through the RF and BPNN regression analysis, respectively.

Model validation and comparison

In this section, the ML-based ZWD models were validated by using global atmospheric profiles from radiosonde data and GPS RO measurements in 2020. As a comparison, we also assessed several commonly used models and formulae for ZWD calculation, which include the model by Saastamoinen (1972), the formula by Askne and Nordius (1987), a rule of thumb (Böhm and Schuh 2013, Eq. 51, p89), and Global Pressure and Temperature 3 (GPT3) (Landskron and Böhm 2018). Among them, the Saastamoinen (1972) model calculates ZWD values based on es and Ts, while the Askne and Nordius (1987) formula determines ZWD values using the parameter es, the weighted mean temperature, and the water vapor decrease factor. GPT3 can directly estimate ZWD at the site in the vicinity of the earth’s surface based on the gridded coefficients for meteorological parameters. On the other hand, the water vapor pressure in GPT3 can be replaced by real observations. This approach is referred to as A & N(es) in the following. Moreover, a rule of thumb (Böhm and Schuh 2013, Eq. 51, p89) stating that ZWD in [cm] approximately equals the water vapor pressure in [hPa] is called ZWD-es in the following.

Validation with sounding profiles

In this subsection, the ML-based models and other formulae for ZWD determination were evaluated using radiosonde data at 609 near-globally distributed sites in 2020. The overall statistics of ZWD residuals between the modeled values and radiosonde-derived results at all 609 sites are presented in Table 2. It should be noted that the surface meteorological data used in the ML-based models (RF-A, RF-B, BPNN-A, and BPNN-B), ZWD-es, Saastamoinen (1972) model, and A&N(es) are retrieved from the lowest data in the sounding profiles.

Table 2 Statistics of ZWD differences between modeled values and radiosonde-derived results at 609 sites in 2020

As illustrated in Table 2, the GPT3 model estimates ZWD values at all sites with a bias of − 0.5 cm and an overall RMS error of 4.2 cm. Considering surface water vapor pressure and surface temperature, the global accuracy of ZWD modeling for the ML-based models and other formulae have improved with different magnitudes. Taking the integrals of sounding profiles as the benchmark, the Saastamoinen and ZWD-es achieve absolute biases of 0.7–0.9 cm and overall accuracies of 4.1 cm and 4.0 cm, respectively, around 0.1–0.2 cm improvement relative to the GPT3. They are unable to accurately describe vertical humidity conditions in the lower troposphere even with the support of surface meteorological information. On the other hand, compared to GPT3, the ZWD accuracy gained by the RF and BPNN regression analysis approaches has significantly improved by taking surface meteorological parameters into account, especially surface water vapor pressure. We can see that the RF-A and RF-B models better reproduce time-varying ZWD signals with biases close to zero and overall agreements of 3.2 cm and 3.1 cm, respectively. The BPNN-A and BPNN-B achieve biases of around 0.1 cm and overall RMS errors of 3.4 cm and 3.3 cm, respectively. In general, the RF-B is superior to the GPT3, Saastamoinen model, ZWD-es, and A&N(es), with overall RMS reductions of around 26.0%, 23.8%, 22.9%, and 12.2%, respectively.

To study the spatial characteristics of ZWD accuracy, we calculated site-wise ZWD residuals between modeled values and radiosonde data at each epoch in 2020. Then, the statistics indicators for these models and formulae were derived at each site using the ZWD residuals. Figures 4 and 5 display site-wise biases and RMS errors of ZWD values between model estimations and radiosonde-derived results in 2020, respectively.

Fig. 4
figure 4

Global distribution of mean ZWD errors between modeled values and radiosonde data in centimeters for the year 2020

Fig. 5
figure 5

RMS of ZWD residuals between modeled values and radiosonde data in centimeters for the year 2020

As illustrated in Fig. 4, the Saastamoinen model and ZWD-es overestimate ZWD values by 2–4 cm at a few stations on the island and coastal areas in low latitudes. On the other hand, they exhibit negative biases with absolute magnitudes larger than 3 cm in South China, Southeast Asia, and northern South America although the surface water vapor pressure data are utilized for the ZWD calculation. A similar spatial distribution of negative anomalies can also be observed for GPT3. This phenomenon indicates that GPT3, the Saastamoinen model, and ZWD-es are unable to well represent the short-term and complex variations of atmospheric moisture in regions with monsoon climate and tropical rainfall climate patterns. For A&N(es) and BPNN-based models, we find negative and positive biases with absolute magnitudes of 1–3 cm at a few sites even though the measured surface water vapor pressure data are provided. On the other hand, we reveal that the RF-based models yield ZWD values with absolute biases within 1 cm for the majority of sites over the globe. The RF-based and BPNN-based models mitigate large positive and negative biases as present in GPT3, the Saastamoinen model, and ZWD-es, which to some extent indicates they have improved capability to model short-term and complex variations of the moisture in the troposphere.

Concerning the RMS errors, we can find from Fig. 5 that the GPT3 model performs worse in estimating ZWD values in regions with monsoon climate and tropical rainforest climate types, including southeastern China, southeastern North America, and southern South America. It has large RMS errors exceeding 5.5 cm in regions with monsoon climate patterns and better than 4.5 cm for the other regions. This accuracy distribution is related to the realization of the GPT3 model. The model coefficients of the GPT3 are determined based on the monthly ERA-Interim reanalysis data and they only take into account the climatological mean and seasonal variations of meteorological variables. Thus, the GPT3 model cannot well reproduce short-term, complex, and abnormal variations of atmospheric moisture, in particular in regions with specific climate patterns.

Furthermore, the Saastamoinen model and ZWD-es perform better in representing ZWD values at sites in North Asia, Europe, and North America in contrast to the GPT3 model, but they still exhibit large RMS errors of 5.5–8 cm in South Asia and northern South America even though surface water vapor pressure data are used in the ZWD calculation. On the other hand, we find that the ZWD accuracy obtained by the ML-based models has improved significantly across the globe compared to GPT3 due to the introduction of surface meteorological parameters in the modeling. The RF-based models (RF-A and RF-B) and BPNN-based models (BPNN-A and BPNN-B) yield time-varying ZWD signal with accuracies better than 3 cm for the middle and high latitudes of the northern hemisphere, while they achieve accuracies of 3–5 cm for the regions with monsoon climate patterns. In addition, the A&N(es) presents a consistent spatial distribution of ZWD accuracy as the RF-based and BPNN-based models and outperforms the Saastamoinen model and ZWD-es in regions with specific climate types.

To study temporal features of ZWD accuracy, we calculated sub-daily ZWD residuals between modeled values and radiosonde data at each site. Then, the daily biases and RMS errors were computed using the ZWD residuals at sites for each hemisphere as presented in Fig. 6. Generally, the Saastamoinen model and ZWD-es underestimate ZWD values for all seasons in both hemispheres except for the southern hemisphere winter. They exhibit daily biases within 2 cm in the northern hemisphere and between − 3 cm and 1.5 cm in the southern hemisphere. Moreover, the GPT3 model also presents daily biases with magnitudes of − 1.5 to 0.5 cm in the northern hemisphere and -2 to 0.5 cm in the southern hemisphere. On the other hand, the RF-based and BPNN-based models can estimate time-varying ZWD signals with daily biases fluctuating within 0.5 cm in the northern hemisphere and within 1 cm in the southern hemisphere.

Fig. 6
figure 6

Time series of biases and RMS errors for these ZWD models in 2020

Furthermore, the ZWD accuracy of these models presents seasonal variations for both hemispheres with relatively large/small RMS errors in the summer/winter hemispheres. The temporal feature of the ZWD accuracy is related to seasonal variations of water vapor in the troposphere since the holding capability of atmospheric moisture varies with respect to the air temperature according to the Clausius–Clapeyron law. Concerning the ZWD modeling with surface meteorological parameters, one can see that the RF-based ZWD models considering meteorological parameters achieve better daily accuracy than other approaches and better represent daily fluctuations of atmospheric water vapor. Moreover, ZWD daily accuracy varies from 2 to 5 cm in the northern hemisphere, which is superior to that in the southern hemisphere (RMS: 3–6 cm). The daily accuracy degraded in the southern hemisphere relative to the northern hemisphere can be attributed to two factors. On the one hand, most radiosonde sites in the southern hemisphere are distributed in coastal areas and islands, which are frequently affected by complex weather conditions. On the other hand, fewer sounding profiles are collected in the southern hemisphere compared to the northern hemisphere, implying radiosonde data used in the modeling are relatively sparse in the southern hemisphere.

Validation with GPS RO observations

In this subsection, the performance of the ML-based models and other formulae for ZWD calculation was validated by using global atmospheric profiles obtained from GPS RO observations of MetOp-A and MetOp-B satellites in 2020. The statistics of ZWD residuals between modeled values and GPS RO data are presented in Table 3

Table 3 Statistics of ZWD residuals of these models and formulae in contrast to GPS RO measurements in 2020

As illustrated in Table 3, the GPT3 model estimates ZWD values over the globe with a mean bias of 0.3 cm and an overall RMS error of 3.2 cm in contrast to GPS RO atmospheric profiles in 2020. The Saastamoinen model and ZWD-es formulae obtain biases of around 0.5 cm and overall accuracies of 3.0 to 3.1 cm. Compared with GPT3, the ZWD accuracy gained by both models has slightly improved with the support of the surface meteorological data. Nevertheless, the improvement in ZWD modeling for both models is still limited because the surface meteorological circumstance cannot represent the whole vertical distribution of the moisture in the lower troposphere. On the other hand, we can see that the A&N(es) model taking the vertical humidity distribution into account can calculate ZWD values over the globe with a bias of − 0.3 cm and an overall RMS error of 2.6 cm, an accuracy improved by 20.1%, 13.7%, and 20.0% in comparison with the GPT3, ZWD-es, and Saastamoinen model, respectively. Concerning the ML-based model, we can find that RF-A and RF-B models exhibit the best agreement with the integrals of GPS RO atmospheric profiles and they yield ZWD estimations with biases close to zero and overall accuracies of 2.4 cm. The ZWD accuracy for both models has improved by around 19.7%, 21.8%, and 7.0% in contrast to the ZWD-es, Saastamoinen model, and A&N(es), respectively. Moreover, the BPNN-A and BPNN-B achieve overall agreements of 2.6 cm in comparison with the GPS RO measurements, whose accuracy is identical to that of A&N(es). We can see that the RF nonlinear regression approach is superior to the BPNN method in terms of the tropospheric ZWD modeling with surface meteorological parameters.

Figure 7 displays the time series of daily ZWD biases for these models relative to the GPS RO measurements in 2020. The upper and lower panels denote the results gained by using global atmospheric profiles from GPS RO observations of the MetOp-A and MetOp-B satellites, respectively. As can be seen from Fig. 7, the GPT3 model overestimates ZWD values in contrast to the integrals of GPS RO atmospheric profiles for MetOp-A and MetOp-B satellites. It exhibits positive daily biases generally varying from 0 to 0.8 cm throughout the year 2020. In addition, ZWD-es, Saastamoinen, and A&N(es) that are dependent on surface meteorological parameters underestimate time-varying ZWD signals and have positive biases of 0 to 1 cm, 0 to 1 cm, and 0 to 0.5 cm, respectively. Concerning the ML approaches, we can see that the RF-based (RF-A and RF-B) and BPNN-based (BPNN-A and BPNN-B) models perform better in modeling time-varying ZWD signals and they exhibit relatively small biases with magnitudes within 0.3 cm throughout the year 2020.

Fig. 7
figure 7

Daily ZWD biases between model estimations and GPS RO measurements in 2020

Figure 8 shows the daily RMS errors of ZWD model estimations relative to the GPS RO observations in 2020. The upper and lower panels denote the results validated based on the GPS RO atmospheric profiles of the MetOp-A and MetOp-B satellites, respectively. We can see that the GPT3 empirical model estimates time-varying ZWD values with daily RMS errors of 3.0–4.0 cm throughout the year 2020. The relatively worse accuracy of GPT3 is due to its model coefficients of meteorological variables, which cannot well reflect the humidity distribution and variations in the lower troposphere, especially short-term fluctuations. Compared with the GPT3 model, the Saastamoinen model and ZWD-es determine ZWD based on the surface meteorological data and obtain daily accuracies of 2.8–3.7 cm during the year 2020. Furthermore, we can find that daily accuracy gained by the ML-based models and A&N(es) is superior to other traditional ZWD models based on surface meteorological data. For example, the RF-based models (RF-A and RF-B) yield time-varying ZWD signals with daily RMS errors fluctuating between 2.0 and 2.7 cm throughout the year 2020, which has improved by 0.8 to 1.0 cm in contrast to the Saastamoinen model and ZWD-es. This effect appears because they take into account the vertical humidity distribution features or the relationship between ZWD and the meteorological parameters at different heights of the troposphere.

Fig. 8
figure 8

Daily RMS of ZWD residuals between modeled values and GPS RO measurements in 2020. The reason for the lower RMS with MetOp-A in June is not known and needs further investigations

Conclusions

The tropospheric zenith wet delay is a crucial parameter for space geodetic and meteorological applications. In this study, global modeling of the tropospheric ZWD was realized based on the surface meteorological data obtained from radiosonde data and GPS RO measurements through two ML regression approaches: the RF and BPNN. The performance of the ML-based models and other formulae for ZWD calculation was validated by using global atmospheric profiles from sounding data and GPS RO measurements in 2020. Our results show that the RF-B can achieve a bias close to zero and an overall accuracy of 3.1 cm in comparison with the radiosonde-derived results and 2.4 cm in contrast to the GPS RO measurements of the MetOp satellites. The ZWD accuracy for RF-B has improved by 23.8% against Saastamoinen and 12.2% against A&N(es) when radiosonde data are used in the validation, while its accuracy has improved by 21.8% against Saastamoinen and 7% against A&N(es) when GPS RO data are employed in the validation. Furthermore, the RF-based models can achieve better accuracy in terms of global modeling for the tropospheric ZWD than the BPNN-based approaches and other formulae with surface meteorological parameters. Concerning the spatial features of accuracy, the ML-based models clearly mitigate the negative biases in regions with monsoon climate and tropical rainforest climate patterns. Compared with the GPT3 empirical model, the accuracy of the ZWD modeling based on ML approaches has significantly improved across the globe by taking into account the surface meteorological parameters, especially in regions with monsoon climate types.