1 Introduction

Statistical analysis of spatial–temporal data is an indispensable issue in practice. It enables adequately modeling stochastic phenomena considering both time and space dependencies. Not long ago, the data’s spatial dimension (geographic dimension) has been incorporated into the Markov chain (MC) framework. Rey (2001) had developed a spatial Markov chain (MC) methodology for analyzing the spatial–temporal dynamics of random phenomena, which had been applied in a diversity of several fields, including GDP disparities among European regions (Le Gallo, 2004), manufacturing in Brazilian regions (Schettini et al., 2011), regional wealth disparity in Zhejiang, China (Yue et al., 2014), pro-environmental behavior in Italian provinces (Agovino et al., 2016), foreign direct investment in Mexico states (Torres Preciado et al., 2017), electric vehicle charging in cities (Shepero & Munkhammar, 2018), proximity effects in the US on obesity epidemic rates (Agovino et al., 2019), air pollution index in Peninsular Malaysia (Alyousifi et al., 2020), drought class transitions in Southwest China (Yang et al., 2020), COVID-19 dynamics in Asian countries (Dehghan Shabani & Shahnazi, 2020). Despite the long history of using this methodology, two interesting problems raise to the surface. Firstly, the spatial MC methodology inserts the spatial effects in MC analysis by conditioning (the probability of transmission of a specific phenomenon in a region toward one of its classes (states)) to (an average of the classes of the same phenomenon in its communicated regions) at the current period rather than directly to (the classes of the phenomenon in the communicated regions themselves). Secondly, it supposes that the spatial and temporal dependencies are of first-order only because it was constructed based on the first order MC.

To fill this gap, we suggest a new strategy for analyzing the evolution of random phenomena over time and space based on the high (\(r\)-th) order multivariate MC (HOM-MC) framework. It implies that the probability distribution of the phenomenon’s classes in each region of \(m\) communicated regions at time \(n+1\) can be estimated by the phenomenon’s classes in all \(m\) regions (including itself) at times \(n,n-1,\dots ,n-r+1\). Thus, it can provide a more inclusive analysis for the high depth transitional dynamics of geographical dimensions. Hence, it provides a useful tool to analyze the spatial and temporal dynamics of a phenomenon simultaneously. In practice, it is intractable to utilize the full parameterized HOM-MC model with \(s\) possible states due to its excessive number of parameters, \(m{s}^{r m+1}\), that increases exponentially when both \(r\) and \(m\) grow (cf. Ching et al., 2013). Thence, the so-called parsimonious “reduced-parametric” modeling is highly crucial for facilitating the numerical computations and the results’ interpretation. The parsimonious models for HOM-MC had been developed within the last decade. In 2008, Ching et al. suggested a reduced-parametric HOM-MC model based on the idea of mixture transition distribution. It is an extension of their parsimonious models for both the multivariate MC case (Ching et al., 2002) and the high-order MC case (Ching et al., 2004) “as a generalization of Raftery (1985), which is an extension of Pegram (1980)”. The sufficient conditions for the model’s stationary property were discussed by Zhu and Ching (2011) and recently by Li et al. (2019) as an application of their proposed tensor eigenvalue problem. Also, its applications have been paid attention to from some recent publications (cf. Cechin & Corso, 2019; Oyieke & Inambao, 2019). Although the mentioned HOM-MC model has captured a considerably lower number of parameters, \(r{m}^{2}\left({s}^{2}+1\right)\), than those of the full parameterized model, it has focused only on the positive correlations among the chains (so here denoted as \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\)). For the same aspect, the modeling of HOM-MC with further parsimony has been discussed in some contemporary publications (cf. Elshehawey & Qian, 2021).

The present article is devoted to proposing a novel discrete-time HOM-MC model for analyzing the spatial–temporal dynamics of random phenomena. It can gather parsimony (i.e., involves a remarkably reduced number of parameters, especially when \(r\), \(m\), and \(s\) grow) with realism (i.e., handles both negative and positive associations among the chains). For simplicity, we will denote it by \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) for some \(s\beta >1\). The parameters’ number incorporated in the proposed model is \(r{m}^{2}\left({s}^{2}+2\right)\) only, significantly lower than those of the full parameterized model. A particular case when \(r=1\), \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(1,\beta \right)\), coincides with that of Wang and Huang (2013). We provide the convergence conditions that guarantee the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) stationarity together with simple convergence conditions to reduce the intricacy and speed up the computations. Its parameter estimation procedures, based on maximum likelihood estimation (MLE) beside linear programming (LP), are also discussed. After that, a Monte Carlo simulation is implemented for manifesting the efficiency of approximating the actual transition probabilities of the HOM-MC model (that can describe the space–time dynamics) by the corresponding transition probabilities calculated from the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\), in comparison with those estimated from the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\). A full description of the performed algorithm is also introduced.

In light of the rapid prevalence of coronavirus disease 2019 (COVID-19) at a universal scale, which started to attack humans at the end of 2019, there is an increasing tendency to conduct studies not only on the clinical knowledge of the epidemic but even on applying mathematical models for predicting the probable prevalence of this global pandemic (cf. Al-qaness et al., 2020; Ceylan, 2020; He et al., 2020; Jia et al., 2020; Maier & Brockmann, 2020). Grasping the spatial–temporal dynamics of COVID-19 is indispensable for its control because it aids in clarifying the impact and extent of the pandemic, as well as it can help planning, decision-making, and community action. For a comprehensive discussion of the spatial–temporal dynamics of COVID-19, we recommend the article of Franch-Pardo et al. (2020) and the papers therein. Here, we apply the proposed \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) to analyze the spatial–temporal dynamics of COVID-19 outbreak risk level through the six world health organization (WHO) regions: European (EURO), Americas (PAHO), Eastern Mediterranean (EMRO), Western Pacific (WPRO), South-East Asia (SEARO), and African (AFRO), over the period from 21 January 2020 to 9 August 2020. It is an attempt to provide a valid and objective tool for evaluating the risk level of spreading COVID-19 infection over the world to verify the effectiveness of the strategic plan for infection control and quarantine. Our findings show clear evidence for the excellent performance of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) in analyzing the spatial–temporal dynamics of the COVID-19 outbreak, which could inspire the institutions responsible for public health and infection control to improve their strategies with reliable monitoring of pandemic spread. As far as we know, the proposed methodology is novel in the field.

Concerning this article’s structure: In Sect. 2, we introduce the HOM-MC framework for analyzing spatiotemporal dynamics of random phenomena. In Sect. 3, the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) is suggested for analyzing spatiotemporal dynamics, and its convergence conditions are provided. Section 4 presents the estimation procedures for the parameters of the proposed model. Section 5 exhibits the empirical research, where a Monte Carlo simulation experiment is performed, and an application to analyze the spatial–temporal dynamics of COVID-19 outbreak over the world is given. The final section concludes the article.

2 HOM-MC framework for spatiotemporal dynamics

This section introduces the high-order multivariate MC (HOM-MC) framework for analyzing spatial–temporal dynamics. Assume that there are \(m\ge 2\) communicated regions. The class of a phenomenon occurring in the \(k\)-th region at period \(n\) represents a random variable, \({X}_{n}^{\left(k\right)}\). The chain \({\left\{{X}_{n}^{\left(k\right)}\right\}}_{n=1}^{N}\) denotes the classifying process of the underlying random phenomenon in the \(k\)-th region for each \(k=\mathrm{1,2},\dots ,m\). Every chain \({\left\{{X}_{n}^{\left(k\right)}\right\}}_{n=1}^{N}\) has the same number of possible mutually exclusive classes, \(s\), and all the chains for these \(m\) communicated regions are expected to be related to each other.

Consider \(\left\{\left({X}_{n}^{\left(1\right)},{X}_{n}^{\left(2\right)},\dots ,{X}_{n}^{\left(m\right)}\right);\hspace{0.33em}n\in \left\{\mathrm{1,2},\dots ,N\right\}\right\}\) be a discrete-time multivariate MC, where \({X}_{n}^{\left(k\right)}\) (for \(k=\mathrm{1,2},\dots ,m\)) has values in \(\mathcal{S}=\left\{\mathrm{1,2},\dots ,s\right\}\) as a state space. We have an \(r\)-th order homogenous multivariate MC if the following assumption hold, for \(k=\mathrm{1,2},\dots ,m\),

$$\mathrm{Pr}\left({X}_{n+1}^{\left(k\right)}={l}_{k}|\mathop {\overset{\rightharpoonup}{\bf X}_{n}}=\mathop {\overset{\rightharpoonup}{\bf h}_{1}},\mathop {\overset{\rightharpoonup}{\bf X}_{n-1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{2}},\dots ,\mathop {\overset{\rightharpoonup}{\bf X}_{n-r+1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{r}},\dots ,\mathop {\overset{\rightharpoonup}{\bf X}_{1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{n}}\right)$$
$$=\mathrm{Pr}\left({X}_{n+1}^{\left(k\right)}={l}_{k}|\mathop {\overset{\rightharpoonup}{\bf X}_{n}}=\mathop {\overset{\rightharpoonup}{\bf h}_{1}},\mathop {\overset{\rightharpoonup}{\bf X}_{n-1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{2}},\dots ,\mathop {\overset{\rightharpoonup}{\bf X}_{n-r+1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{r}}\right)\equiv {p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}, r\ge 1$$
(1)

where \(\mathop {\overset{\rightharpoonup}{\bf X}_{n}}=\left({X}_{n}^{\left(1\right)},{X}_{n}^{\left(2\right)},\dots ,{X}_{n}^{\left(m\right)}\right)\), \(\mathop {\overset{\rightharpoonup}{\bf h}_{i}}=\left({h}_{1,i},{h}_{2,i},\dots ,{h}_{m,i}\right)\), \(\boldsymbol{\mathcal{J}}=\mathop {\overset{\rightharpoonup}{\bf h}_{1}},\mathop {\overset{\rightharpoonup}{\bf h}_{2}},\dots ,\mathop {\overset{\rightharpoonup}{\bf h}_{r}}\), \({X}_{n+1}^{\left(k\right)}\) is the class of the considered phenomenon in \(k\)-th region at time \(n+1\), and \({l}_{k},{h}_{j,i}\in \mathcal{S}\) for \(j=\mathrm{1,2},\dots ,m\) and \(i=\mathrm{1,2},\dots ,r\). That is, the future state of the studying phenomenon in the \(k\)-th region relies only on the immediately \(r\) previous states of the same phenomenon in all \(m\) communicated regions. In other words, the transitional probabilities of a region’s state in the next epoch \(n+1\) are conditioned by the states of all communicated regions (including itself) in the epochs \(n,n-1,\dots ,n-r+1\).

Here, the probabilities \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) in (1) represent the conditional probabilities that the considered phenomenon in the \(k\)-th region will be in class \({l}_{k}\) at period \(n+1\), given that the same phenomenon in all \(m\) regions are in classes \(\mathop {\overset{\rightharpoonup}{\bf h}_{1}},\mathop {\overset{\rightharpoonup}{\bf h}_{2}},\dots ,\mathop {\overset{\rightharpoonup}{\bf h}_{r}}\) at periods \(n,n-1,\dots ,n-r+1\), respectively. The transition probability matrices (TPMs) which characterize the \(r\)-th order multivariate MC model for \(m\) communicated regions, each has the same number of classes \(s\) includes all the probabilities of transition from \(\mathop {\overset{\rightharpoonup}{\bf X}_{n}}, \mathop {\overset{\rightharpoonup}{\bf X}_{n-1}},\dots ,\mathop {\overset{\rightharpoonup}{\bf X}_{n-r+1}}\) to \({X}_{n+1}^{\left(k\right)}\) (for \(k=\mathrm{1,2},\dots ,m\)), that can be arranged in \(m\) rectangular matrices each of size \({s}^{rm}\times s\), in which the sum of each row equals one:

$${\varvec{T}}{{\varvec{P}}}_{k}={\left({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{{s}^{rm}\times s},\hspace{1em}k=\mathrm{1,2},\dots ,m,$$
(2)

where

$${\sum }_{{l}_{k}=1}^{s}{p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}=1,\hspace{1em}0\le {p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\le 1,\hspace{1em}\forall \hspace{0.33em}1\le {h}_{j,i}\le s,\hspace{0.33em}1\le j\le m,\hspace{0.33em}1\le i\le r.$$
(3)

From (2), it is noted that analyzing the spatial–temporal dynamics for a phenomenon using a full parameterized HOM-MC model requires estimating \(m\) of the TPMs, \({\varvec{T}}{{\varvec{P}}}_{k}\); each contains \({s}^{rm+1}\) of transition probabilities (parameters of the model). This number grows exponentially due to increasing the number of regions, \(m\), and lags, \(r\). As a result, there is an obstacle that discourages from using the HOM-MC model directly in its full parameterized version because of its related complicated computations. Therefore, the so-called parsimonious “reduced-parametric” modeling is essential to simplify computations and facilitate the results’ interpretation. To fill this gap, we propose a new discrete-time \(r\)-th order multivariate MC model, which gathers parsimony (involves only \(r{m}^{2}\left({s}^{2}+2\right)\) of parameters) with realistic (handles both negative and positive associations among the states of chains), to facilitate analyzing the evolution of a phenomenon over time and space concurrently. For simplicity, we will denote it by \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) for some \(s\beta >1\).

3 The \(\varvec{\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,{\beta}\right)}\) for analyzing spatiotemporal dynamics

In this section, we suggest governing the temporal and spatial evolutions of the classes of a phenomenon in \(m\) communicated regions by the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\):

(4)

for \(k=\mathrm{1,2},\dots ,m\), where \({\mathbf{X}}_{n+1}^{\left(k\right)}\) represents the phenomenon’s classes probability distribution at period \(n+1\) for the region \(k\), \({{\varvec{P}}}_{i}^{\left(k,j\right)}\) describes the transmission over \(i\) lag from the phenomenon’s classes in the region \(j\) at period \(n-i+1\) to the phenomenon’s classes in the region \(k\) at period \(n+1\), \({\mathbf{e}}_{s}\) is an all-ones column vector, as well as \({\lambda }_{k,j}^{\left(i\right)}\) and \({\lambda }_{k,-j}^{\left(i\right)}\) are the weights describing the positive and negative effects of \({{\varvec{P}}}_{i}^{\left(k,j\right)}{\mathbf{X}}_{n-i+1}^{\left(j\right)}\) on \({\mathbf{X}}_{n+1}^{\left(k\right)}\), respectively, that satisfy

$${\sum }_{j=-m}^{m}{\sum }_{i=1}^{r}{\lambda }_{k,j}^{\left(i\right)}=1,\hspace{1em}{\lambda }_{k,j}^{\left(i\right)}\ge 0,\hspace{1em}\forall 1\le k\le m,\hspace{0.33em}1\le \left|j\right|\le m,1\le i\le r.$$
(5)

In the right-hand side of formula (4), the part \({\left(s\beta -1\right)}^{-1}\left(\beta {\mathbf{e}}_{s}-{\mathbf{X}}_{n-i+1}^{\left(j\right)}\right)\) is used to model the negative correlations between \({\mathbf{X}}_{n+1}^{\left(k\right)}\) and \({\mathbf{X}}_{n-i+1}^{\left(j\right)}\), where the factor \({\left(s\beta -1\right)}^{-1}\) is the normalization constant to ensure that \({\mathbf{X}}_{n-i+1}^{\left(j\right)}\) (\(\forall j\)) is a probability vector. According to this part, a raise (lowering) in \({\mathbf{X}}_{n-i+1}^{\left(j\right)}\) results in a lowering (raise) in \({\mathbf{X}}_{n+1}^{\left(k\right)}\).

The \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) implies that the phenomenon’s classes probability distribution at time \(n+1\) for the region \(k\) can be estimated by the phenomenon’s classes of all \(m\) communicated regions (including itself) at times \(n,n-1,\dots ,n-r+1\) via \(r\) lags TPM. For more details, the TPM, \({{\varvec{P}}}_{i}^{\left(k,j\right)}\), provides information about how the phenomenon’s classes in the region (\(j\)) at time \(n-i+1\) affect the transmission of the phenomenon’s classes at time \(n+1\) in the region (\(k\)). It takes the form

$${{\varvec{P}}}_{i}^{\left(k,j\right)}={\left({p}_{{l}_{k}|{h}_{j,i}}\right)}_{s\times s}, 1\le {l}_{k},{h}_{j,i}\le s,\hspace{0.33em}1\le k,j\le m,\hspace{0.33em}1\le i\le r,$$
(6)

where \({p}_{{l}_{k}|{h}_{j,i}}=\mathrm{Pr}\left({X}_{n+1}^{\left(k\right)}={l}_{k}|{X}_{n-i+1}^{\left(j\right)}={h}_{j,i}\right)\) is the \(i\)-th lag transition probability from the phenomenon’s class \({h}_{j,i}\) in the region (\(j\)) at time \(n-i+1\) to the phenomenon’s class \({l}_{k}\) in the region (\(k\)) at time \(n+1\) for \({l}_{k},{h}_{j,i}\in \mathcal{S}\), where

$$\sum_{{l}_{k}=1}^{s}{p}_{{l}_{k}|{h}_{j,i}}=1,\hspace{1em}0\le {p}_{{l}_{k}|{h}_{j,i}}\le 1,\hspace{1em}\forall {l}_{k},{h}_{j,i}\in \mathcal{S},1\le k,j\le m,1\le i\le r.$$
(7)

When \(k=j\), the TPM \({{\varvec{P}}}_{i}^{\left(k,k\right)}\) expresses the higher-order inter-dependency among the phenomenon’s classes of the \(k\)-th region. If \(k\ne j\), the TPM \({{\varvec{P}}}_{i}^{\left(k,j\right)}\) represents the higher-order intra-dependency among the phenomenon’s classes from the \(j\)-th to \(k\)-th communicated regions. These higher-order inter-dependencies and intra-dependencies of the phenomenon’s classes for the regions can be thought of as the temporal and spatial (cross-sectional) dependencies of the phenomenon’s classes in the communicated regions, respectively.

In the matrix form, the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) in (4) can be rewritten as

(8)

where

$${\mathbf{Y}}_{n}={\left(\begin{array}{c}{\mathbf{Y}}_{n}^{\left(1\right)}\\ {\mathbf{Y}}_{n}^{\left(2\right)}\\ \vdots \\ {\mathbf{Y}}_{n}^{\left(m\right)}\end{array}\right)}_{srm\times 1}{\text{with}}\; {\mathbf{Y}}_{n}^{\left(k\right)}={\left(\begin{array}{c}{\mathbf{X}}_{n}^{\left(k\right)}\\ {\mathbf{X}}_{n-1}^{\left(k\right)}\\ \vdots \\ {\mathbf{X}}_{n-r+1}^{\left(k\right)}\end{array}\right)}_{rs\times 1}{\text{for}} \;k=\mathrm{1,2},\dots ,m,$$
(9)
$${{\varvec{B}}}^{\pm }={\left({{\varvec{B}}}_{sr\times sr}^{\left(k,\pm j\right)}\right)}_{m\times m}={\left(\begin{array}{cccc}{{\varvec{B}}}^{\left(1,\pm 1\right)}& {{\varvec{B}}}^{\left(1,\pm 2\right)}& \cdots & {{\varvec{B}}}^{\left(1,\pm m\right)}\\ {{\varvec{B}}}^{\left(2,\pm 1\right)}& {{\varvec{B}}}^{\left(2,\pm 2\right)}& \cdots & {{\varvec{B}}}^{\left(2,\pm m\right)}\\ \vdots & \vdots & \ddots & \vdots \\ {{\varvec{B}}}^{\left(m,\pm 1\right)}& {{\varvec{B}}}^{\left(m,\pm 2\right)}& \cdots & {{\varvec{B}}}^{\left(m,\pm m\right)}\end{array}\right)}_{srm\times srm},$$
(10)

such that, if \(k=j\):

$${{\varvec{B}}}^{\left(k ,k\right)}={\left(\begin{array}{ccccc}{\lambda }_{k ,k}^{\left(1\right)} {{\varvec{P}}}_{1}^{\left(k,k\right)}& {\lambda }_{k ,k}^{\left(2\right)} {{\varvec{P}}}_{2}^{\left(k,k\right)}& \cdots & {\lambda }_{k ,k}^{\left(r-1\right)} {{\varvec{P}}}_{r-1}^{\left(k,k\right)}& {\lambda }_{k ,k}^{\left(r\right)} {{\varvec{P}}}_{r}^{\left(k,k\right)}\\ {{\varvec{I}}}_{s}& \mathbf 0& \cdots & \mathbf 0& \mathbf 0\\ \mathbf 0& {{\varvec{I}}}_{s}& \cdots & \mathbf 0& \mathbf 0\\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \mathbf 0& \cdots & \mathbf 0& {{\varvec{I}}}_{s}& \mathbf 0\end{array}\right)}_{s r\times s r},$$
$${{\varvec{B}}}^{\left(k ,-k\right)}={\left(\begin{array}{ccccc}{\lambda }_{k ,-k}^{\left(1\right)} {{\varvec{P}}}_{1}^{\left(k,k\right)}& {\lambda }_{k, -k}^{\left(2\right)} {{\varvec{P}}}_{2}^{\left(k,k\right)}& \cdots & {\lambda }_{k ,-k}^{\left(r-1\right)} {{\varvec{P}}}_{r-1}^{\left(k,k\right)}& {\lambda }_{k, -k}^{\left(r\right)} {{\varvec{P}}}_{r}^{\left(k,k\right)}\\ {\bf 0}&{ \bf 0}& \cdots &{ \bf 0}& { \bf 0}\\ { \bf 0}&{ \bf 0}& \cdots & { \bf 0}& { \bf 0}\\ \vdots & \vdots & \ddots & \vdots & \vdots \\ { \bf 0}& { \bf 0}& \cdots & { \bf 0}& { \bf 0}'\end{array}\right)}_{s r\times s r},$$

\({{\varvec{I}}}_{s}\) is an \(s\times s\) identity matrix, and if \(k\ne j\):

$${{\varvec{B}}}^{\left(k,\pm j\right)}={\left(\begin{array}{cccc}{\lambda }_{k,\pm j}^{\left(1\right)} {{\varvec{P}}}_{1}^{\left(k,j\right)}& {\lambda }_{k ,\pm j}^{\left(2\right)} {{\varvec{P}}}_{2}^{\left(k,j\right)}& \cdots & {\lambda }_{k ,\pm j}^{\left(r\right)} {{\varvec{P}}}_{r}^{\left(k,j\right)}\\ { \bf 0}& { \bf 0}& \cdots &{ \bf 0}\\ \vdots & \vdots & \ddots & \vdots \\ { \bf 0}&{ \bf 0}& \cdots & { \bf 0}\end{array}\right)}_{s r\times s r}.$$

The following proposition provides the ordinary convergence conditions that ensure the stationary distribution for the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\).

Proposition 1.

The \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) has a stationary distribution under the constraints:

$$s{\sum }_{j=1}^{m}{\sum }_{i=1}^{r}\left|{\lambda }_{k,j}^{\left(i\right)}-\frac{{\lambda }_{k ,-j}^{\left(i\right)}}{s\beta -1}\right|\le \alpha \hspace{1em}\text{for }k=\mathrm{1,2},\dots ,m,$$
(11)

where \(\beta\) is the variability of the convergence condition, and \(\alpha <1\).

We note that Proposition 1, as we will see later in Sect. 4, involves a large number of constraints (\({2}^{rm}\)) for each \(k\), which increases exponentially as \(r\) and \(m\) increase. Hence, it requires a very long time to implement the related computations. To reduce the complexity and speed up the calculations, Proposition 2 introduces simple convergence conditions to guarantee the stationarity for the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\).

Proposition 2.

The matrix \({\varvec{B}}={{\varvec{B}}}^{+}+{{\varvec{B}}}^{-}\) has an eigenvalue equals one, and the spectral radius of the matrix \({\varvec{B}}\), \(\rho \left({\varvec{B}}\right)\le 1\) (means that all the eigenvalues of \({\varvec{B}}\) having modulus less than or equals to one), if \({\lambda }_{k,j}^{\left(i\right)}\) (for \(1\le k,\left|j\right|\le m\) and \(1\le i\le r\)) satisfy the conditions (5), and the iteration of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) is convergent, if.

$${\lambda }_{k,j}^{\left(i\right)}\ge \frac{1}{s\beta -1}{\lambda }_{k,-j}^{\left(i\right)},\hspace{1em}\forall 1\le k,j\le m,\hspace{0.33em}1\le i\le r,s\beta >1.$$
(12)

The upcoming corollary shows an interesting particular case from the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right).\)

Corollary.

When \({\lambda }_{k,-j}^{\left(i\right)}=0 \, \forall 1\le i\le r, \, 1\le k,j\le m\), the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) becomes \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\):

$${\mathbf{X}}_{n+1}^{\left(k\right)}={\sum }_{j=1}^{m}{\sum }_{i=1}^{r}{\lambda }_{k,j}^{\left(i\right)} {{\varvec{P}}}_{i}^{\left(k,j\right)} {\mathbf{X}}_{n-i+1}^{\left(j\right)},\,k=\mathrm{1,2},\dots ,m,$$
(13)

where

$${\sum }_{j=1}^{m}{\sum }_{i=1}^{r}{\lambda }_{k,j}^{\left(i\right)}=1,\hspace{1em}{\lambda }_{k,j}^{\left(i\right)}\ge 0,\hspace{1em}\forall 1\le i\le r, \, 1\le k,j\le m.$$
(14)

which coincides with the \(r\)-th order multivariate MC model in Ching et al. (2008).

4 Parameters estimation procedure

This section provides computational procedures for estimating the parameters \({{\varvec{P}}}_{i}^{\left(k,j\right)}\) as well as \({\lambda }_{k,j}^{\left(i\right)}\) and \({\lambda }_{k,-j}^{\left(i\right)}\) (for \(1\le k,j\le m\) and \(1\le i\le r\)) of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\). The maximum likelihood estimates (MLE) can be acquired for the components of \({{\varvec{P}}}_{i}^{\left(k,j\right)}\) as follows (cf. Ching et al., 2008)

$${\widehat{p}}_{{l}_{k}|{h}_{j,i}}=\left\{\begin{array}{l}\frac{{f}_{{l}_{k},{h}_{j,i}}}{{\sum }_{{l}_{k}=1}^{s}{f}_{{l}_{k},{h}_{j,i}}}, \quad {\sum }_{{l}_{k}=1}^{s}{f}_{{l}_{k},{h}_{j,i}}\ne 0,\\ \frac{1}{s}, \quad \text{ otherwise}\end{array}\right.$$
(15)

where

$${f}_{{l}_{k},{h}_{j,i}}=\sum_{n=i}^{N-1}\mathrm{I}\left({X}_{n+1}^{\left(k\right)}={l}_{k},{X}_{n-i+1}^{\left(j\right)}={h}_{j,i}\right) \forall 1\le {l}_{k},{h}_{j,i}\le s,1\le k,j\le m,1\le i\le r,$$
(16)

and \(\mathrm{I}\left(A\right)\) is an indicator variable equals 1 if \(A\) is true and 0 otherwise.

One procedure for estimating the model parameters, \(\left\{{\lambda }_{k,j}^{\left(i\right)},{\lambda }_{k,-j}^{\left(i\right)}\right\}\equiv \lambda\), is to determine the values of \(\left\{{\lambda }_{k,j}^{\left(i\right)},{\lambda }_{k,-j}^{\left(i\right)}\right\}\) that minimize

$$\Vert {\widehat{{\varvec{B}}}}^{+}\widehat{\mathbf{Y}}+\frac{1}{s\beta -1}\hspace{0.33em}{\widehat{{\varvec{B}}}}^{-}\left(\beta {\mathbf{e}}_{srm}-\widehat{\mathbf{Y}}\right)-\widehat{\mathbf{Y}}\Vert ,$$
(17)

under a particular vector norm \(\Vert \cdot \Vert\), where \(\widehat{\mathbf{Y}}={\left(\begin{array}{cccc}{\left({\widehat{\mathbf{Y}}}^{\left(1\right)}\right)}^{^{\prime}}& {\left({\widehat{\mathbf{Y}}}^{\left(2\right)}\right)}^{^{\prime}}& \cdots & {\left({\widehat{\mathbf{Y}}}^{\left(m\right)}\right)}^{^{\prime}}\end{array}\right)}^{^{\prime}}\) with \({\widehat{\mathbf{Y}}}^{\left(k\right)}={\left(\begin{array}{cccc}{\left({\widehat{\mathbf{X}}}^{\left(k\right)}\right)}^{^{\prime}}& {\left({\widehat{\mathbf{X}}}^{\left(k\right)}\right)}^{^{\prime}}& \cdots & {\left({\widehat{\mathbf{X}}}^{\left(k\right)}\right)}^{^{\prime}}\end{array}\right)}^{^{\prime}}\) is the estimated stationary vector of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\). To get \(\widehat{\mathbf{Y}}\) from the given chains, we first obtain the estimated vectors \({\widehat{\mathbf{X}}}^{\left(k\right)}\) (for \(k=\mathrm{1,2},\dots ,m\)) by calculating the ratio of occurring each state of the state space \(\mathcal{S}\) in each \(k\)-th chain. Then, combining these vectors, \({\left\{{\widehat{\mathbf{X}}}^{\left(k\right)}\right\}}_{k=1}^{m}\), to establish the vector \(\widehat{\mathbf{Y}}\).

Choosing \(\Vert \cdot \Vert\) to be \({\Vert \cdot \Vert }_{\infty }\) norm, and converting the resulting optimization (\(min\)-\(max\)) problem into \(m\) linear programming (LP) problems (cf. Chvatal, 1983, pp. 222–223) based on the ordinary convergence conditions in (11), we have to

$$\underset{{{\varvec{\Lambda}}}_{k}}{min}\hspace{0.33em}{w}^{\left(k\right)}\hspace{1em}\text{s.t.}\hspace{1em}\left\{\begin{array}{l}{w}^{\left(k\right)}\cdot {\mathbf{e}}_{s}\ge \pm {\widehat{\mathbf{X}}}^{\left(k\right)}\mp {{\varvec{C}}}_{k}{{\varvec{\Lambda}}}_{k},\\ {{\varvec{\Lambda}}}_{k}^{^{\prime}}\cdot {\mathbf{e}}_{2rm}=1,\hspace{1em}s{{\varvec{A}}}_{k}\cdot {{\varvec{\Lambda}}}_{k}\le \alpha \cdot {\mathbf{e}}_{{2}^{rm}},\hspace{1em}\alpha <1,\\ {w}^{\left(k\right)}\ge 0,\hspace{1em}{{\varvec{\Lambda}}}_{k}\ge 0.\end{array}\right.$$
(18)

for each \(k=\mathrm{1,2},\dots ,m\), where \({{\varvec{\Lambda}}}_{k}=\left(\begin{array}{c}{{\varvec{\lambda}}}_{k}^{+}\\ {{\varvec{\lambda}}}_{k}^{-}\end{array}\right)\), \({{\varvec{C}}}_{k}=\left(\begin{array}{cc}{{\varvec{C}}}_{k}^{+}& \frac{1}{s\beta -1}{{\varvec{C}}}_{k}^{-}\end{array}\right)\), \({{\varvec{A}}}_{k}=\left(\begin{array}{cc}{{\varvec{A}}}_{1k}& -\frac{1}{s\beta -1}{{\varvec{A}}}_{1k}\end{array}\right)\),

$${{\varvec{\lambda}}}_{k}^{+}={\left(\begin{array}{cccc}\begin{array}{cccc}{\lambda }_{k,1}^{\left(1\right)}& {\lambda }_{k,1}^{\left(2\right)}& \cdots & {\lambda }_{k,1}^{\left(r\right)}\end{array}& \begin{array}{cccc}{\lambda }_{k,2}^{\left(1\right)}& {\lambda }_{k,2}^{\left(2\right)}& \cdots & {\lambda }_{k,2}^{\left(r\right)}\end{array}& \cdots & \begin{array}{cccc}{\lambda }_{k,m}^{\left(1\right)}& {\lambda }_{k,m}^{\left(2\right)}& \cdots & {\lambda }_{k,m}^{\left(r\right)}\end{array}\end{array}\right)}^{\mathrm{^{\prime}}},$$
$${{\varvec{\lambda}}}_{k}^{-}={\left(\begin{array}{cccc}\begin{array}{cccc}{\lambda }_{k,-1}^{\left(1\right)}& {\lambda }_{k,-1}^{\left(2\right)}& \cdots & {\lambda }_{k,-1}^{\left(r\right)}\end{array}& \begin{array}{cccc}{\lambda }_{k,-2}^{\left(1\right)}& {\lambda }_{k,-2}^{\left(2\right)}& \cdots & {\lambda }_{k,-2}^{\left(r\right)}\end{array}& \cdots & \begin{array}{cccc}{\lambda }_{k,-m}^{\left(1\right)}& {\lambda }_{k,-m}^{\left(2\right)}& \cdots & {\lambda }_{k,-m}^{\left(r\right)}\end{array}\end{array}\right)}^{\mathrm{^{\prime}}},$$
$$\begin{aligned}{{\varvec{C}}}_{k}^{+}&=\left({\widehat{{\varvec{P}}}}_{1}^{\left(k,1\right)}{\widehat{\mathbf{X}}}^{\left(1\right)}|\cdots |{\widehat{{\varvec{P}}}}_{r}^{\left(k,1\right)}{\widehat{\mathbf{X}}}^{\left(1\right)}|{\widehat{{\varvec{P}}}}_{1}^{\left(k,2\right)}{\widehat{\mathbf{X}}}^{\left(2\right)}|\cdots |{\widehat{{\varvec{P}}}}_{r}^{\left(k,2\right)}{\widehat{\mathbf{X}}}^{\left(2\right)}|\cdots |{\widehat{{\varvec{P}}}}_{1}^{\left(k,m\right)}{\widehat{\mathbf{X}}}^{\left(m\right)}|\cdots |{\widehat{{\varvec{P}}}}_{r}^{\left(k,m\right)}{\widehat{\mathbf{X}}}^{\left(m\right)}\right),\\ {{\varvec{C}}}_{k}^{-}& =\left({\widehat{{\varvec{P}}}}_{1}^{\left(k,1\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(1\right)}\right)|\cdots |{\widehat{{\varvec{P}}}}_{r}^{\left(k,1\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(1\right)}\right)|{\widehat{{\varvec{P}}}}_{1}^{\left(k,2\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(2\right)}\right)|\cdots |\right. \end{aligned}$$
$$\left.{\widehat{{\varvec{P}}}}_{r}^{\left(k,2\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(2\right)}\right)|\hspace{0.33em}\cdots \hspace{0.33em} |{\widehat{{\varvec{P}}}}_{1}^{\left(k,m\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(m\right)}\right)|\cdots |{\widehat{{\varvec{P}}}}_{r}^{\left(k,m\right)}\left(\beta {\mathbf{e}}_{s}-{\widehat{\mathbf{X}}}^{\left(m\right)}\right)\right),$$

and

$${{\varvec{A}}}_{1k}={\left(\begin{array}{cccccc}1& 1& \cdots & 1& 1& 1\\ 1& 1& \cdots & 1& 1& -1\\ 1& 1& \cdots & 1& -1& 1\\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ -1& -1& \cdots & -1& -1& 1\\ -1& -1& \cdots & -1& -1& -1\end{array}\right)}_{{2}^{rm}\times rm}.$$

Notice that \({{\varvec{A}}}_{1k}\) is a matrix of size \({2}^{rm}\times rm\) with each element \(\in \left\{1,-1\right\}\). This means that the ordinary convergence conditions add a large number of constraints, \({2}^{rm}\), which increases exponentially as \(r\) and \(m\) grow. For example, in constructing a 3rd order model for a group of 6 regions, we have to consider \({2}^{18}={262144}\) of additional constraints in each minimization problem to ensure the model’s stationarity. Hence, these minimization problems require a very long time to be implemented, even if using a computer with high specifications.

To overcome this intricacy, we consider the simple convergence conditions in (12) to reduce the complexity and speed up the convergence of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\). In this case, we have to carry out the following improved \(m\) of LP problems, for each \(k=\mathrm{1,2},\dots ,m\):

$$\underset{{{\varvec{\Lambda}}}_{k}}{min}\hspace{0.33em}{w}^{\left(k\right)}\hspace{1em}\text{s.t.}\hspace{1em}\left\{\begin{array}{l}{w}^{\left(k\right)}\cdot {\mathbf{e}}_{s}\ge \pm {\widehat{\mathbf{X}}}^{\left(k\right)}\mp {{\varvec{C}}}_{k}{{\varvec{\Lambda}}}_{k},\\ {{\varvec{\Lambda}}}_{k}^{^{\prime}}\cdot {\mathbf{e}}_{2rm}=1,\hspace{1em}\boldsymbol{\rm K}\cdot {{\varvec{\Lambda}}}_{k}\le 0,\\ {w}^{\left(k\right)}\ge 0,\hspace{1em}{{\varvec{\Lambda}}}_{k}\ge 0.\end{array}\right.$$
(19)

where \({{\varvec{\Lambda}}}_{k}\) and \({{\varvec{C}}}_{k}\) are as defined before, and \(\boldsymbol{\rm K}=\left(\begin{array}{cc}-{{\varvec{I}}}_{rm}& \frac{1}{s\beta -1}{{\varvec{I}}}_{rm}\end{array}\right)\) for \(s\beta >1\).

5 Empirical research

This section is divided into two subsections. In the first one, we provide a Monte Carlo simulation experiment with a complete description of the performed algorithm. We allocate the second subsection to apply the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) to analyze the spatial–temporal prevalence of the COVID-19 pandemic.

5.1 Monte Carlo simulation experiment

In this subsection, we perform an experiment of Monte Carlo simulation to evaluate the performance of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) in comparison with the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\). The goal is to demonstrate the efficacy of approximating the actual probability, \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\), by the estimated probability from the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\). We consider a simple process of third order MC (\(r=3\)) for two chains (\(m=2\)) with the same length \(N\), each takes values in \(\mathcal{S}=\left\{\mathrm{1,2}\right\}\) (\(s=2\)). The experiment is repeated for \(N=\mathrm{50,100,500,1000,5000},\) and the simulation is done for \({N}_{sim}=10000\) times. Each time, \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) (\(\boldsymbol{\mathcal{J}}={h}_{1,1},{h}_{2,1},{h}_{1,2},{h}_{2,2},{h}_{1,3},{h}_{2,3}\)) are allowed to take different probability values because the simulation outcomes are sensitive to these values. We execute the following algorithm where \(\boldsymbol{\mathcal{J}}={h}_{1,1},{h}_{2,1},{h}_{1,2},{h}_{2,2},{h}_{1,3},{h}_{2,3}\):

Step 1. Set values of \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) (\(k=\mathrm{1,2}\))\(,\) such that (3) holds. Then, set \(t=1\).

Step 2. Simulate a path \(\left\{\left({X}_{n}^{\left(1\right)},{X}_{n}^{\left(2\right)}\right)\right\}\), \(n=\mathrm{1,2},\dots ,N\):

Step 2.1. Initialize the process \(\left\{\left({X}_{n}^{\left(1\right)},{X}_{n}^{\left(2\right)}\right)\right\}\) where \({X}_{n}^{\left(k\right)}\in \mathcal{S}=\left\{\mathrm{1,2}\right\}\) for \(k=\mathrm{1,2}\).

Step 2.2. For each \(k\), generate a random variable \({u}_{k}\sim U\left(\mathrm{0,1}\right)\). Assume that \({X}_{n-i+1}^{\left(1\right)}={h}_{1,i}\) and \({X}_{n-i+1}^{\left(2\right)}={h}_{2,i}\) for all \(i=\mathrm{1,2},3\). Then set

$${X}_{n}^{\left(k\right)}=\left\{\begin{array}{l}1,\hspace{1em}{\text{if}}\hspace{0.33em}{u}_{k}\le {p}_{1|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)},\\ 2,\hspace{1em}{\text{if}}\hspace{0.33em}{p}_{1|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}<{u}_{k}\le {p}_{1|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}+{p}_{2|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}.\end{array}\right.$$
(20)

Step 2.3. Revert to step 2.1, till \(n=N\).

Step 3. Using the generated chains \({\left\{\left({X}_{n}^{\left(1\right)},{X}_{n}^{\left(2\right)}\right)\right\}}_{n=1}^{N}\), estimate \({{\varvec{P}}}_{i}^{\left(k,j\right)}\) by MLE method for \(k,j=\mathrm{1,2}\) and \(i=\mathrm{1,2},3\), then obtain the estimated weights.

  1. (a)

    \({\lambda }_{k,j}^{\left(i\right)}\) and \({\lambda }_{k,-j}^{\left(i\right)}\) by LP method, and obtain from them \({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) using the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(3,\beta \right)\):

    $${\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}={\sum }_{j=1}^{m}{\sum }_{i=1}^{r}\left[\left({\widehat{\lambda }}_{k,j}^{\left(i\right)}-\frac{{\widehat{\lambda }}_{k,-j}^{\left(i\right)}}{s\beta -1}\right){\widehat{p}}_{{l}_{k}|{h}_{j,i}}+\frac{\beta {\widehat{\lambda }}_{k,-j}^{\left(i\right)}}{s\beta -1} {\sum }_{{h}_{j,i}=1}^{s}{\widehat{p}}_{{l}_{k}|{h}_{j,i}}\right], \, \beta >\frac{1}{s}$$
    (21)
  2. (b)

    \({\lambda }_{k,j}^{\left(i\right)}\) from the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(3\right)\) via the corresponding LP procedure, and obtain \({\widehat{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\).

Step 4. Evaluate the accuracy of \({\left({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}\) and \({\left({\widehat{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}\) by making a comparison with the corresponding actual probabilities, \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\), predefined in step 1, via the following statistics for all \({l}_{k}\in \mathcal{S}\), \(k=\mathrm{1,2}\)

$$\left\{\begin{array}{l}{\psi }_{{l}_{k}}^{+,-}\left(t\right)={\sum }_{\boldsymbol{\mathcal{J}}=1}^{2}{\left({\left({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}-{\left({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}\right)}^{2},\\ {\psi }_{{l}_{k}}^{+}\left(t\right)={\sum }_{\boldsymbol{\mathcal{J}}=1}^{2}{\left({\left({\widehat{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}-{\left({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right)}_{t}\right)}^{2}.\end{array}\right.$$
(22)

Step 5. Put \(t=t+1\). Go back to step 1 until \(t={N}_{sim}\).

To evaluate the two estimated probabilities’ performance, we calculate a global average for the statistics stated in step 4, as \({\bar{\psi }}_{{l}_{k}}^{+,-}\) and \({\bar{\psi }}_{{l}_{k}}^{+}\), respectively.

The above algorithm is implemented considering different values of \(\beta\), and the outputs are reported in Table 1 for \({l}_{k}=1\) only (because the outcomes are similar to when \({l}_{k}=2\)).

Table 1 The results of \({\bar{\psi }}_{{l}_{k}}^{+}/{\bar{\psi }}_{{l}_{k}}^{+,-}\) from Monte Carlo simulation

The calculated ratio \({\bar{\psi }}_{{l}_{k}}^{+}/{\bar{\psi }}_{{l}_{k}}^{+,-}\) arising from the performed Monte Carlo simulation experiment shows the good performance of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) to estimate the transition probabilities \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\). It is clear from Table 1 that the estimator \({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) always dominates \({\widehat{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\), since they always have \({\bar{\psi }}_{{l}_{k}}^{+}/{\bar{\psi }}_{{l}_{k}}^{+,-}>1\). Meanwhile, the differences between the two models become greater when the sample size gets smaller. Besides, the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) outperforms the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\) in modeling short chains, i.e., for small \(N\) values.

5.2 Application to the COVID-19 outbreak

In this part, we apply the proposed \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) for analyzing the spatial–temporal dynamics of the COVID-19 outbreak over the world.

5.2.1 Data description and chains formation

We consider the daily prevalence of the COVID-19 epidemic in six regions: African (AFRO), Americas (PAHO), Eastern Mediterranean (EMRO), European (EURO), South-East Asia (SEARO), and Western Pacific (WPRO), according to the partition of the world by WHO to group its member nations. The daily data of cumulative confirmed cases were obtained from the WHO situation reports of COVID-19, numbered from 1 to 202 (Because of the study’s conducting time), downloaded from the WHO official website: www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports. This dataset is for the 202 days from 21 January 2020 to 9 August 2020. For all regions, the number of new confirmed cases every day, obtained by taking the difference between the cumulative values of the consecutive days, are classified into one of the following possible five categories (\(s=5\)) for assessing the risk level of spreading COVID-19 infection.

State 1:

\(\le 50\)(Low-Risk Spread),

State 2:

\(51-500\)(Moderate-Risk Spread),

State 3:

\(501-5000\)(High-Risk Spread),

State 4:

\(5001-50000\)(Very High-Risk Spread),

State 5:

\(>50000\)(Extreme-Risk Spread)

  

Those mentioned above feasible five categories have been realistically defined based upon the observed classes’ midpoints for the counts of COVID-19 confirmed cases, which the WHO had determined in the last considered situation report.

Hence, we have six chains, \({\left\{{X}_{n}^{\left(k\right)}\right\}}_{n=1}^{201}\), \(k=\mathrm{1,2},\dots ,6\), each consists of the same \(s=5\) classes, i.e., the state-space is \(\mathcal{S}=\left\{\mathrm{1,2},\mathrm{3,4},5\right\}\). They describe the classification of COVID-19 outbreak over \(N=201\) consecutive days in \(m=6\) communicated regions: WPRO, EURO, SEARO, EMRO, PAHO, and AFRO, respectively. See (Appendix).

5.2.2 Estimate the stationary probability vectors

The stationary probability distributions of the COVID-19 outbreak’s classes for the six regions \({\mathbf{X}}^{\left(k\right)}(k=\mathrm{1,2},\dots ,6)\) are estimated by calculating the ratio of occurring each class of \(\mathcal{S}\) in each \(k\)-th chain as

$${\widehat{\mathbf{X}}}^{\left(1\right)}={\left(\begin{array}{ccccc}0.0050& 0.0796& 0.8607& 0.0547& 0\end{array}.0000\right)}^{\mathrm{^{\prime}}},$$
$${\widehat{\mathbf{X}}}^{\left(2\right)}={\left(\begin{array}{ccccc}0.1692& 0.0348& 0.0547& 0.7413& 0.0000\end{array}\right)}^{\mathrm{^{\prime}}},$$
$${\widehat{\mathbf{X}}}^{\left(3\right)}={\left(\begin{array}{ccccc}0.2687& 0.0796& 0.1940& 0.3682& 0.0896\end{array}\right)}^{\mathrm{^{\prime}}},$$
$${\widehat{\mathbf{X}}}^{\left(4\right)}={\left(\begin{array}{ccccc}0.1741& 0.0348& 0.2488& 0.5423& 0\end{array}.0000\right)}^{\mathrm{^{\prime}}},$$
$${\widehat{\mathbf{X}}}^{\left(5\right)}={\left(\begin{array}{ccccc}0.2338& 0.0398& 0.0249& 0.2886& 0.4129\end{array}\right)}^{\mathrm{^{\prime}}},$$
$${\widehat{\mathbf{X}}}^{\left(6\right)}={\left(\begin{array}{ccccc}0.2736& 0.1095& 0.3085& 0.3085& 0.0000\end{array}\right)}^{\mathrm{^{\prime}}}.$$

where \({\widehat{\mathbf{X}}}^{\left(k\right)}(k=\mathrm{1,2},\dots ,6)\) are the estimated steady-state distributions of the COVID-19 outbreak’s classes for the WHO regions: WPRO, EURO, SEARO, EMRO, PAHO, and AFRO, respectively. It is noted that

  1. (a)

    The PAHO and SEARO are the only two WHO regions that enter the class “Extreme-Risk Spread” of the risk level of spreading COVID-19 infection with probabilities 0.4129 and 0.0896, respectively.

  2. (b)

    The COVID-19 pandemic is most likely to have a very high-risk spread level in the three WHO regions: EURO, EMRO, and SEARO. The highest probability occurs in the EURO (0.7413) followed by the EMRO (0.5423) then the SEARO (0.3682). The WPRO has a much higher probability (0.8607) to suffer from a high-risk spread level for the COVID-19 pandemic than other levels. While the COVID-19 prevalence in the AFRO is equally likely to be classified as either “High-Risk Spread” or “Very High-Risk Spread.”

Hence, we can arrange the WHO regions according to the classification of the risk level with the highest probability as High-Risk Spread (WPRO—AFRO), Very High-Risk Spread (EURO—SEARO—EMRO—AFRO), and Extreme-Risk Spread (PAHO). This means that compared to the other regions, the WPRO has the best risk level while the PAHO has the worst risk level; because the situation in the WPRO has been positively affected by China’s success to control the pandemic at an early stage of the period under study.

Now, the crucial step in determining the appropriate \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) is to select the adequate values of \(\beta\) and \(r\). We aim first to select the best value of \(\beta\) for each order of the model.

5.2.3 Select \(\varvec{\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,{\beta}\right)}\) with proper \({\beta}\)

The proposed \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\), using the simple convergence conditions in (12), is fitted to the classifying chains of COVID-19 outbreak in the six WHO regions for orders \(r=\mathrm{1,2},\mathrm{3,4}\) with different values of \(\beta\). The performances of the models are evaluated using both the information criteria of Akaike (AIC) “that recommended by Tong (1975)” as well as Bayesian (BIC) “as preferred by Katz (1981)”. These statistics are given by

$$\left\{\begin{array}{l}{\text{AIC}}=-2\widehat{\mathcal{L}}+2q,\\ {\text{BIC}}=-2\widehat{\mathcal{L}}+q\mathit{ln}\left(M\right),\end{array}\right.$$
(23)

where \(M\) is the length of the chain minus the order \(r\), and \(q\) is the number of the estimated non-zero independent parameters, as well as \(\widehat{\mathcal{L}}\) represents the estimated log-likelihood of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\):

$$\widehat{\mathcal{L}}={\sum }_{k=1}^{m}{\sum }_{{l}_{k}=1}^{s}{\sum }_{\mathop {\overset{\rightharpoonup}{\bf h}_{1}}=1}^{s}{\sum }_{\mathop {\overset{\rightharpoonup}{\bf h}_{2}}=1}^{s}\cdots {\sum }_{\mathop {\overset{\rightharpoonup}{\bf h}_{r}}=1}^{s}{n}_{{l}_{k},{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\,\mathit{ln}\left({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\right),$$
(24)

where \({\widetilde{p}}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) is as defined in (21), \({n}_{{l}_{k},{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}=\sum_{n=r}^{N-1}\mathrm{I}\left({X}_{n+1}^{\left(k\right)}={l}_{k},\mathop {\overset{\rightharpoonup}{\bf X}_{n}}=\mathop {\overset{\rightharpoonup}{\bf h}_{1}},\dots ,\mathop {\overset{\rightharpoonup}{\bf X}_{n-r+1}}=\mathop {\overset{\rightharpoonup}{\bf h}_{r}}\right)\) and \(\mathrm{I}\left(A\right)\) is as predefined.

All comparisons between the fitted models (for \(r=\mathrm{1,2},\mathrm{3,4}\)) are carried out “within” the 4th order model, so we have conditioned on the first four values in all data sequences and ignore their contributions to \(\widehat{\mathcal{L}}\) in all cases. The outcomes for the models \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(1,\beta \right)\), \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(2,\beta \right)\), \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(3,\beta \right)\) and \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(4,\beta \right)\) for \(0.3\le \beta \le 5\) are reported in Figs. 1, 2, 3, 4, respectively.

Fig. 1
figure 1

BIC and AIC-statistics of the fitted \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(1,\beta \right)\) with different \(\beta\) for COVID-19 outbreak in WHO regions

Fig. 2
figure 2

BIC and AIC-statistics of the fitted \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(2,\beta \right)\) with different \(\beta\) for COVID-19 outbreak in WHO regions

Fig. 3
figure 3

BIC and AIC-statistics of the fitted \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(3,\beta \right)\) with different \(\beta\) for COVID-19 outbreak in WHO regions

Fig. 4
figure 4

BIC and AIC-statistics of fitted \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(4,\beta \right)\) with different \(\beta\) for COVID-19 outbreak in WHO regions

From Fig. 1, it is shown that the smallest AIC and BIC values for the first-order model are achieved when \(\beta =3.3\). Hence, the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{1,3.3}\right)\) is the best choice based on both AIC and BIC criteria.

It is illustrated from the first two sub-figures in Fig. 2 that the best \(\beta\)- value falls between 0.7 and 1.4. To determine the proper value of \(\beta\) more precisely, we replot BIC and AIC values against \(0.7\le \beta \le 1.4\) in the other two sub-figures. The \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) is found to be the favorable select second-order model.

It is shown from Fig. 3 that the third-order model with \(\beta =1.5\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{3,1.5}\right)\) is preferred because it has the lowest BIC and AIC values amongst the other third-order models.

From Fig. 4, it is clarified that \(\beta =1.1\) is the preferable value of \(\beta\) when \(r=4\). Therefore, the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{4,1.1}\right)\) is the chosen model in this case.

5.2.4 Comparison between \(\varvec{\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,{\beta}\right)}\) and \(\varvec{\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)}\)

To differentiate the power of the proposed \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) against the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\) (Ching et al. (2008)’s model), the prediction accuracies of the four selected models, \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{1,3.3}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{3,1.5}\right)\), and \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{4,1.1}\right)\) are assessed for each region in addition to their prediction errors. The outcomes are summarized in Fig. 5 together with those of the corresponding \(\mathcal{M}{\mathcal{C}}_{6}^{+}\left(r\right)\) for \(r=\mathrm{1,2},\mathrm{3,4}\). For the \(k\)-th region, the prediction accuracy is calculated by the formula (cf. Ching et al., 2008)

$$\text{P.A.}\left(k\right)=\frac{1}{N-r}\times \sum_{n=r}^{N-1}\mathrm{I}\left({\widehat{X}}_{n+1}^{\left(k\right)}={X}_{n+1}^{\left(k\right)}\right)\times 100\%, \, k=\mathrm{1,2},\dots ,6,$$
(25)

where \(N\) is the length of chain, \({X}_{n+1}^{\left(k\right)}\) represents the actual class at epoch \(n+1\) for the region (\(k\)), and \({\widehat{X}}_{n+1}^{\left(k\right)}\) is the predicted class at epoch \(n+1\) for the region (\(k\)), which is the class with the superior probability in the vector \({\widehat{\mathbf{X}}}_{n+1}^{\left(k\right)}\) predicted by the underlying HOM-MC model, i.e.,

Fig. 5
figure 5

Prediction performance of \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) against \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\) with different orders, \(r\), for COVID-19 outbreak in WHO regions

$${\widehat{X}}_{n+1}^{\left(k\right)}=l,\hspace{1em}{\text{if}}\hspace{0.33em}{\left[{\widehat{\mathbf{X}}}_{n+1}^{\left(k\right)}\right]}_{l}\ge {\left[{\widehat{\mathbf{X}}}_{n+1}^{\left(k\right)}\right]}_{h},\hspace{1em}\forall \hspace{0.33em}1\le l,h\le s.$$
(26)

While the prediction error is evaluated for each model of \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) using the formula

$$\text{P.E}=\sqrt{\frac{1}{N-r}\times \sum_{n=r}^{N-1}{\Vert {\mathbf{Y}}_{n+1}-\left[{\widehat{{\varvec{B}}}}^{+}{\mathbf{Y}}_{n}+\hspace{0.33em}\frac{1}{s\beta -1}{\widehat{{\varvec{B}}}}^{-}\left(\beta {\mathbf{e}}_{srm}-{\mathbf{Y}}_{n}\right)\right]\Vert }_{2}^{2}}.$$
(27)

and for each \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\) by the same equation but exclude the term \(\frac{1}{s\beta -1}{\widehat{{\varvec{B}}}}^{-}\left(\beta {\mathbf{e}}_{srm}-{\mathbf{Y}}_{n}\right)\).

As obvious from Fig. 5, the selected versions of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) are preferable to their corresponding of the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\) for \(r=\mathrm{1,2},\mathrm{3,4}\) in fitting the spatial–temporal dynamics of the COVID-19 outbreak data in accordance with their prediction accuracies for the six WHO regions. A comparison between the prediction errors of the models shows that the four selected models, \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{1,3.3}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{3,1.5}\right)\), and \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{4,1.1}\right)\) have smaller prediction errors than the \(\mathcal{M}{\mathcal{C}}_{6}^{+}\left(r\right)\) for \(r=\mathrm{1,2},\mathrm{3,4}\), respectively. Generally, the prediction performance of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) outperforms that of the \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\). Hence, we cannot ignore the negative correlations among the classes assessing the risk level of spreading COVID-19 infection. Note that we obtain the same conclusion if comparing the best \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) and the best \(\mathcal{M}{\mathcal{C}}_{m}^{+}\left(r\right)\), which are of order 2 and 3, respectively.

5.2.5 Choosing \(\varvec{\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,{\beta}\right)}\) with the adequate order

The current pivotal step is to compare the performances of the four selected models with the proper \(\beta\)-values, \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{1,3.3}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\), \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{3,1.5}\right)\), and \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{4,1.1}\right)\), to choose the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(r,\beta \right)\) with the best order. In light of this, the prediction accuracies for each WHO region based on these preferable four models, along with their prediction errors, are summarized in Table 2. Besides, the performances of the underlying four models are evaluated using the BIC and AIC criteria in (23), and the results are reported in Table 3.

Table 2 Prediction performances for the preferable \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) for COVID-19 outbreak in WHO regions
Table 3 AIC and BIC-statistics for the preferable \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) for COVID-19 outbreak in WHO regions

It is clear from Tables 2 and 3 that the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) is preferred to the other three versions of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(r,\beta \right)\) in fitting the spatial–temporal dynamics of the COVID-19 outbreak data by both the BIC and AIC criteria. It also has a much better prediction performance than the other three competitive models according to the calculated prediction accuracies for each region and the prediction error for all six regions.

5.2.6 The \(\varvec{\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(2,1\right)}\) with ordinary convergence conditions

The definitive task is to evaluate the behavior of the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) under the ordinary convergence conditions and compare it with that obtained for the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) under the simple convergence conditions. For this purpose, the prediction performance of the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(\mathrm{2,1}\right)\) with the ordinary convergence conditions is further assessed \(\alpha <1\), in addition to their BIC and AIC statistics, and the outcomes are depicted in Fig. 6. It is shown from Fig. 6 that both BIC and AIC statistics of the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) with ordinary conditions get their largest values when \(\alpha =0.1\) and then diminish gradually as the value of \(\alpha\) increases. Besides, its prediction error decreases markedly with the increase in \(\alpha\)-values. Meanwhile, the model achieves higher prediction accuracies for larger values of \(\alpha\). Therefore, the performance of the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) with ordinary conditions improves as the value of \(\alpha\) nears one. A comparison with the results obtained for the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) with simple conditions shows that the \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(\mathrm{2,1}\right)\) with simple conditions still outperforms the same model with ordinary conditions.

Fig. 6
figure 6

Performance of \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) with ordinary convergence conditions for COVID-19 outbreak in WHO regions

Furthermore, to support the evaluation of the simple convergence conditions against the ordinary convergence conditions, we plot the objective function values obtained from the LP solved for each WHO region to build the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) based on the simple conditions versus the ordinary ones (for \(\alpha <1\)) in Fig. 7. Examining Fig. 7, we observe that, for all WHO regions, the value of the objective function from the LP based on the ordinary convergence conditions declines with the growth in \(\alpha\)-value. The minimal objective function value is reached when the value of \(\alpha\) approaches one. Visibly, the objective function values from the LP using the simple convergence conditions, which are much closer to zero, are smaller than those result from using the ordinary conditions for all WHO regions.

Fig. 7
figure 7

Objective function value from LP for \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) based on simple and ordinary convergence conditions, by WHO region

To evaluate the computational efficiency of the LP based on simple and ordinary convergence conditions to construct the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) for the COVID-19 outbreak in the six WHO regions, we calculate the execution time of the LP under the simple conditions and the ordinary ones (when \(\alpha =0.9999\)). We use the (\(tic-toc\)) function (in MATLAB) to compute the elapsed time (in seconds) for implementing the LP problems for the six WHO regions under the considered cases of convergence conditions. For each case, we repeat the calculation 1000 times and take the average of the elapsed times. The results are offered in Table 4.

Table 4 Computational efficiency of LP based on simple and ordinary convergence conditions to construct \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) for COVID-19 outbreak in WHO regions

It is illustrated from Table 4 that the simple convergence conditions perform considerably better than the ordinary convergence conditions in saving the time required for solving the necessary LP problems to estimate the weights parameters for constructing the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\).

In general, the simple convergence conditions make a significant improvement in the \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\)’s performance for the daily dynamics of the COVID-19 outbreak across the WHO regions.

5.2.7 The form of \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(2,1\right)\) and its discussion

In the present \(\mathcal{M}{\mathcal{C}}_{6}^{+,-}\left(\mathrm{2,1}\right)\) for analyzing the spatial–temporal dynamics of COVID-19 outbreak data, \({\mathbf{X}}_{n+1}^{\left(k\right)}\) represents the probability distribution of the COVID-19 outbreak’s classes “the risk level of spreading” at period \(n+1\) for the region\(k\), and \({\widehat{{\varvec{P}}}}_{i}^{\left(k,j\right)}\) is the estimated \(i\)-step TPM from the COVID-19 outbreak’s classes in the region \(j\) at period \(n-i+1\) to the COVID-19 outbreak’s classes in the region \(k\) at period \(n+1\) for \(k,j=\mathrm{1,2},\dots ,6\) and \(i=\mathrm{1,2}\), which describe the first and second-order spatial–temporal dependencies among the risk level of spreading COVID-19 infection across the WHO areas, such that

$${\mathbf{X}}_{n+1}^{\left(k\right)}=\left\{\begin{array}{l}0.9378{\widehat{{\varvec{P}}}}_{2}^{\left(\mathrm{1,5}\right)}{\mathbf{X}}_{n-1}^{\left(5\right)}+{\mathbf{b}}_{1},\quad \quad \quad \quad \quad \quad \quad \quad \;\,\,\,\text{for }\; k=1:\left({\text{WPRO}}\right)\\ 0.9802{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{2,1}\right)}{\mathbf{X}}_{n}^{\left(1\right)}+{\mathbf{b}}_{2},\quad \quad \quad \quad \quad \quad \quad \quad \; \; \;\;\, \text{for}\; k=2:\left({\text{EURO}}\right)\\ 0.9163{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{3,2}\right)}{\mathbf{X}}_{n}^{\left(2\right)}+0.0712{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{3,4}\right)}{\mathbf{X}}_{n}^{\left(4\right)}+{\mathbf{b}}_{3}, \quad \text{for}\; k=3:\left({\text{SEARO}}\right)\\ 0.9800{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{4,1}\right)}{\mathbf{X}}_{n}^{\left(1\right)}+{\mathbf{b}}_{4}, \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\text{for}\; k=4:\left({\text{EMRO}}\right)\\ 0.3168{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{5,1}\right)}{\mathbf{X}}_{n}^{\left(1\right)}+0.6659{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{5,4}\right)}{\mathbf{X}}_{n}^{\left(4\right)}+{\mathbf{b}}_{5}, \quad \text{for}\; k=5:\left({\text{PAHO}}\right)\\ 0.1957{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{6,2}\right)}{\mathbf{X}}_{n}^{\left(2\right)}+0.7890{\widehat{{\varvec{P}}}}_{1}^{\left(\mathrm{6,4}\right)}{\mathbf{X}}_{n}^{\left(4\right)}+{\mathbf{b}}_{6}, \quad \text{for}\; k=6:\left({\text{AFRO}}\right)\end{array}\right.$$
(28)

where

figure a

with L, M, H, VH, and E indicate the classes: Low-Risk Spread, Moderate-Risk Spread, High-Risk Spread, Very High-Risk Spread, and Extreme-Risk Spread, respectively, as well as \({\mathbf{b}}_{u} (u=\mathrm{1,2},\mathrm{3,4},\mathrm{5,6})\) are vectors:

$${\mathbf{b}}_{1}={\left(\begin{array}{ccccc}0.0025& 0.0082& 0.0450& 0.0040& 0.0025\end{array}\right)}^{^{\prime}},$$
$${\mathbf{b}}_{2}={\left(\begin{array}{ccccc}0.0068& 0.0012& 0.0017& 0.0093& 0.0008\end{array}\right)}^{^{\prime}},$$
$${\mathbf{b}}_{3}={\left(\begin{array}{ccccc}0.0079& 0.0008& 0.0012& 0.0018& 0.0008\end{array}\right)}^{^{\prime}},$$
$${\mathbf{b}}_{4}={\left(\begin{array}{ccccc}0.0071& 0.0010& 0.0044& 0.0067& 0.0008\end{array}\right)}^{^{\prime}},$$
$${\mathbf{b}}_{5}={\left(\begin{array}{ccccc}0.0079& 0.0015& 0.0013& 0.0033& 0.0032\end{array}\right)}^{^{\prime}},$$
$${\mathbf{b}}_{6}={\left(\begin{array}{ccccc}0.0086& 0.0020& 0.0022& 0.0021& 0.0004\end{array}\right)}^{^{\prime}}.$$

Illation. From the above-constructed model in (28), we note that.

  1. (a)

    The risk levels of spreading COVID-19 infection in the two WHO regions, EURO and EMRO, expressed by \({\mathbf{X}}_{n+1}^{\left(2\right)}\) and \({\mathbf{X}}_{n+1}^{\left(4\right)}\), respectively, are highly related to the risk level in the WPRO in the previous day, \({\mathbf{X}}_{n}^{\left(1\right)}\). While the risk level in the WPRO at the (\(n+1\))-th day, \({\mathbf{X}}_{n+1}^{\left(1\right)}\), is strongly affected by its state in the PAHO at the (\(n-1\))-th day, \({\mathbf{X}}_{n-1}^{\left(5\right)}\).

  2. (b)

    For PAHO, the pervasion COVID-19 disease risk status at the (\(n+1\))-th day, exemplified by \({\mathbf{X}}_{n+1}^{\left(5\right)}\), is impacted by its state at the \(n\)-th day in WPRO, \({\mathbf{X}}_{n}^{\left(1\right)}\), but the risk status in EMRO in the same day, \({\mathbf{X}}_{n}^{\left(4\right)}\) influences it by nearly double.

  3. (c)

    The dynamics of COVID-19 outbreak risk level in the other two WHO regions SEARO and AFRO, represented by \({\mathbf{X}}_{n+1}^{\left(3\right)}\) and \({\mathbf{X}}_{n+1}^{\left(6\right)}\), respectively, are affected by the prevalence degree of the epidemic in the immediately preceding day in both EURO and EMRO. In contrast to SEARO, the phenomena’s class in AFRO, \({\mathbf{X}}_{n+1}^{\left(6\right)}\), associates more with its state at the \(n\)-th day in EMRO, \({\mathbf{X}}_{n}^{\left(4\right)}\), than in EURO, \({\mathbf{X}}_{n}^{\left(2\right)}\).

  4. (d)

    The \(\mathcal{M}{\mathcal{C}}_{m}^{+,-}\left(\mathrm{2,1}\right)\) in (28) can be used to approximate the transition probabilities \({p}_{{l}_{k}|{\boldsymbol{\mathcal{J}}}}^{\left(k\right)}\) that constitute the TPMs in (2) with \(r=2,s=5,\) and \(m=6\) for \(k=\mathrm{1,2},\dots ,6\), which describes the spatial–temporal dynamics in light of the full parameterized second-order multivariate MC model via the relationship (21). In addition, it can be applied as a forecasting technique for the risk level of spreading COVID-19 infection in the six WHO regions at the next period.

6 Conclusion

We have employed a modern strategy, based on the HOM-MC structure, to analyze the high-depth spatial–temporal dynamics for random phenomena concurrently. To combine parsimony with realism, we have introduced a novel reduced-parametric discrete-time HOM-MC model to cope with positive and negative associations among several chains. The parameter estimation procedures have been discussed, and the simple convergence conditions have been exploited to improve the linear programming’s formulation. The outcomes of the executed Monte Carlo simulation experiment illustrate the present model’s outstanding performances for approximating the actual transition probabilities of the HOM-MC describing the space–time dynamics. It always dominates the corresponding model for only positive correlations studied by Ching et al. (2008).

Besides, we have constructed the proposed model with different orders to analyze the daily prevalence of the COVID-19 pandemic risk states over the six WHO regions that constitute the world. The performance evaluation of the models and their comparisons have been carried out using the AIC and BIC criteria, as well as the prediction error and the prediction accuracy for each region. The findings reveal that the proposed model is immensely privileged to analyze the spatial–temporal dynamics of the COVID-19 outbreak with satisfied prediction precision, and the simple convergence conditions show a considerable improvement in its performance. Thus, it can work as a valid and objective tool for realistic monitoring of the pandemic spread, which could inspire the institutions responsible for public health and infection control to verify the effectiveness of their strategic plan for infection control and quarantine.

According to the promising results achieved by our parsimonious HOM-MC model, it can be applied in various application aspects. Avoiding overfitting and learning the HOM-MC models’ structure would be a subject that may be worth discussing in future work.