1 Introduction

To characterize the near-Earth space environment, the frequent observation of space debris is of primary importance. From the observations acquired by optical or radar sensors, orbits are computed and assigned to objects in a catalogue. Due to the reduced number of observations and their sparse distribution, it is required in most of the cases to associate short-arc sequences of observations of the same object, also called tracklets, to compute an orbit with enough accuracy. In Milani et al. (2012), Zittersteijn et al. (2016), and Reihs et al. (2020), examples of orbit determination methods are given based on the association, also called correlation or linking, of the observed tracklets. In some methods, a six-parameters orbit is calculated from one tracklet and propagated to the epoch of a second tracklet considered for the association. The propagated orbit is then compared with the orbit calculated at the second epoch to decide whether the two orbits originate from the same object. The comparison can be done in the “measurements” space, between pseudo-observations generated from the propagated orbit and real observations. In this case, the difference in the measurements is then evaluated w.r.t. to a predefined threshold (see e.g. Früh and Schildknecht 2012). Alternatively, the comparison can be performed in the state/orbital elements space, e.g. with a covariance-based threshold (Hill et al. 2012). The latter association approach allows for a more refined comparison considering a more complete orbit description and also its uncertainty. In this approach, it is possible to define a distance between orbits which, together with a given threshold, is applied to evaluate their similarity, i.e. how well the two orbits match. The association is accepted if the distance is smaller than the defined threshold. A straight way to introduce a distance between two orbits exploits the correspondence between orbital elements and state vector. If the two orbits are represented by their state vectors, the Euclidean distance in the ℝ6 ambient space is one possible metric. Since the computed orbits might differ in the accuracy, it is useful to introduce a normalized distance, which is independent from the accuracy of the observations and of the derived orbits. This is also necessary if the applied threshold should remain consistent despite orbits characterized by different uncertainty. Assuming that the orbit uncertainty is described by a multivariate “normal” distribution, then the normalization can be formulated in terms of Mahalanobis distance (De Maesschalck et al. 2000). In the latter, the Euclidean metric is scaled along the principal component axes of the distribution according to their variance. Thus, in case of unit variance along these axes, the Mahalanobis distance is reduced to the standard Euclidean distance.

Usually, the orbit uncertainty is described w.r.t. the given parameters of the reference frame. If the orbit is given by a state vector, the uncertainty distribution refers to the Cartesian components of position and velocity in the Euclidean ℝ6 space. Alternatively, the distribution can refer to orbital elements or to any other orbit parameterization. The choice of the latter often depends on details of the method considered for orbit determination. For example, in a least squares approach, the choice of the parameters is made within the algorithm implementation, in the design matrix. Obviously, the obtained uncertainty distribution can be transformed to another reference frame, but this is usually done through a linear transformation, which could cause an incorrect representation of the uncertainty in the new frame, if the size of the initial distribution exceeds the linear range of the transformation. As in the least squares procedure, typically a normal distribution for the uncertainty is assumed. However, this can turn out to be an inappropriate model depending on the size of the initial distribution and especially if the expected transformation is highly nonlinear. The problem of an appropriate representation of the uncertainty is treated, e.g. in Sabol et al. (2013).

The main question here is whether there is a parameterization and a related metric that best represent the similarity between orbits. The parameterization regards the description of the orbit, e.g. whether we want to consider a state vector, or orbital elements, or any other set of parameters that uniquely defines an orbit. Within the space spanned by these parameters, a metric can be defined, which provides a distance value for two elements of the space. There are different examples of improvements made in both aspects. In Vallado and Alfano (2014) and Hill et al. (2012), the authors rely on state vectors but define a distance in a curvilinear frame with a position and a velocity component along the curved trajectory of one of the orbits and other components perpendicular to it. The adopted formulation tries to compensate for the wrong representation of the uncertainty in along-track direction caused by the assumption of a normal distribution. In fact it is known that, due to the faster increase in the along-track uncertainty as a function of time, the error ellipsoid will tend to elongate into a “banana” shape along the orbit trajectory.

Different definitions of distance criteria based on a parameterization via orbital elements were proposed. Some of them are not strictly distance functions in a mathematical sense, because they do not fulfil the metric axioms. For example, refer to Kholshevnikov (2008) and Kholshevnikov et al. (2016) for an overview of these criteria. In the latter paper, also a category of metrics is treated that the authors called “natural”, since they are defined by quantities with a physical meaning: the orbital angular momentum and the Laplace–Runge–Lenz vector (see Sect. 2 for the explicit formulation of these two quantities). In particular in this category, one proposed metric is defined by the Euclidean distance in ℝ6, in which the space spanned by these natural parameters is embedded. The spanned space is a five-dimensional surface in ℝ6 and alternative metrics on this surface or on topologically equivalent spaces are possible. Moser (1970) first proved that the space of bounded Keplerian orbits with constant energy is topologically equivalent to the Cartesian product of two spheres. In Maruskin (2010) an equivalent topology with a similar parameterization is proposed and a Riemannian metric is introduced. A closed formula for the geodesic distance between two points in the Riemannian manifold is obtained.

In the metrics suggested by Kholshevnikov et al. (2016) and Maruskin (2010), the orbit is identified with five orbital elements. The sixth parameter, indicating the position of the object along the orbit, is usually not considered for applications in the Solar system astronomy, e.g. where the common origin of celestial bodies is investigated. However, for space debris and artificial satellites in the frame of initial orbit determination and tracklet correlation, this additional information can be relevant. For this reason, it can be interesting to evaluate the above metrics taking into account also, e.g. the orbital anomaly.

In this article, first the distance definitions treated in Maruskin (2010) and Kholshevnikov et al. (2016) are described. After considering the extension of both distances with the orbital anomaly, the two metrics are normalized according to the definition of Mahalanobis distance. Finally, the newly formulated metrics are assessed and compared each other and with distances between state vectors.

2 Geodesic distance in natural Riemannian metric

2.1 Metric definition

In Maruskin (2010), a geodesic distance in the space of bounded Keplerian orbits is proposed. The author examines a topology equivalent to the one found by Moser (1970). In the latter, it was shown that the space \({\mathbb{V}}(E)\) of closed orbits with fixed energy \(E\) can be parameterized through the Laplace–Runge–Lenz vector \(\mathbf{e}=\frac{\dot{\mathbf{r}}\boldsymbol{ }\times \boldsymbol{ }\mathbf{H}}{\mu }-\frac{\mathbf{r}}{\left|\mathbf{r}\right|}\) and a normalized angular momentum vector

$$\mathbf{h}=\frac{\mathbf{H}}{\sqrt{\mu a}} ,$$
(1)

given the position of the orbiting object \(\mathbf{r}(t)\), the angular momentum \(\mathbf{H}=\mathbf{r}\times \dot{\mathbf{r}}\), the semimajor axis \(a\), and the gravitational parameter \(\mu \). The norm of \(\mathbf{e}\) is the eccentricity \(e\) of the orbit. The two vectors are perpendicular to each other and it was proven that the spanned space has a topology equivalent to \({S}^{2}\times {S}^{2}\), the Cartesian product of two unit spheres. We indicate this topological equivalence with \({\mathbb{V}}(E)\cong {S}^{2}\times {S}^{2}\). A linear combination of these two vectors,

$${\varvec{\upeta}}=\mathbf{e}+\mathbf{h},$$
(2)
$${\varvec{\upxi}}=\mathbf{e}-\mathbf{h}$$
(3)

was further discussed in Maruskin and Scheeres (2009). The vectors \({\varvec{\eta}}\) and \({\varvec{\xi}}\) satisfy the conditions:

$${\varvec{\upeta}}\cdot {\varvec{\upeta}}=1,$$
(4)
$${\varvec{\upxi}}\cdot {\varvec{\upxi}}=1.$$
(5)

These properties further emphasize the above topological equivalence. The advantage of this parameterization is that both the vectors have unit length and equally represent the two \({S}^{2}\) subspaces. Maruskin (2010) then extends this topology to the cone \({\mathbb{K}}({S}^{2}\times {S}^{2})\), where \({\mathbb{K}}(X)\) is defined for a subset \(X\) of the vector space \(V\) as

$${\mathbb{K}}\left(X\right)=\left\{\mathbf{v}\in V :\mathbf{v}=a\mathbf{x}, a\ge 0, \mathbf{x}\in X\right\}$$

where \(a\) is the semimajor axis. The defined topology \({\mathbb{H}}\cong {\mathbb{K}}({S}^{2}\times {S}^{2})\) extends the space \({\mathbb{V}}(E)\) of fixed energy \(E\) to a space considering all bounded orbits through their semimajor axis. On the manifold \({\mathbb{H}}\subset {\mathbb{R}}^{6}\) the geodesic distance between two points identified with \({{\varvec{\upxi}}}_{1}\), \({{\varvec{\upeta}}}_{1}\), \({a}_{1}\) and \({{\varvec{\upxi}}}_{2}\), \({{\varvec{\upeta}}}_{2}\), \({a}_{2}\) is calculated as

$$d= \sqrt{2({a}_{1}^{2}+{a}_{2}^{2}-2{a}_{1}{a}_{2}\mathrm{cos}\psi )},$$
(6)

where

$$\psi =\sqrt{\frac{1}{2}\left({\mathrm{arccos}}^{2}\left({{\varvec{\upeta}}}_{1}\cdot {{\varvec{\upeta}}}_{2}\right)+{\mathrm{arccos}}^{2}\left({{\varvec{\upxi}}}_{1}\cdot {{\varvec{\upxi}}}_{2}\right)\right)}.$$
(7)

Note that the terms within \(\psi \) are angle differences between the vectors \({\varvec{\upxi}}\)’s, respectively \({\varvec{\upeta}}\)’s. In the cited article it is indicated that a geodesic distance on \({\mathbb{K}}({S}^{1}\times {S}^{1})\) can be generalized and applied to a manifold with \(n\) spheres \({\mathbb{K}}({S}^{1}\times {\cdots \times S}^{1})\subset {\mathbb{R}}^{2n}\). In this case, the general expression contains \(n\) angle differences \({\theta }_{i}\)’s (\(i=1,\dots , n\)) for any additional sphere:

$$\psi =\sqrt{\frac{1}{n}{\sum }_{i=1}^{n}{\theta }_{i}^{2}}.$$
(8)

2.2 Orbital anomaly

We would like to extend the distance in Eq. (6) to all six orbital parameters, i.e. including also one parameter that describes the position of the object along its orbit. There are different possible parameters for this purpose. Here, we examine the commonly used true, eccentric, and mean anomalies. While the mean anomaly has a linear increase over the time of an orbit revolution, the true and eccentric anomaly do not increase linearly in the case of eccentric orbits. As known for example, the angular velocity close to the periapsis increases, and so does the change rate in the true or eccentric anomaly. This means that two objects on different positions along the same orbit will always have the same difference in mean anomaly over time. However, the same does not apply for the difference in true or eccentric anomaly. For the definition of distance, it is then preferable to choose the mean anomaly in order to have a constant value over time.

To add this parameter in the above formulation, we exploit the property expressed in Eq. (8) which easily allows us to extend the distance to a topology with more spheres. Let us consider the space \({\mathbb{H}}\cong {\mathbb{K}}({S}^{2}\times {S}^{2}\times {S}^{1})\subset {\mathbb{R}}^{6}\). It is easy to show that the set of the anomaly values can be represented with an additional separate space and that there is a homeomorphism between this set and the unit sphere \({S}^{1}\) (see “Appendix”, Theorem 1). For consistency with the previous definition, the additional sphere is extended to the cone \({\mathbb{K}}({S}^{1})\) giving a new topology \({\mathbb{K}}({S}^{2}\times {S}^{2})\times {\mathbb{K}}({S}^{1})={\mathbb{K}}({S}^{2}\times {S}^{2}\times {S}^{1})\). Eq. (7) is extended with the mean anomaly \(M\) to

$$\psi =\sqrt{\frac{1}{3}\left({\mathrm{arccos}}^{2}\left({{\varvec{\upeta}}}_{1}\cdot {{\varvec{\upeta}}}_{2}\right)+{\mathrm{arccos}}^{2}\left({{\varvec{\upxi}}}_{1}\cdot {{\varvec{\upxi}}}_{2}\right)+{\left({M}_{2}-{M}_{1}\right)}^{2}\right)}.$$
(9)

3 Euclidean distance with natural parameterization

3.1 Metric definition

In the work of Kholshevnikov et al. (2016), one of the analysed natural metrics considers the Euclidean distance in the space ℝ6. The angular momentum vector \(\mathbf{H}\) and the Laplace–Runge–Lenz vector \(\mathbf{e}\) are scaled to the vectors \(\mathbf{u}\) and \(\mathbf{v}\) such that

$$ \left| {\mathbf{u}} \right| = \sqrt p \quad \left| {\mathbf{v}} \right| = e\sqrt p $$
(10)

where \(p\) is the semilatus rectum. The distance between two points identified with \({\mathbf{u}}_{1}\), \({\mathbf{v}}_{1}\) and \({\mathbf{u}}_{2}\), \({\mathbf{v}}_{2}\) in the ambient space ℝ6 is then expressed as

$$d=\sqrt{\frac{1}{L}\left({\left({\mathbf{u}}_{1}-{\mathbf{u}}_{2}\right)}^{2}+{\left({\mathbf{v}}_{1}-{\mathbf{v}}_{2}\right)}^{2}\right)},$$
(11)

where \(L\) is an arbitrary factor which can be set to \(L=1\).

As stated in Kholshevnikov et al. (2016) and explicitly shown in the “Appendix” in Theorem 2, this metric is topologically equivalent to the one in Eq. (6) and applicable to the same parameter space. However, from the proof in Theorem 2, we note that the two metrics are not Lipschitz equivalent and they behave differently. While in the metric (11) the natural parameters \(\mathbf{u}\) and \(\mathbf{v}\) are first evaluated in distinct ℝ3 subspaces, in the metric (6) the angular information in the subspaces \({S}^{2}\subset {\mathbb{R}}^{3}\) is combined together in \(\psi \) before introducing a “scaling” with \({a}_{1}\) and \({a}_{2}\).

3.2 Orbital anomaly

As for the metric defined by Maruskin also in this case, we want to extend the distance definition to a space taking into account the mean anomaly M. The unit sphere \({S}^{1}\), homeomorphic to the mean anomaly set (“Appendix”, Theorem 1), can be embedded in \({\mathbb{R}}^{2}\). The original space \({\mathbb{R}}^{6}\) can be naturally extended with a separate space \({\mathbb{R}}^{2}\) to \( {\mathbb{R}}^{8} = {\mathbb{R}}^{6} \times {\mathbb{R}}^{2} . \)

Similarly as for \(\mathbf{u}\) and \(\mathbf{v},\) we define a new vector \(\mathbf{w}\) so that:

$$\mathrm{\sphericalangle }\left({\mathbf{w}}_{1},{\mathbf{w}}_{2}\right)=\left|{M}_{1}-{M}_{2}\right|,\quad \left|{\mathbf{w}}_{1}\right|=\sqrt{{p}_{1}}, \quad\left|{\mathbf{w}}_{2}\right|=\sqrt{{p}_{2}}.$$

This vector can be expressed as

$$\mathbf{w}=\sqrt{p}\left(\begin{array}{c}\mathrm{cos}M\\ \mathrm{sin}M\end{array}\right).$$
(12)

In the choice of \(\mathbf{w}\), relevant is the angle between two such vectors, which indicates the difference in mean anomaly. The length \(\sqrt{p}\) of the vector is chosen to be consistent with the construction of \(\mathbf{u}\) and \(\mathbf{v}\), where \(\sqrt{p}\) is a common scaling factor similar to the scaling semimajor axis \(a\) in (6). The distance extended with the mean anomaly is

$$d=\sqrt{{\left({\mathbf{u}}_{1}-{\mathbf{u}}_{2}\right)}^{2}+{\left({\mathbf{v}}_{1}-{\mathbf{v}}_{2}\right)}^{2}+{\left({\mathbf{w}}_{1}-{\mathbf{w}}_{2}\right)}^{2}}.$$
(13)

This metric and the one extended in Eq. (9) are still topologically equivalent, as shown in the “Appendix” in Corollary 1.

4 Mahalanobis distance

4.1 Definition

The Mahalanobis distance in ℝN describes a distance between a point \(\mathbf{x}\) and a normal distribution with mean \({\varvec{\upmu}}\) and covariance matrix \(\mathbf{S}\) (De Maesschalck et al. 2000). This definition can be easily extended to the distance between two points \(\mathbf{x}\) and \(\mathbf{y}\) with respect to the same distribution:

$${d}_{\mathrm{M}}\left(\mathbf{x},\mathbf{y}\right)=\sqrt{{\left(\mathbf{x}-\mathbf{y}\right)}^{\mathrm{T}}{\mathbf{S}}^{-1}\left(\mathbf{x}-\mathbf{y}\right)}.$$
(14)

For our purpose we want to use the Mahalanobis distance as a measure of the similarity of two orbits. If we consider two points \(\mathbf{x}\) and \(\mathbf{y}\) in the orbit space and their uncertainty given by their covariances, the Mahalanobis distance is a way to scale the actual distance between \(\mathbf{x}\) and \(\mathbf{y}\) according to the covariances \({\mathbf{S}}_{\mathbf{x}}\) and \({\mathbf{S}}_{\mathbf{y}}\) and can be written as (Hill et al. 2012):

$$ d_{{\text{M}}} \left( {{\mathbf{x}},{\mathbf{y}}} \right) = \sqrt {\left( {{\mathbf{x}} - {\mathbf{y}}} \right)^{{\text{T}}} {\mathbf{S}}^{ - 1} \left( {{\mathbf{x}} - {\mathbf{y}}} \right)} ,\;\;{\text{where}}\;\;{\mathbf{S}} = {\mathbf{S}}_{{\mathbf{x}}} + {\mathbf{S}}_{{\mathbf{y}}} . $$
(15)

4.2 Metric axioms

According to the mathematical definition of metric, a distance function must satisfy the following axioms:

  1. 1.

    \(d\left({\mathbf{x}}_{1},{\mathbf{x}}_{2}\right)\ge 0\)1. while \(d\left({\mathbf{x}}_{1},{\mathbf{x}}_{2}\right)=0\) if and only if \({\mathbf{x}}_{1}={\mathbf{x}}_{2}\)

  2. 2.
    $$d\left({\mathbf{x}}_{1},{\mathbf{x}}_{2}\right)=d\left({\mathbf{x}}_{2},{\mathbf{x}}_{1}\right)$$
    (16)
  3. 3.

    \(d\left({\mathbf{x}}_{1},{\mathbf{x}}_{3}\right)\le d\left({\mathbf{x}}_{1},{\mathbf{x}}_{2}\right)+d\left({\mathbf{x}}_{2},{\mathbf{x}}_{3}\right)\) (triangle axiom)

It can be shown that all three metric axioms are satisfied in the original definition of Mahalanobis distance in Eq. (14). While the first and second axioms can be easily verified, for the triangle axiom we see that the covariance matrix \(\mathbf{S}\) is positive-semidefinite and symmetric and it can be expressed as \(\mathbf{S}={\mathbf{T}}^{-1}\mathbf{D}\mathbf{T}\), where \({\mathbf{T}}^{\mathrm{T}}={\mathbf{T}}^{-1}\) is an orthogonal and \(\mathbf{D}\) a diagonal matrix. While the orthogonal matrix does not change the resulting distance, the diagonal elements in \(\mathbf{D}\) are responsible for a scaling of the Euclidean distance, the latter being the case with a unit matrix \(\mathbf{D}\), and as such do not affect the triangle inequality.

However, in our application of the Mahalanobis distance in Eq. (15) every \(\mathbf{x}\) and \(\mathbf{y}\) is related to a different distribution. Thus, the covariance matrices are actually variables of a distance function \({d}_{\mathrm{M}}(\mathbf{x},\mathbf{y},{\mathbf{S}}_{\mathbf{x}},{\mathbf{S}}_{\mathbf{y}})\) between two points \((\mathbf{x},{\mathbf{S}}_{\mathbf{x}})\) and \((\mathbf{y},{\mathbf{S}}_{\mathbf{y}})\) in the geometric space extended with the covariances. If we consider this extension, the first axiom is clearly not fulfilled. In fact, we can have different combinations \((\mathbf{x},{\mathbf{S}}_{\mathbf{x}})\) and \((\mathbf{y},{\mathbf{S}}_{\mathbf{y}})\) with \(\mathbf{x}=\mathbf{y}\) and \({\mathbf{S}}_{\mathbf{x}}\ne {\mathbf{S}}_{\mathbf{y}}\) so that \({d}_{\mathrm{M}}\left(\mathbf{x},\mathbf{y},{\mathbf{S}}_{\mathbf{x}},{\mathbf{S}}_{\mathbf{y}}\right)=0\). It is easy to show that even the triangle axiom is not satisfied. The reason for not being compliant with two of the axioms is that in this case we try to define a distance between distributions and not a conventional metric. There are known statistical distances which give a measure of how probability distributions differ from each other, e.g. like the Kullback–Leibler divergence (Kullback and Leibler 1951), or the Bhattacharyya distance (Choi and Lee 2003). Despite the fact that sometimes the name “distance” is used, they are not real metrics and do not respect all axioms. However, unlike in statistical distances we are not interested in the shape difference of the distributions but only in the distance between their means, scaled according to their covariances. Thus, the statistical aspect introduced with the matrix \(\mathbf{S}\) in Eq. (15) can be formally separated by the actual distance between the distribution means in the underlying space. Note that if \(\mathbf{S}\) is not considered Eq. (15) reduces to the standard Euclidean distance. Since \(\mathbf{S}\) is symmetric and positive-semidefinite, it can be expressed as product of two matrices as \(\mathbf{S}={\mathbf{B}}^{\mathrm{T}}\mathbf{B}\). Then, \({d}_{\mathrm{M}}\left(\mathbf{x},\mathbf{y}\right)={d}_{\mathrm{E}}\left({\mathbf{x}}^{\mathbf{^{\prime}}},{\mathbf{y}}^{\mathbf{^{\prime}}}\right)\), where \({d}_{\mathrm{E}}\) is the Euclidean distance and \({\mathbf{x}}^{\mathbf{^{\prime}}}=\mathbf{B}\mathbf{x}\), \({\mathbf{y}}^{\mathbf{^{\prime}}}=\mathbf{B}\mathbf{y}\). This indicates that the spatial distance between the statistical characterized points \({\mathbf{x}}^{\mathbf{^{\prime}}}\) and \({\mathbf{y}}^{\mathbf{^{\prime}}}\) is correctly evaluated in a compliant metric space.

5 Mahalanobis distance of state vectors

5.1 Euclidean space

In the Euclidean space, the state vectors are given in Cartesian coordinates. For the state vectors \(\mathbf{x}=\left(\begin{array}{c}{\mathbf{r}}_{{\varvec{x}}}\\ {\mathbf{v}}_{{\varvec{x}}}\end{array}\right)\) and \(\mathbf{y}=\left(\begin{array}{c}{\mathbf{r}}_{{\varvec{y}}}\\ {\mathbf{v}}_{{\varvec{y}}}\end{array}\right)\) with Cartesian covariances \({\mathbf{C}}_{\mathbf{x}}\) and \({\mathbf{C}}_{\mathbf{y}}\) the Mahalanobis distance is according to (15):

$$ d_{{\text{M}}} \left( {{\mathbf{x}},{\mathbf{y}},{\mathbf{C}}_{{\mathbf{x}}} ,{\mathbf{C}}_{{\mathbf{y}}} } \right) = \sqrt {\left( {{\mathbf{x}} - {\mathbf{y}}} \right)^{{\text{T}}} {\mathbf{C}}^{ - 1} \left( {{\mathbf{x}} - {\mathbf{y}}} \right)} ,\;\;{\text{where}}\;\;{\mathbf{C}} = {\mathbf{C}}_{{\mathbf{x}}} + {\mathbf{C}}_{{\mathbf{y}}} . $$

The choice of Cartesian coordinates makes it difficult to well describe the uncertainty related to the state vectors. In fact, the latter is expected to be somehow curved along the existing orbit, which cannot be expressed using the linear covariance formalism in Cartesian coordinates.

5.2 Curvilinear coordinates

An improvement is brought with the idea of curvilinear coordinates, which in this context refer to a method published in Hill et al. (2012) and Vallado and Alfano (2014). In the following, we explain the underlying principle of the method (for the actual formulation refer to the latter articles). A scenario with two flying objects is assumed and the position of the second object w.r.t. to the first one (or in its coordinate system) is sought. If we choose a reference system at the first object based on its curved orbit trajectory, we define new curved (or curvilinear) coordinates. The idea is illustrated in Fig. 1 and is simplified considering coplanar orbits. Object 1 has position P1 and orbit C1, while object 2 is in P2 with orbit C2. At point P1 a reference system based on the normal, tangential, and cross-track orbital component (denoted as NTW1 system) is defined (see e.g. n1 normal component). Position P2 has a difference α in the true anomaly w.r.t. the reference orbit. At point S2, a second NTW2 system is defined and the position P2 is expressed in this new reference system. The position and velocity at P2 in the curvilinear coordinates system with origin in P1 is given by the difference of normal and cross-track components in NTW1 and NTW2 and by the position difference S along the tangential component. For the latter, the length of the arc between P1 and S2 has to be calculated numerically. We define the two state vectors \({\mathbf{x}}_{i}\) as

$$ {\mathbf{x}}_{i} = \left( {\begin{array}{*{20}c} {{\mathbf{r}}_{i} } \\ {{\mathbf{v}}_{i} } \\ \end{array} } \right)_{{{\text{NTW}}_{i} }} ,\;\;{\text{where}}\;\;{\mathbf{r}}_{i} = \left( {\begin{array}{*{20}c} {{\mathbf{r}}_{{i,{\text{N}}}} } \\ {{\mathbf{r}}_{{i,{\text{T}}}} } \\ {{\mathbf{r}}_{{i,{\text{W}}}} } \\ \end{array} } \right)_{{{\text{NTW}}_{i} }} \;\;{\text{and}}\;\;{\mathbf{v}}_{i} = \left( {\begin{array}{*{20}c} {{\mathbf{v}}_{{i,{\text{N}}}} } \\ {{\mathbf{v}}_{{i,{\text{T}}}} } \\ {{\mathbf{v}}_{{i,{\text{W}}}} } \\ \end{array} } \right)_{{{\text{NTW}}_{i} }} \;\;{\text{for}}\;\;i = 1,2. $$
Fig. 1
figure 1

Scheme describing curvilinear coordinates

The notation NTWi indicates that the vector is expressed in the NTWi system. Since NTW1 has the origin in P1, we have \({\mathbf{r}}_{1}=0\) and according to the idea described above we replace \({\mathbf{r}}_{2,\mathrm{T}}\) with S. Then, the Mahalanobis distance is as follows:

$$\begin{aligned} d_{{\text{M}}} \left({{\mathbf{x}}_{1} ,{\mathbf{x}}_{2},{\mathbf{C}}_{{1,{\text{NTW}}_{1} }},{\mathbf{C}}_{{2,{\text{NTW}}_{2} }} } \right) = &\,\sqrt {\left({{\mathbf{x}}_{1} - {\mathbf{x}}_{2} } \right)^{{\text{T}}}{\mathbf{C}}^{ - 1} \left( {{\mathbf{x}}_{1} - {\mathbf{x}}_{2} }\right)} ,\\{\text{ where}}\;\;\;{\mathbf{C}} =&\,{\mathbf{C}}_{{1,{\text{NTW}}_{1} }} +{\mathbf{C}}_{{2,{\text{NTW}}_{2} }} .\end{aligned}$$

6 Mahalanobis distance following Maruskin’s metric

If we consider in the two-dimensional case ℝ2 the Maruskin’s metric described in Eq. (6) we see that the defined distance is simply, up to a proportional factor, the length \(d\) of one edge of a triangle with two other edges of length \({a}_{1}\) and \({a}_{2}\) and an angle \(\psi \) between them (see Fig. 2). The length \(d\) can be calculated with the law of cosines.

Fig. 2
figure 2

Triangle showing the relation between \(a_{1}\), \(a_{2}\),\({\uppsi }\), and \(d\) in ℝ2

If the angle \(\psi \) is small, \(d\) can be approximated, e.g. by

$$d\approx \sqrt{{{(a}_{2}-{a}_{1})}^{2}+{\left({a}_{1}\psi \right)}^{2}}.$$
(17)

To conserve the symmetry (2. metric axiom) in \(a,\) we may use \(\mathrm{cos}\psi \approx 1-\frac{{\psi }^{2}}{2}\) and we obtain

$${d}^{2}={a}_{1}^{2}+{a}_{2}^{2}-2{a}_{1}{a}_{2}\mathrm{cos}\psi \approx {{(a}_{2}-{a}_{1})}^{2}+{a}_{1}{a}_{2}{\psi }^{2}$$

For the generalization in ℝ6, Eq. (9) for the angle \(\psi \) has to be used instead. The sum of squares appearing in Eqs. (8) and (17) suggests the possibility to scale the summands according to their uncertainty as in the Mahalanobis distance. We can define in ℝ4 the vector \({\mathbf{z}}^{\mathrm{T}}=\left(\begin{array}{cccc}{a}_{2}-{a}_{1},& \frac{\sqrt{{a}_{1}{a}_{2}}{\theta }_{1}}{\sqrt{3}},& \frac{\sqrt{{a}_{1}{a}_{2}}{\theta }_{2}}{\sqrt{3}},& \frac{\sqrt{{a}_{1}{a}_{2}}}{\sqrt{3}}({M}_{2}-{M}_{1})\end{array}\right)\) where \({\theta }_{1}=\mathrm{arccos}({{\varvec{\upeta}}}_{1}\cdot {{\varvec{\upeta}}}_{2})\) and \({\theta }_{2}=\mathrm{arccos}({{\varvec{\upxi}}}_{1}\cdot {{\varvec{\upxi}}}_{2})\), and we can write the distance as \(d\approx \sqrt{{\mathbf{z}}^{\mathrm{T}}\mathbf{z}}\). Since the Mahalanobis distance is normalized, we can scale \(\mathbf{z}\) with a proportional factor and rewrite \(\mathbf{z}\)as

$$\mathbf{z}=\left(\begin{array}{c}{\frac{\sqrt{3}}{\sqrt{{a}_{1}{a}_{2}}}}(a_{2}-{a}_{1})\\{\theta }_{1}\\ {\theta }_{2}\\{M}_{2}-{M}_{1}\end{array}\right).$$
(18)

Now we need to transform the covariance matrix according to the new coordinates and calculate the partial derivatives w.r.t. to the standard orbital elements semimajor axis \(a\), eccentricity \(e\), inclination \(i\), right ascension of ascending node \(\Omega \), argument of perigee \(\upomega \), and mean anomaly \(M\):

$${T}_{ij}= \frac{\partial {z}_{i}}{\partial {\alpha }_{j}}$$

where \({\alpha }_{j}\in \left\{{a}_{1},\dots ,{M}_{1},{a}_{2},\dots ,{M}_{2}\right\}\). The first row of \(\mathbf{T}\) can be calculated as

$$\left({T}_{1j}\right)=\left(\begin{array}{cccccc}-\frac{\sqrt{3}{a}_{2}\left({a}_{1}+{a}_{2}\right)}{2\sqrt{{\left({a}_{1}{a}_{2}\right)}^{3}}}& 0& \dots & \frac{\sqrt{3}{a}_{1}\left({a}_{1}+{a}_{2}\right)}{2\sqrt{{\left({a}_{1}{a}_{2}\right)}^{3}}}& 0& \dots \end{array}\right).$$

For the second row, we have

$$\frac{\partial {z}_{2}}{\partial {\alpha }_{j}}=\frac{\partial {z}_{2}}{\partial {\theta }_{1}}\frac{\partial {\theta }_{1}}{\partial {\alpha }_{j}}=\frac{\partial {\theta }_{1}}{\partial {{\varvec{\upeta}}}_{1}}\frac{\partial {{\varvec{\upeta}}}_{1}}{\partial {\alpha }_{j}}+\frac{\partial {\theta }_{1}}{\partial {{\varvec{\upeta}}}_{2}}\frac{\partial {{\varvec{\upeta}}}_{2}}{\partial {\alpha }_{j}}.$$
(19)

We denote with \({\mathbb{M}}_{n\times m}({\mathbb{R}})\) the space of \(n\times m\) matrices over \({\mathbb{R}}\). The Jacobian \(\frac{\partial {\theta }_{1}}{\partial {{\varvec{\upeta}}}_{1}}\in {\mathbb{M}}_{1\times 3}({\mathbb{R}})\) is calculated as

$$\frac{\partial {\theta }_{1}}{\partial {{\varvec{\upeta}}}_{1}}=-\frac{1}{\sqrt{1-{\left({{\varvec{\upeta}}}_{1}\cdot {{\varvec{\upeta}}}_{2}\right)}^{2}}}{{\varvec{\upeta}}}_{2}.$$
(20)

The calculation for \(\frac{\partial {\theta }_{1}}{\partial {{\varvec{\upeta}}}_{2}}\) is analogous. The partial derivatives \(\frac{\partial {{\varvec{\upeta}}}_{i}}{\partial {\alpha }_{j}}\) can be found using the explicit representations of \(\mathbf{e}\) and \(\mathbf{h}\). We can express the unit vectors \(\mathbf{P}\) and \(\mathbf{W}\) parallel to \(\mathbf{e}\) and \(\mathbf{h}\), respectively,as:

$$\begin{aligned} {\mathbf{P}} &= \left( {\begin{array}{*{20}c} {\cos {\upomega }\cos {\Omega } - \sin {{\omega \cos }} \,i\,{{ \text{sin}\Omega }}} \\ {\cos {\upomega }\sin {\Omega } - \sin {{\omega \cos }}\,i\,{{ \cos \Omega }}} \\ {\sin \,i\,\sin {\upomega }} \\ \end{array} }\right), \\ {\mathbf{W}} & = \left( {\begin{array}{*{20}c} {\sin \,i\,\sin {\Omega }} \\ { - \sin\, i\,\cos {\Omega }} \\ {{\text{cos }} i\,}\\ \end{array} } \right). \\ \end{aligned}$$

Then, the partial derivatives are as follows:

$$ \begin{aligned} & \frac{{\partial {{\varvec{\upeta}}}}}{\partial a} = 0\quad \quad \quad \quad \quad \quad \;\quad \quad \frac{{\partial {{\varvec{\upeta}}}}}{{\partial {\Omega }}} = e\frac{{\partial {\mathbf{P}}}}{{\partial {\Omega }}} + \sqrt {1 - e^{2} } \frac{{\partial {\mathbf{W}}}}{{\partial {\Omega }}} \\ & \frac{{\partial {{\varvec{\upeta}}}}}{\partial e} = {\mathbf{P}} - \frac{e}{{\sqrt {1 - e^{2} } }}{\mathbf{W}}\quad \quad \quad \frac{{\partial {{\varvec{\upeta}}}}}{{\partial {\upomega }}} = e\frac{{\partial {\mathbf{P}}}}{{\partial {\upomega }}} \\ & \frac{{\partial {{\varvec{\upeta}}}}}{\partial i} = e\frac{{\partial {\mathbf{P}}}}{\partial i} + \sqrt {1 - e^{2} } \frac{{\partial {\mathbf{W}}}}{\partial i}\quad \;\frac{{\partial {{\varvec{\upeta}}}}}{\partial M} = 0 \\ \end{aligned} $$
(21)

Note that in the above calculation \({\alpha }_{j}\) contains two set of orbital parameters. Hence, \(\frac{\partial {{\varvec{\upeta}}}_{i}}{\partial {\alpha }_{j}}\in {\mathbb{M}}_{3\times 12}({\mathbb{R}})\) is only nonzero in half of its entries. The third row of \(\mathbf{T}\) is very similar, only a change of sign is needed at terms with two addends as well as a replacement of \({\varvec{\upeta}}\) by \({\varvec{\upxi}}\). The last row of \(\mathbf{T}\) is quite simple and yields

$$\left({T}_{4j}\right)=\left(\begin{array}{cccccc}0& \dots & -1& 0& \dots & 1\end{array}\right).$$

Combining the results we find

$$\mathbf{T}=\left(\begin{array}{cccccc}-\frac{\sqrt{3}{a}_{2}}{{a}_{1}^{2}}& 0\dots 0& 0& \frac{\sqrt{3}}{{a}_{2}}& 0\dots 0& 0\\ 0& {\mathbf{T}}_{{{\varvec{\upeta}}}_{1}}& 0& 0& {\mathbf{T}}_{{{\varvec{\upeta}}}_{2}}& 0\\ 0& {\mathbf{T}}_{{{\varvec{\upxi}}}_{1}}& 0& 0& {\mathbf{T}}_{{{\varvec{\upxi}}}_{2}}& 0\\ 0& 0\dots 0& -1& 0& 0\dots 0& 1\end{array}\right).$$

where \({\mathbf{T}}_{{\upeta }_{1}}\) and \({\mathbf{T}}_{{\upxi }_{1}}\) are abbreviations for the \(1\times 4\) matrices given by the considerations (21):

$$ \begin{aligned} {\mathbf{T}}_{{{{\varvec{\upeta}}}_{1} }} & = \left( {\begin{array}{*{20}c} {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upeta}}}_{1} }}\frac{{\partial {{\varvec{\upeta}}}_{1} }}{{\partial e_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upeta}}}_{1} }}\frac{{\partial {{\varvec{\upeta}}}_{1} }}{{\partial i_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upeta}}}_{1} }}\frac{{\partial {{\varvec{\upeta}}}_{1} }}{{\partial {\Omega }_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upeta}}}_{1} }}\frac{{\partial {{\varvec{\upeta}}}_{1} }}{{\partial {\upomega }_{1} }}} \\ \end{array} } \right), \\ {\mathbf{T}}_{{{{\varvec{\upxi}}}_{1} }} & = \left( {\begin{array}{*{20}c} {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upxi}}}_{1} }}\frac{{\partial {{\varvec{\upxi}}}_{1} }}{{\partial e_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upxi}}}_{1} }}\frac{{\partial {{\varvec{\upxi}}}_{1} }}{{\partial i_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upxi}}}_{1} }}\frac{{\partial {{\varvec{\upxi}}}_{1} }}{{\partial {\Omega }_{1} }}} & {\frac{{\partial \theta_{1} }}{{\partial {{\varvec{\upxi}}}_{1} }}\frac{{\partial {{\varvec{\upxi}}}_{1} }}{{\partial {\upomega }_{1} }}} \\ \end{array} } \right). \\ \end{aligned} $$

In order to calculate the components of these matrices, one makes use of (20) and (21). \({\mathbf{T}}_{{{\varvec{\upeta}}}_{2}}\) and \({\mathbf{T}}_{{{\varvec{\upxi}}}_{2}}\) are analogous and one replaces \({{\varvec{\upeta}}}_{1}\) by \({{\varvec{\upeta}}}_{2}\) and \({{\varvec{\upxi}}}_{1}\) by \({{\varvec{\upxi}}}_{2}\). Since \(\mathbf{T}\) contains two sets of orbital parameters, the covariance matrix needs to incorporate information about two orbits:

$${\mathbf{C}}_{\mathrm{1,2}}=\left(\begin{array}{cc}{\mathbf{C}}_{1}& 0\\ 0& {\mathbf{C}}_{2}\end{array}\right)\in {\mathbb{M}}_{12\times 12}\left({\mathbb{R}}\right).$$

\({\mathbf{C}}_{1}\) and \({\mathbf{C}}_{2}\) are the covariance matrices of the two orbits expressed in orbital elements. Then, we can transform the covariance matrix as

$${\mathbf{C}}_{\mathbf{z}}=\mathbf{T}{\mathbf{C}}_{\mathrm{1,2}}{\mathbf{T}}^{\mathrm{T}}\in {\mathbb{M}}_{4\times 4}({\mathbb{R}})$$

and the Mahalanobis distance \({d}_{\mathrm{M}}\) becomes

$${d}_{\mathrm{M}}=\sqrt{{\mathbf{z}}^{\mathrm{T}}{\mathbf{C}}_{\mathbf{z}}^{-1}\mathbf{z}}.$$

7 Mahalanobis distance following Kholshevnikov’s metric

For the definition of Mahalanobis distance, we combine the vectors in Eqs. (10) and (12) into a single vector

$$\mathbf{z}=\left(\begin{array}{c}\mathbf{u}\\ \mathbf{v}\\ \mathbf{w}\end{array}\right)$$
(22)

and write Eq. (13) as \(d=\sqrt{{\mathbf{z}}^{\mathrm{T}}\mathbf{z}}\). To transform the covariance matrix according to these coordinates, we need to determine the partial derivatives of \(\mathbf{u}\), \(\mathbf{v}\), and \(\mathbf{w}\) w.r.t. the orbital parameters \({\alpha }_{i}\). We have:

$$ \begin{aligned} & \frac{{\partial {\mathbf{u}}}}{{\partial \alpha_{i} }} = \frac{1}{2\sqrt p }\frac{\partial p}{{\partial \alpha_{i} }}{\mathbf{W}} + \sqrt p \frac{{\partial {\mathbf{W}}}}{{\partial \alpha_{i} }} \\ & \frac{{\partial {\mathbf{v}}}}{{\partial \alpha_{i} }} = \frac{\partial }{{\partial \alpha_{i} }}(e\sqrt {p)} {\mathbf{P}} + \sqrt p \frac{{\partial {\mathbf{P}}}}{{\partial \alpha_{i} }} \\ & \frac{{\partial {\mathbf{w}}}}{{\partial \alpha_{i} }} = \frac{1}{2\sqrt p }\frac{\partial p}{{\partial \alpha_{i} }}\left( {\begin{array}{*{20}c} {{\text{cos}} M} \\ {{\text{sin}} M} \\ \end{array} } \right) + \sqrt p \frac{\partial }{{\partial \alpha_{i} }}\left( {\begin{array}{*{20}c} {{\text{cos}} M} \\ {{\text{sin}} M} \\ \end{array} } \right) \\ \end{aligned} $$
(23)

Recall that p is only a function of \(e\) and \(a\), the term \(\frac{\partial }{\partial {\alpha }_{i}}\left(\begin{array}{c}\mathrm{cos} M\\ \mathrm{sin} M\end{array}\right)\) is only nonzero when \({\alpha }_{i}=M\), and \(\mathbf{P}\) and \(\mathbf{W}\) only depend on \(i,\Omega ,\upomega \). Then, we find \(\mathbf{T}\in {\mathbb{M}}_{8\times 6}({\mathbb{R}})\) as

$$\mathbf{T}=\left(\begin{array}{cccccc}\frac{1-{e}^{2}}{2\sqrt{p}}\mathbf{W}& \frac{-ae}{\sqrt{p}}\mathbf{W}& \sqrt{p}\frac{\partial \mathbf{W}}{\partial i}& \sqrt{p}\frac{\partial \mathbf{W}}{\partial\Omega }& 0& 0\\ \frac{e(1-{e}^{2})}{2\sqrt{p}}\mathbf{P}& \left(\sqrt{p}-\frac{-a{e}^{2}}{\sqrt{p}}\right)\mathbf{P}& e\sqrt{p}\frac{\partial \mathbf{P}}{\partial i}& e\sqrt{p}\frac{\partial \mathbf{P}}{\partial\Omega }& e\sqrt{p}\frac{\partial \mathbf{P}}{\partial\upomega }& 0\\ \frac{1-{e}^{2}}{2\sqrt{p}}\mathrm{cos}M& \frac{-ae}{\sqrt{p}}\mathrm{cos }M& 0& 0& 0& -\sqrt{p}\mathrm{sin}M\\ \frac{1-{e}^{2}}{2\sqrt{p}}\mathrm{sin}M& \frac{-ae}{\sqrt{p}}\mathrm{sin }M& 0& 0& 0& \sqrt{p}\mathrm{cos}M\end{array}\right)$$

and hence

$${\mathbf{C}}_{\mathbf{z}}=\mathbf{T}{\mathbf{C}}_{\mathrm{p}}{\mathbf{T}}^{\mathrm{T}}\in {\mathbb{M}}_{8\times 8}\left({\mathbb{R}}\right),$$

where \({\mathbf{C}}_{\mathrm{p}}\) is the covariance of the orbital elements, while \({\mathbf{C}}_{\mathbf{z}}\) is the covariance of the components in \(\mathbf{z}\).

In this case, we define the Mahalanobis distance to be

$${d}_{\mathrm{M}}\left({\mathbf{z}}_{1},{\mathbf{z}}_{2}\right)=\sqrt{{\left({\mathbf{z}}_{1}-{\mathbf{z}}_{2}\right)}^{\mathrm{T}}{\mathbf{C}}_{{\mathbf{z}}_{1}-{\mathbf{z}}_{2}}^{-1}\left({\mathbf{z}}_{1}-{\mathbf{z}}_{2}\right)},$$

where \({\mathbf{C}}_{{\mathbf{z}}_{1}-{\mathbf{z}}_{2}}={\mathbf{C}}_{{\mathbf{z}}_{1}}+{\mathbf{C}}_{{\mathbf{z}}_{2}}\).

8 Comparison of metrics

8.1 Criterion for metric evaluation

A possible criterion to evaluate and compare the metrics can be based on the statistical distribution of the distance values. The values of the squared Mahalanobis distance follow a chi-squared distribution \({\chi }_{k}^{2}\) with \(k\) degrees of freedom. This can be used for statistical gating with the integral of the distribution. It is possible to define a threshold for the distance between orbits below which two orbits are considered equal with a certain probability. The associated level of confidence is given by the cumulative function of \({\chi }_{k}^{2}\) at the threshold value. For example, for \(k=4\) a threshold of 9.5 gives about 95% confidence level.

The degrees of freedom usually correspond to the dimension of the space in which we calculate the Mahalanobis distance. This can be seen from the definition of the distance function. From (15) we know that \(\mathbf{x}-\mathbf{y}\) is normal distributed with a covariance \(\mathbf{S}={\mathbf{S}}_{\mathbf{x}}+{\mathbf{S}}_{\mathbf{y}}\). If we consider normal distributed random variables \(\mathbf{X}\), \(\mathbf{Y}\) and we denote with ~ a distribution equivalence, we can write

$${d}_{\mathrm{M}}^{2}\sim {\left(\mathbf{X}-\mathbf{Y}\right)}^{\mathrm{T}}{\mathbf{S}}^{-1}\left(\mathbf{X}-\mathbf{Y}\right) \sim {\mathbf{X}}^{\mathrm{T}}{\mathbf{S}}_{\mathbf{x}}^{-1}\mathbf{X}.$$

We have seen above the decomposition \({\mathbf{S}}_{\mathbf{x}}^{-1}={\mathbf{T}}^{-1}\mathbf{D}\mathbf{T}\) and we know that by definition the diagonal elements in \(\mathbf{D}\) scale the components in order to be standard normal distributed. Then, we can define a diagonal matrix \(\mathbf{D}{\prime}\) so that \(\mathbf{D}=\mathbf{D}{\prime}\mathbf{D}{\prime}\) where \({\mathbf{X}}^{\prime}=\mathbf{D}^{\prime}\mathbf{T}\mathbf{X}\) is standard normal distributed. Thus, we have:

$${d}_{\mathrm{M}}^{2}\sim {\mathbf{X}}^{\mathrm{T}}{\mathbf{S}}_{\mathbf{x}}^{-1}\mathbf{X}\sim {{\mathbf{X}}^{\mathbf{^{\prime}}}}^{\mathrm{T}}{\mathbf{X}}^{\mathbf{^{\prime}}} \sim {X}_{1}^{2}+{X}_{2}^{2}+\boldsymbol{ }\dots +{X}_{{\varvec{k}}}^{2}\boldsymbol{ }\boldsymbol{ }\boldsymbol{ }\sim \boldsymbol{ }\boldsymbol{ }\boldsymbol{ }{\chi }_{k}^{2}.$$

For example in the case of distances between state vectors in \({\mathbb{R}}^{6},\) we clearly have 6 degrees of freedom. This is consistent with the description of an orbit with six orbital parameters. However, in the Maruskin’s metric, the vectors have four components as defined in (18). In fact, the parameters \(i,\Omega , e, \omega \) span the two spheres \({S}^{2}\) but in the metric they characterize only two angular distances. Therefore, we expect in this case four degrees of freedom.

In the Kholshevnikov’s metric, we have defined in (22) a vector with eight components. On the other hand, we know that by definition \(\mathbf{u}\) and \(\mathbf{v}\) are always perpendicular and satisfy the condition \(\mathbf{u}\cdot \mathbf{v}=0\). Also, the direction of \(\mathbf{w}\) only depends on the mean anomaly \(M\). This means that we have two degrees of freedom less and we expect for this metric a total of six. Note that the above considerations about the \({\chi }_{k}^{2}\) distribution are valid only in the linear approximation. We assume in fact that the covariance matrix perfectly normalizes the Mahalanobis distance and that the components of the vectors \(\mathbf{X}\), \(\mathbf{Y}\) are normal distributed. In the truth the transformations necessary to represent the orbit w.r.t a specific coordinate system are often nonlinear. For example, in the conversion between orbital elements and state vector a linear transformation of the covariance matrix introduces a certain inaccuracy in the model, while the transformed orbits \(\mathbf{X}\), \(\mathbf{Y}\) are not normal distributed any longer. A consequence of this limitation is that the real distribution of the distance function can differ from the expected \({\chi }_{k}^{2}\) values.

As seen above, the knowledge of the distribution function is relevant to evaluate the confidence level for a given threshold. In this sense, it is beneficial to use a metric which best fits the theoretical distribution, or best represents the real conditions under the linearized model.

A common method to assess how good a distribution is matched by observations is the Pearson’s \({\chi }^{2}\) test. According to this method, we can calculate the test statistic \({X}_{0}^{2}\) as

$${X}_{0}^{2}=\sum_{i=1}^{m}\frac{{\left({n}_{i}-n{p}_{i}\right)}^{2}}{n{p}_{i}}$$

where \(m\) describes the number of classes/bins, \({n}_{i}\) the number of observations in class \(i\) from a total amount of \(n\) observations. The expected amount of observations for every bin is given by \(n{p}_{i}\) with \({p}_{i}\) being the probability according to the model distribution, in our case \({\chi }_{k}^{2}\). For our evaluation, we use a number of bins \(m=\) 100 so that \({X}_{0}^{2}\) follows a \({\chi }_{99}^{2}\) distribution, with \(m-1\) degrees of freedom. Pearson’s \({\chi }^{2}\) test states that a hypothesis should be rejected if \({\chi }_{1-\alpha , 99}^{2}<{X}_{0}^{2}\) at a significance level \(\alpha \). With \(\alpha =0.05,\) this would imply rejection for \({X}_{0}^{2}>\) 123. In the simulations, we will interpret values in this indicative order of magnitude as good fitted distributions.

8.2 Simulation of orbit distance

In order to compare the different proposed distance functions, we want to consider several representative orbits introducing a synthetic noise in the orbit elements. There are two scenarios in which the orbit distance can be applied. In one case, the previously calculated orbit of an unknown object has to be correlated or identified using the orbit distance with an object of an existing orbit catalogue. Here, we assume a refined orbit with low position and velocity uncertainty. In another scenario, the distance is used in an initial orbit determination procedure. A first orbit determined from observations at a given epoch is propagated in time to be compared with an orbit of the supposed same object from observations at a later epoch. In this situation, the first orbit determination provides an orbit with a high uncertainty and the propagation will even increase this especially in along-track direction into a curved shape. We want to consider both scenarios with orbits ranging from relatively low to high covariance values. For the second scenario, the propagation of the covariance, since it considers only linear terms, is not able to reproduce the real curved uncertainty. Therefore, to simulate this, we take a normal distributed sample of orbits around a nominal one and propagate every single sample orbit. In this way, we obtain a propagated uncertainty distribution without any linear approximation. Figure 3 left shows an example of a propagated position uncertainty after two revolutions. For illustration purposes, we consider a geostationary orbit with a large initial uncertainty of 10 km in position and 10 m/s in velocity w.r.t. radial, along-track and cross-track components. We can clearly recognize the expected “banana” shaped distribution of the red points indicating the propagated positions. It is interesting to note that after a fraction of the orbital revolution, the curved distribution is not aligned with the original orbit, but it is slightly rotated, as shown in Fig. 3 right. In fact, the uncertainty in along-track velocity leads to slightly eccentric orbits with perigee at the initial position and apogee on the opposite part after half revolution. While the perigee is in common for these orbits, the different apogee distance results in positions not along the nominal orbit. This shows that a propagation model purely based on the curved trajectory of the nominal orbit might be not accurate enough.

Fig. 3
figure 3

Propagation uncertainty after two revolutions (left) and after one and half revolution (right)

For the simulations, we generate a random sample of 10,000 data points, normal distributed around a given set of mean parameters. The number of data points is chosen according to the Pearson’s \({\chi }^{2}\) test to have enough statistical relevance (\(n{p}_{i}>1\)) also in the low probability bins (\({p}_{i}\approx 0.01\)). In a first evaluation, a basic set with \(a=42,\!000\) km, \(e=0.1\), \(i=1\) rad, \(\Omega =1\) rad, \(\upomega =1\) rad, and \(M=1\) rad is chosen. According to the two different orbit determination approaches presented in Olmedo et al. (2008), we can assume errors in position and velocity ranging from 10 km and 1 m/s up to 1000 km and 10 m/s. Thus, for the covariance of the normal distribution used for sampling, we can take the initial values 10 km and 1 m/s in radial, along-track, cross-track direction. We only consider the diagonal elements of the covariance matrix neglecting correlations between the parameters. The effect of the correlation part is expected to be less and less significant w.r.t. the main components of the matrix after a propagation in time. In fact, in the propagation the initial uncertainty spreads especially in the along-track component, in which after few hours a position error in the order of 100 km and after one revolution of more than 1000 km is possible. For further analysis, we select those cases where the along-track error is 10 km, 100 km (after around 6 h propagation), and 1000 km (after around 1.5 days propagation). In the radial and cross-track component, the error remains around 10 km and for the velocity components around 1 m/s. In addition, we consider one case with 10 km and 10 m/s errors. The choice is motivated by the fact that we expect to see a degradation of the theoretical distribution with the increase in the uncertainty in the position, especially along-track, as a consequence of the nonlinear error propagation along the orbit (see Fig. 3). Note that the propagation procedure is based on the formulation in Beutler (2005) and implies a propagated covariance w.r.t. orbital elements. Additional transformations from state vector to orbital elements and vice versa are necessary and can introduce further inaccuracies in the linear approximation.

The distance between the propagated random data points is pairwise evaluated and its probability density is plotted for visual comparison with the expected theoretical \({\chi }_{k}^{2}\) function. In a second step the stability of the distance function is evaluated by changing one single orbital element at a time while keeping the other ones unchanged. Errors of 10 km in position and 10 m/s in velocity and the same basic set of orbital parameters as above are considered except a larger \(e=0.4\) to emphasize differences between the metrics. Due to a larger number of simulations in this second step, the comparison will be based on the outputs of the Pearson’s \({\chi }^{2}\) test.

Table 1 summarizes the different results shown in the next subsections.

Table 1 Summary of results shown in the following subsections

8.3 Simulations with distance of state vectors

We first evaluate the distance between state vectors in the Euclidean space and with curvilinear coordinates. Here and in the subsequent simulations for comparison, we always consider the four cases discussed above with the following uncertainties in along-track position and velocity: 10 km, 1 m/s; 10 km, 10 m/s; 100 km, 1 m/s; 1000 km, 1 m/s. As specified above, the 100 km and 1000 km errors correspond to propagation intervals of around 6 h and 1.5 days, respectively. Figures 4 and 5 display the distribution obtained with the simulations of the distance in the Euclidean space in the four different cases. In red, the theoretical distribution is indicated. With a relatively small position error of 10 km, even with 10 m/s velocity uncertainty, the simulated distribution reflects well the ideal function. However, increasing the error to 100 km and 1000 km causes a deviation of the histogram from the \({\chi }^{2}\) trend and results in a more and more flattened shape including higher distance values. The situation is better for the distance in curvilinear coordinates illustrated in Figs. 6 and 7. Even in the case with 1000 km the histogram does not show any flattening tendency. However, already at small error values the peak of the distribution is shifted to shorter distances. This might be due to the slight rotation of the sample distribution w.r.t. the orbital trajectory explained above and indicated in Fig. 3 (right). For the latter case, no Pearson’s \({\chi }^{2}\) test with different orbital parameters is shown, since the test statistics \({X}_{0}^{2}\) always exceeds the rejection value.

Fig. 4
figure 4

Distribution of Mahalanobis distance between state vectors with 10 km, 1 m/s (left) and 10 km, 10 m/s (right) uncertainty

Fig. 5
figure 5

Distribution of Mahalanobis distance between state vectors with 100 km, 1 m/s (left) and 1000 km, 1 m/s (right) uncertainty

Fig. 6
figure 6

Distribution of Mahalanobis distance in curvilinear coordinates with 10 km, 1 m/s (left) and 10 km, 10 m/s (right) uncertainty

Fig. 7
figure 7

Distribution of Mahalanobis distance in curvilinear coordinates with 100 km, 1 m/s (left) and 1000 km, 1 m/s (right) uncertainty

8.4 Simulations with Maruskin’s metric

Simulations for the same cases above have been performed using the Mahalanobis distance following the Maruskin’s metric (Figs. 8 and 9). The distribution shape remains stable up to errors of 1000 km. No flattening or shift of the peak of the function is evident. Only the case with an error of 5000 km (obtained with 4.5 days propagation) in Fig. 10 exhibits the beginning of a degradation of the distribution. It seems that this distance definition is less sensitive to a higher positional uncertainty than the one between state vectors. The natural parameterization better describes the curved uncertainty and is not directly related to the curved orbital trajectory as for the distance in curvilinear coordinates. Nevertheless, let us remind that the formulation bases on the approximation with small angles in Eq. (17) and as such with higher uncertainty, and therefore larger angles, should also start to diverge from the optimal behaviour.

Fig. 8
figure 8

Distribution of Mahalanobis distance following Maruskin’s metric with 10 km, 1 m/s (left) and 10 km, 10 m/s (right) uncertainty

Fig. 9
figure 9

Distribution of Mahalanobis distance following Maruskin’s metric with 100 km, 1 m/s (left) and 1000 km, 1 m/s (right) uncertainty

Fig. 10
figure 10

Distribution of Mahalanobis distance following Maruskin’s metric with 5000 km, 1 m/s

In Fig. 11, the output \({X}_{0}^{2}\) of the Pearson’s \({\chi }^{2}\) test with different orbital parameters is shown. We see that the function matching remains relatively stable varying the parameters \(a\), \(\Omega \), \(\upomega \), and \(M\). The value of \({X}_{0}^{2}\) is below 300, in the order of magnitude of the upper bound for the 95% confidence level. The \({X}_{0}^{2}\) output becomes slightly higher with smaller semimajor axis values. This can be explained with the approximation in Eq. (17) which becomes more critical with a smaller orbital curvature radius. The test fails clearly for orbital inclinations \(i\) close to 0 and \(\pi \). For angles below 0.05 rad and above 3.1 rad, \({X}_{0}^{2}\) shows a very steep increase, visible in Fig. 11 on the right. The same happens with the eccentricity \(e\) which starts to rapidly increase with values below 0.1 and above 0.9 (Fig. 11 on the left). This behaviour is due to the characteristic of the used set of orbital parameters which is known to have singularities for \(i=0,i=\pi \), and \(e=0\), while if \(e\approx 1\) the singularity appears in the partial derivatives, e.g. visible in Eqs. (21) with a term \(1-{e}^{2}\) in the denominator.

Fig. 11
figure 11

Test statistic \(X_{0}^{2}\) indicating the matching of the expected distance distribution following Maruskin’s metric for different values of \(a\), \(e\) (left), and \(i\), \(\Omega\), \(\omega\), \(M\) (right). The \({\text{a}}_{{{\text{rel}}}}\) values are indicated according to \(a_{{{\text{rel}}}} = \frac{{a - a_{\min } }}{{a_{{{\text{max}}}} - a_{{{\text{min}}}} }}\) with \(a_{\min }= 8000\) km and \(a_{\max }= 50,\!000\) km

8.5 Simulations with Kholshevnikov’s metric

Also for the Mahalanobis distance following the Kholshevnikov’s metric (Figs. 12 and 13), we considered the same uncertainty as above. In this case, the shape of the distribution starts to become flat already with an error of 1000 km. This distance function is still better than the one using the standard Euclidean distance between state vectors. However, if compared with the simulation in curvilinear coordinates with 1000 km error, it does not look better. While in one case the peak is shifted to the left (Fig. 7, right), in the other its height is reduced (Fig. 13, right). It seems that, although the natural metric should better fit the real model, there is a degradation at higher errors. Recall that in this metric, we compute the Euclidean distance in the natural parameterization. We see here the advantages of this parameterization, but still over a certain level of position uncertainty, the drawback of using the Euclidean distance becomes noticeable. The same problems, but alleviated, seen in the Euclidean distance of state vectors arise, where the region described by the covariance used in the Mahalanobis distance does not follow the curved trajectory (see also 5.1).

Fig. 12
figure 12

Distribution of Mahalanobis distance following Kholshevnikov’s metric with 10 km, 1 m/s (left) and 10 km, 10 m/s (right) uncertainty

Fig. 13
figure 13

Distribution of Mahalanobis distance following Kholshevnikov’s metric with 100 km, 1 m/s (left) and 1000 km, 1 m/s (right) uncertainty

The results with different orbital parameters according to the Pearson’s \({\chi }^{2}\) test are illustrated in Fig. 14. Considerations similar to the ones for the Maruskin’s distance can be done in this case, with singularities for \(e\) equals to 0 or 1, and \(i\) equals to 0 or \(\pi \).

Fig. 14
figure 14

Test statistic \(X_{0}^{2}\) indicating the matching of the expected distance distribution following Kholshevnikov’s metric for different values of \(a\), \(e\) (left), and \(i\), \({\Omega }\), \(\omega\), \(M\) (right). The \(a_{{{\text{rel}}}}\) values are indicated according to \(a_{{{\text{rel}}}} = \frac{{a - a_{\min } }}{{a_{\max } - a_{\min } }}\) with \(a_{\min }\)= 8000 km and \(a_{\max }\)= 50,000 km

8.6 Summary of results

We evaluated four different Mahalanobis distances. They differ in the parameterization of the problem and in the metric used. It turned out that the combination of these two points characterize the fitting quality of the distance function to the theoretical model and so its applicability. The limiting problem lies in the description of the uncertainty using a linear model and not following the curved orbit trajectory. We have the following cases:

  • Euclidean distance of state vectors. The parameterization is based on state vectors and the distance is calculated in the standard Euclidean metric. This combination does not consider the curved trajectory. Results are good only up to 10 km error.

  • Distance of state vectors in curvilinear coordinates. The parameterization is based on state vectors but a modified “metric” through curvilinear coordinates is used. The overall stability is improved with no degradation even at higher errors above 1000 km. However, the simulated distribution appears slightly shifted even at low errors. Curvilinear coordinates take nonlinear effects into account only to a certain extent.

  • Distance following Kholshevnikov’s metric. The natural parameterization is based on angular momentum and Laplace–Runge–Lenz vector, while the distance is computed in the Euclidean metric. The parameters do not directly describe position and velocity, therefore not directly related to the orbital trajectory. Nevertheless, the distance between vectors is evaluated in the Euclidean metric and does not consider the spherical symmetry of the space spanned by angular momentum and Laplace–Runge–Lenz vectors. The simulations fit very well the model and degradation starts only at an error of 1000 km.

  • Distance following Maruskin’s metric. This distance is based on natural parameters in a Riemannian metric and it is only concerned by the linearization in Eq. (17). The match with the expected model is very good beyond the 1000 km case and it starts to diverge with errors above 5000 km.

Beside the dependency from the orbit uncertainty, the \({\chi }^{2}\) tests show that both Maruskin’s and Kholshevnikov’s metrics rapidly degrade close to singularities in eccentricity and inclination. This clearly limits the applicability of these metrics for nearly circular (in our case with \(e<\) 0.1), equatorial (\(i<\) 0.05 rad and \(i>\) 3.1 rad), and highly eccentric (\(e>\) 0.9) orbits. This limitation is relevant in a general context of space surveillance, since a large amount of active satellites in Low Earth Orbits (LEO) have eccentricities below 0.001 and many geostationary satellites, for example, have inclinations smaller than 0.01 rad. However, if we look at the definition of the metrics in Eqs. (6) and (11), we see that these functions do not implicitly exhibit any singularities. The latter are introduced with the choice of the orbital parameter set. Using non-singular orbital elements would be a possibility to extend the applicability of the proposed approach. In Vananti and Schildknecht (2019), the authors were using the argument of latitude instead of the mean anomaly, but the benefits of this choice were not further investigated in relation to possible singularities.

9 Conclusions

In different applications of celestial mechanics, it is useful to define the concept of distance between orbits. In this work, we consider two metrics proposed by Maruskin and Kholshevnikov. Both follow a natural parameterization based on angular momentum and Laplace–Runge–Lenz vectors. While the former computes the distance in a Riemannian space, the latter operates in the standard Euclidean metric. We extended the formulation of the two distances to consider the mean anomaly as a sixth parameter in the description of the orbits. In addition, the two definitions were expressed in normalized form adopting the concept of Mahalanobis distance. The latter implicitly assumes a linear model in the description of the statistical uncertainty. However, this is an approximation and does not reflect the reality, e.g. where it is known that the orbital position uncertainty tends to follow a curved trajectory along the orbit. While the theoretical distribution of Mahalanobis distances follows a known statistics (\({\chi }_{k}^{2}\) function), the above formulated distances slightly diverge from the ideal distribution due to this approximation. The extent of the divergence is a measure of the quality and applicability of the defined distance functions. In fact, if the modified distribution is closer to the theoretical one, it is possible to better evaluate the confidence level of related quantities, e.g. for statistical gating. Simulations with the two above distances and, as reference, the commonly used distances between state vectors in standard and curvilinear coordinates, were performed. For the simulations, a random sample of data points, normal distributed around given sets of orbital parameters, was generated and propagated in time to obtain representative uncertainty distributions of hypothetical space objects in position and velocity. The Mahalanobis distance between the propagated random data points was pairwise evaluated and its probability density was compared with the expected \({\chi }_{k}^{2}\) function. The simulations show that the choice of the parameterization and the metric used significantly influences the matching of the theoretical distribution. This can be explained by the fact that certain parameterizations or metrics are more or less prone to the effects of the above mentioned linear approximation. The parameterization based on state vectors is directly related to the orbital trajectory and can only take it into account to a limited extent introducing in best case curvilinear coordinates. Instead, the distances following Maruskin and Kholshevnikov are based on angular momentum and Laplace–Runge–Lenz vectors and therefore not directly related to the orbital trajectory. This condition improves both formulations regarding the expected distribution. However, the Riemannian metric proposed by Maruskin better considers the spherical symmetry of the space spanned by the latter vectors, and the simulations show a higher degree of matching with the expected model up to an uncertainty of about 5000 km in along-track direction. Although the results obtained with the latter definition of distance are promising, further research is necessary to better assess other possible metrics based on the natural parameterization, e.g. evaluating different weights assigned to the parameter space. The metric normalization formulated as a Mahalanobis distance poses further difficulties and in the present work this problem was solved introducing an approximation. However, other normalization approaches, e.g. based on other statistical distances, should be investigated in the future. Furthermore, the formulation of the two metrics in terms of standard orbital elements is very sensitive to the singularities occurring with this parameter set in the case of near-circular and equatorial orbits. To extend the applicability of the proposed approach to all orbits, further research is necessary, e.g. in the formulation of the problem using non-singular orbit elements.