Addressing non-normality in multivariate analysis using the t-distribution

Original Paper, AStA Advances in Statistical Analysis

Abstract

The main aim of this paper is to propose a set of tools for assessing non-normality within the class of multivariate t-distributions. Assuming that second moments exist, we consider a reparameterized version of the usual t-distribution in which the scale matrix coincides with the covariance matrix of the distribution. We use the local influence procedure and the Kullback–Leibler divergence measure to propose quantitative methods for evaluating deviations from the normality assumption. In addition, possible non-normality due to the presence of both skewness and heavy tails is also explored. Our findings, based on two real datasets, are complemented by a simulation study that evaluates the performance of the proposed methodology in finite samples.

References

  • Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. Dover, New York (1970)
  • Anderson, T.W.: An Introduction to Multivariate Statistical Analysis. Wiley, New York (2003)
  • Arellano-Valle, R.B., Contreras-Reyes, J., Genton, M.: Shannon entropy and mutual information for multivariate skew-elliptical distributions. Scandinavian J. Stat. 40, 42–62 (2012)
  • Arellano-Valle, R.B., Ferreira, C.S., Genton, M.G.: Scale and shape mixtures of multivariate skew-normal distributions. J. Multivariate Anal. 166, 98–110 (2018)
  • Azzalini, A., Genton, M.G.: Robust likelihood methods based on the skew-\(t\) and related distributions. Int. Stat. Rev. 76, 106–129 (2008)
  • Bolfarine, H., Galea, M.: On structural comparative calibration under a \(t\)-model. Comput. Stat. 11, 63–85 (1996)
  • Bodnar, T., Gupta, A.K., Parolya, N.: On the strong convergence of the optimal linear shrinkage estimator for large dimensional covariance matrix. J. Multivariate Anal. 132, 215–228 (2014)
  • Contreras-Reyes, J., Arellano-Valle, R.: Kullback-Leibler divergence measure for multivariate skew-normal distributions. Entropy 14, 1606–1626 (2012)
  • Cook, R.D.: Assessment of local influence (with discussion). J. R. Stat. Soc. B 48, 133–169 (1986)
  • Dykstra, R.L.: Establishing the positive definiteness of the sample covariance matrix. Ann. Math. Stat. 41, 2153–2154 (1970)
  • Fang, K.T., Zhang, Y.T.: Generalized Multivariate Analysis. Springer, Berlin (1990)
  • Feng, D., Baumgartner, R., Svetnik, V.: A robust Bayesian estimate of the concordance correlation coefficient. J. Biopharm. Stat. 25, 490–507 (2015)
  • Fiorentini, G., Sentana, E., Calzolari, G.: Maximum likelihood estimation and inference in multivariate conditionally heteroscedastic dynamic regression models with Student \(t\) innovations. J. Bus. Econ. Stat. 21, 532–546 (2003)
  • Galea, M., Cademartori, D., Curci, R., Molina, A.: Robust inference in the capital asset pricing model using the multivariate \(t\)-distribution. J. Risk Financ. Manage. 13, 123 (2020)
  • Gao, J., Zhang, B.: Estimation of seismic wavelets based on the multivariate scale mixture of Gaussian model. Entropy 12, 14–33 (2010)
  • Gómez-Villegas, M.A., Gómez-Sánchez-Manzano, E., Maín, P., Navarro, H.: The effect of non-normality in the Power Exponential distribution. In: Pardo, L., Balakrishnan, N., Gil, M.A. (eds.) Modern Mathematical Tools and Techniques in Capturing Complexity, pp. 119–129. Springer-Verlag, Berlin (2011)
  • Gupta, A.K.: Multivariate skew \(t\)-distribution. Statistics 37, 359–363 (2003)
  • Gupta, A.K., Varga, T., Bodnar, T.: Elliptically Contoured Models in Statistics and Portfolio Theory, 2nd edn. Springer, New York (2013)
  • Härdle, W.K., Simar, L.: Applied Multivariate Statistical Analysis, 3rd edn. Springer, New York (2012)
  • Kent, J.T., Tyler, D.E., Vardi, Y.: A curious likelihood identity for the multivariate \(t\)-distribution. Commun. Stat. Simul. Comput. 23, 441–453 (1994)
  • Kent, J.T., Tyler, D.E.: Redescending \(M\)-estimates of multivariate location and scatter. Ann. Stat. 19, 2102–2119 (1991)
  • Kim, H.M., Mallick, B.K.: Moments of random vectors with skew \(t\) distribution and their quadratic forms. Stat. Prob. Lett. 63, 417–423 (2003)
  • Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)
  • Lange, K., Little, R.J.A., Taylor, J.M.G.: Robust statistical modeling using the \(t\) distribution. J. Am. Stat. Assoc. 84, 881–896 (1989)
  • Leal, C., Galea, M., Osorio, F.: Assessment of local influence for the analysis of agreement. Biometrical J. 61, 955–972 (2019)
  • Ledoit, O., Wolf, M.: A well-conditioned estimator for large-dimensional covariance matrices. J. Multivariate Anal. 88, 365–411 (2004)
  • Ledoit, O., Wolf, M.: Analytical nonlinear shrinkage of large-dimensional covariance matrices. Ann. Stat. 48, 3043–3065 (2020)
  • Magnus, J.R., Neudecker, H.: Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, Chichester (1999)
  • Mardia, K.V.: Measures of multivariate skewness and kurtosis with applications. Biometrika 57, 519–530 (1970)
  • Mardia, K.V.: Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhyā Ser. B 36, 115–128 (1974)
  • Maronna, R.A.: Robust \(M\)-estimators of multivariate location and scatter. Ann. Stat. 4, 51–67 (1976)
  • Poon, W., Poon, Y.S.: Conformal normal curvature and assessment of local influence. J. R. Stat. Soc. B 61, 51–61 (1999)
  • Serfling, R.J.: Approximation Theorems of Mathematical Statistics. Wiley, New York (2009)
  • Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)
  • Song, P.X.K., Zhang, P., Qu, A.: Maximum likelihood inference in robust linear mixed-effects models using the multivariate \(t\) distributions. Stat. Sin. 17, 929–943 (2007)
  • Sutradhar, B.C.: Score test for the covariance matrix of elliptical \(t\)-distribution. J. Multivariate Anal. 46, 1–12 (1993)
  • Svetnik, V., Ma, J., Soper, K.A., Doran, S., Renger, J.J., Deacon, S., Koblan, K.S.: Evaluation of automated and semi-automated scoring of polysomnographic recordings from a clinical trial using zolpidem in the treatment of insomnia. Sleep 30, 1562–1574 (2007)
  • Wilson, E.B., Hilferty, M.M.: The distribution of chi-square. Proc. Nat. Acad. Sci. United States Am. 17, 684–688 (1931)
  • Zhu, H.T., Lee, S.Y.: Local influence for incomplete-data models. J. R. Stat. Soc. B 63, 111–126 (2001)
  • Zhu, H., Ibrahim, J.G., Lee, S., Zhang, H.: Perturbation selection and influence measures in local influence analysis. Ann. Stat. 35, 2565–2588 (2007)

Acknowledgements

This work was supported by Comisión Nacional de Investigación Científica y Tecnológica, FONDECYT Grants 1140580 and 1150325. The authors are grateful to the associate editor and the anonymous reviewers for their valuable comments and suggestions, and to Carla Leal and Ronny Vallejos, who carefully read the initial version of the manuscript. Their input led to an improved paper.

Author information

Corresponding author

Correspondence to Felipe Osorio.

Ethics declarations

Conflict of interest

The authors have declared no conflict of interest.

Supplementary information

This material is subdivided into two sections. First, we present basic properties of the multivariate t-distribution introduced by Sutradhar (1993). Then, a detailed description of the maximum likelihood estimation procedure considering an EM algorithm is provided.

Appendices

Appendix A: The expected information matrix

We use the results shown in the Supplementary Material and note that the Fisher information matrix for \(\varvec{\theta }\) can be written as

$$\begin{aligned} \varvec{{\mathcal {I}}}(\varvec{\theta }) = \frac{1}{n}\sum _{i=1}^n {{\,\mathrm{E}\,}}\{\varvec{U}_i (\varvec{\theta }) \varvec{U}_i^\top (\varvec{\theta })\}, \end{aligned}$$

where the score function \(\varvec{U}_i(\varvec{\theta }) =(\varvec{U}_i^\top (\varvec{\mu }),\varvec{U}_i^\top (\varvec{\phi }), U_i(\eta ))^\top\) associated with the ith component of the log-likelihood, with \(i=1,\dots ,n\), is defined in Equations (4)-(6), and the expected value \({{\,\mathrm{E}\,}}(\cdot )\) is taken with respect to the density function in (1). Next, we obtain each of the blocks of the Fisher information matrix reported in Equation (7).

From the score functions (4) and (5), it follows that

$$\begin{aligned} {{\,\mathrm{E}\,}}\{\varvec{U}_i(\varvec{\mu })\varvec{U}_i^\top (\varvec{\mu })\}&= c_\mu (\eta )\varvec{\Sigma }^{-1},\\ {{\,\mathrm{E}\,}}\{\varvec{U}_i(\varvec{\phi })\varvec{U}_i^\top (\varvec{\phi })\}&= \frac{1}{4}\varvec{D}_p^\top \left\{ 2c_\phi (\eta ) (\varvec{\Sigma }^{-1}\otimes \varvec{\Sigma }^{-1})\varvec{N}_p\right. \\&\quad \left. + (c_\phi (\eta ) - 1)({{\,\mathrm{vec}\,}}\varvec{\Sigma }^{-1}) ({{\,\mathrm{vec}\,}}\varvec{\Sigma }^{-1})^\top \right\} \varvec{D}_p, \end{aligned}$$

where \(c_\mu (\eta )=c_\phi (\eta )/(1-2\eta )\), \(c_\phi (\eta )=(1+p\eta )/(1+(p+2)\eta )\). Note that \(c_\mu (\eta )\) and \(c_\phi (\eta )\rightarrow 1\) when \(\eta \rightarrow 0\). On the other hand, we have that \(\varvec{N}_p\varvec{D}_p=\varvec{D}_p\) (see Magnus and Neudecker 1999), which produces the expressions corresponding to the normal case.
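The role of \(\varvec{N}_p\varvec{D}_p=\varvec{D}_p\) in recovering the normal case can be checked numerically. The following is a minimal sketch, assuming the usual column-major \({{\,\mathrm{vec}\,}}\) convention, \(\varvec{N}_p = (\varvec{I} + \varvec{K}_p)/2\) with \(\varvec{K}_p\) the commutation matrix, and \(\varvec{D}_p\) the duplication matrix; the helper names and the test matrix `Sigma` are illustrative and do not come from the paper.

```python
import numpy as np

def commutation_matrix(p):
    """K_p with K_p vec(A) = vec(A.T), using column-major vec."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[j * p + i, i * p + j] = 1.0
    return K

def duplication_matrix(p):
    """D_p with D_p vech(A) = vec(A) for a symmetric p x p matrix A."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            D[j * p + i, col] = 1.0  # position of A[i, j] in vec(A)
            D[i * p + j, col] = 1.0  # position of A[j, i] in vec(A)
            col += 1
    return D

def c_phi(eta, p):
    return (1 + p * eta) / (1 + (p + 2) * eta)

def c_mu(eta, p):
    return c_phi(eta, p) / (1 - 2 * eta)

def phi_block(Sigma, eta):
    """E{U_i(phi) U_i(phi)^T} as displayed above (phi = vech(Sigma))."""
    p = Sigma.shape[0]
    S_inv = np.linalg.inv(Sigma)
    D = duplication_matrix(p)
    N = 0.5 * (np.eye(p * p) + commutation_matrix(p))
    v = S_inv.reshape(-1, 1, order="F")  # vec(Sigma^{-1})
    cphi = c_phi(eta, p)
    inner = 2 * cphi * np.kron(S_inv, S_inv) @ N + (cphi - 1) * (v @ v.T)
    return 0.25 * D.T @ inner @ D

Sigma = np.array([[2.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.5]])
p = Sigma.shape[0]
D = duplication_matrix(p)
N = 0.5 * (np.eye(p * p) + commutation_matrix(p))
S_inv = np.linalg.inv(Sigma)
normal_block = 0.5 * D.T @ np.kron(S_inv, S_inv) @ D  # Gaussian information for vech(Sigma)

print(np.allclose(N @ D, D))
print(np.allclose(phi_block(Sigma, 0.0), normal_block))
```

At \(\eta = 0\) the second term of the block vanishes because \(c_\phi (0) = 1\), and the first term collapses to the familiar Gaussian expression once \(\varvec{N}_p\) is absorbed into \(\varvec{D}_p\).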

Therefore, it is clear that

$$\begin{aligned} \frac{\partial v_i}{\partial \eta } = (1-2\eta )^{-3}\left\{ (p+2) (1-2\eta )q_i^{-1} - (1 + \eta p)q_i^{-2}\delta _i^2\right\} , \end{aligned}$$

with \(q_i = 1+c(\eta )\delta _i^2\). Thus, we obtain

$$\begin{aligned} \frac{\partial \varvec{U}_i(\varvec{\mu })}{\partial \eta }&= (1-2\eta )^{-3}\varvec{\Sigma }^{-1} \left\{ (p+2)(1 - 2\eta ) q_i^{-1}\varvec{Z}_i - (1 + \eta p)q_i^{-2}\delta _i^2\varvec{Z}_i\right\} , \\ \frac{\partial \varvec{U}_i(\varvec{\phi })}{\partial \eta }&=\frac{1}{2}(1 - 2\eta )^{-3}\varvec{D}_p^\top {{\,\mathrm{vec}\,}}\left\{ \varvec{\Sigma }^{-1}\left( (p+2)(1 - 2\eta )q_i^{-1}\varvec{Z}_i\varvec{Z}_i^{\top }\right. \right. \\&\quad \left. \left. - (1 + \eta p)q_i^{-2}\delta _i^2\varvec{Z}_i\varvec{Z}_i^{\top }\right) \varvec{\Sigma }^{-1}\right\} . \end{aligned}$$
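The derivative of the weight \(v_i\) can be checked by finite differences. The weight itself is defined in the main text rather than in this appendix, so the sketch below assumes \(v_i = (1+p\eta )/\{(1-2\eta )q_i\}\) with \(c(\eta ) = \eta /(1-2\eta )\); both are assumptions chosen to be consistent with the displayed derivative, and the function names are illustrative.

```python
import numpy as np

def c(eta):
    # Assumed form of c(eta); consistent with the expansions later in this appendix.
    return eta / (1.0 - 2.0 * eta)

def weight(eta, delta2, p):
    # Assumed definition of v_i (not stated in this appendix).
    q = 1.0 + c(eta) * delta2
    return (1.0 + p * eta) / ((1.0 - 2.0 * eta) * q)

def dweight_deta(eta, delta2, p):
    # Derivative of v_i with respect to eta, as displayed above.
    q = 1.0 + c(eta) * delta2
    return (1.0 - 2.0 * eta) ** -3 * (
        (p + 2) * (1.0 - 2.0 * eta) / q - (1.0 + eta * p) * delta2 / q**2
    )

eta, delta2, p, h = 0.1, 2.5, 3, 1e-6
numeric = (weight(eta + h, delta2, p) - weight(eta - h, delta2, p)) / (2 * h)
analytic = dweight_deta(eta, delta2, p)
print(numeric, analytic)
```

Under these assumptions the two numbers agree to the accuracy of the central difference.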

By applying Lemmas 3 and 5 from Supplementary Material, it follows that

$$\begin{aligned} {{\,\mathrm{E}\,}}\left\{ \frac{\partial \varvec{U}_i(\varvec{\mu })}{\partial \eta }\right\}&= \varvec{0}, \\ {{\,\mathrm{E}\,}}\left\{ \frac{\partial \varvec{U}_i(\varvec{\phi })}{\partial \eta }\right\}&= \frac{c(\eta )(p+2)}{(1 + \eta p)(1 + (p+2)\eta )}\varvec{D}_p^\top {{\,\mathrm{vec}\,}}(\varvec{\Sigma }^{-1}), \end{aligned}$$

for \(i=1,\dots ,n\). The score function for \(\eta\) can be written as,

$$\begin{aligned} U_i(\eta )&= \frac{1}{2\eta ^2}\left\{ pc(\eta ) + \psi \left( \frac{1}{2\eta }\right) -\psi \left( \frac{1+p\eta }{2\eta }\right) - \frac{1+p\eta }{1-2\eta } \frac{c(\eta )\delta _i^2}{1 + c(\eta )\delta _i^2} +\log (1+c(\eta )\delta _i^2)\right\} , \\&= \frac{1}{2\eta ^2}\left\{ \log (1+ Q_{i\eta }) -\left( \psi \left( \frac{1+p\eta }{2\eta }\right) - \psi \left( \frac{1}{2\eta }\right) \right) -\left( \frac{1+p\eta }{1-2\eta }\frac{Q_{i\eta }}{1 + Q_{i\eta }} - pc(\eta )\right) \right\} , \end{aligned}$$

where \(Q_{i\eta } = c(\eta ) \delta _i^2\sim \chi _p^2/\chi _{1/\eta }^2\), \({{\,\mathrm{E}\,}}\{Q_{i\eta } (1+Q_{i\eta })^{-1}\}=\frac{p\eta }{1+p\eta }\) and \({{\,\mathrm{E}\,}}\{\log (1 + Q_{i\eta })\}=\psi \left( \frac{1+p\eta }{2\eta }\right) -\psi \left( \frac{1}{2\eta }\right)\). Let \(U_1 =\log (1+Q_{i\eta })\), \(U_2 =\frac{1+p\eta }{1 - 2\eta }\frac{Q_{i\eta }}{1+Q_{i\eta }}\), \(\overline{U}_1 = U_1 - {{\,\mathrm{E}\,}}(U_1)\) and \(\overline{U}_2 = U_2 -{{\,\mathrm{E}\,}}(U_2)\). Then,

$$\begin{aligned} {{\,\mathrm{E}\,}}\{U_i(\eta )\}&= \frac{1}{2\eta ^2}{{\,\mathrm{E}\,}}\{(U_1 - {{\,\mathrm{E}\,}}(U_1)) - (U_2 - {{\,\mathrm{E}\,}}(U_2))\} = 0,\\ {{\,\mathrm{var}\,}}\{U_i(\eta )\}&= \frac{1}{4\eta ^4}\{{{\,\mathrm{E}\,}}(\overline{U}_1^2) -2{{\,\mathrm{E}\,}}(\overline{U}_1 \overline{U}_2) + {{\,\mathrm{E}\,}}(\overline{U}_2^2)\} \\&= \frac{1}{4\eta ^4}\{{{\,\mathrm{var}\,}}(\overline{U}_1) -2{{\,\mathrm{Cov}\,}}(\overline{U}_1,\overline{U}_2) + {{\,\mathrm{var}\,}}(\overline{U}_2)\} \\&= \frac{1}{4\eta ^4}\{{{\,\mathrm{E}\,}}(U_1^2) - {{\,\mathrm{E}\,}}^2(U_1) - 2({{\,\mathrm{E}\,}}(U_1 U_2) -{{\,\mathrm{E}\,}}(U_1){{\,\mathrm{E}\,}}(U_2)) + {{\,\mathrm{E}\,}}(U_2^2)-{{\,\mathrm{E}\,}}^2(U_2)\}. \end{aligned}$$
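The two expectations used above, and the fact that the score for \(\eta\) has mean zero, can be checked by simulation. This is a rough Monte Carlo sketch that draws \(Q_{i\eta }\) as a ratio of independent chi-square variables, as stated in the text; it assumes \(c(\eta ) = \eta /(1-2\eta )\), and the sample size, seed and parameter values are arbitrary choices.

```python
import numpy as np
from scipy.special import digamma

def c(eta):
    # Assumed form of c(eta), consistent with the expansions in this appendix.
    return eta / (1.0 - 2.0 * eta)

def score_eta(Q, eta, p):
    """Second form of U_i(eta) displayed above, as a function of Q = c(eta) * delta_i^2."""
    centred_log = np.log1p(Q) - (
        digamma((1 + p * eta) / (2 * eta)) - digamma(1 / (2 * eta))
    )
    centred_ratio = (1 + p * eta) / (1 - 2 * eta) * Q / (1 + Q) - p * c(eta)
    return (centred_log - centred_ratio) / (2 * eta**2)

rng = np.random.default_rng(0)
eta, p, n = 0.1, 3, 200_000

# Q ~ chi2_p / chi2_{1/eta}, with independent numerator and denominator.
Q = rng.chisquare(p, n) / rng.chisquare(1 / eta, n)

mc_ratio = np.mean(Q / (1 + Q))
exact_ratio = p * eta / (1 + p * eta)
mc_log = np.mean(np.log1p(Q))
exact_log = digamma((1 + p * eta) / (2 * eta)) - digamma(1 / (2 * eta))
mc_score_mean = np.mean(score_eta(Q, eta, p))

print(mc_ratio, exact_ratio)
print(mc_log, exact_log)
print(mc_score_mean)
```

The simulated averages should match the closed forms up to Monte Carlo error, and the average score should be close to zero.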

Using the fact that \(\psi (x+1) = \psi (x)+1/x\) we have

$$\begin{aligned} {{\,\mathrm{E}\,}}(U_1^2)&= {{\,\mathrm{E}\,}}\{(\log (1+Q_{i\eta }))^2\} = {{\,\mathrm{E}\,}}^2(U_1) - \psi '\left( \frac{1+p\eta }{2\eta }\right) + \psi '\left( \frac{1}{2\eta }\right) , \\ {{\,\mathrm{E}\,}}(U_1 U_2)&= \frac{1+p\eta }{1-2\eta }{{\,\mathrm{E}\,}}\{Q_{i\eta } (1 + Q_{i\eta })^{-1}\log (1 + Q_{i\eta })\} \\&= \left\{ {{\,\mathrm{E}\,}}(U_1) + \frac{2\eta }{1+p\eta }\right\} {{\,\mathrm{E}\,}}(U_2), \\ {{\,\mathrm{E}\,}}(U_2^2)&= \left( \frac{1+p\eta }{1-2\eta }\right) ^2 {{\,\mathrm{E}\,}}\{Q_{i\eta }^2(1+Q_{i\eta })^{-2}\} \\&= \frac{p+2}{p}\frac{1+p\eta }{1+(p+2)\eta }{{\,\mathrm{E}\,}}(U_2)^2. \end{aligned}$$

Finally, we have that the expected information in relation to \(\eta\) is given by,

$$\begin{aligned} {{\,\mathrm{var}\,}}\{U_i(\eta )\}&= \frac{1}{4\eta ^4}\left\{ -\psi '\left( \frac{1+p\eta }{2\eta }\right) + \psi '\left( \dfrac{1}{2\eta }\right) - \frac{4\eta }{1+p\eta }{{\,\mathrm{E}\,}}(U_2) \right. \\&\quad \left. + \frac{p+2}{p}\frac{1+p\eta }{1+(p+2)\eta }{{\,\mathrm{E}\,}}(U_2)^2 - {{\,\mathrm{E}\,}}(U_2)^2\right\} \\&= \frac{1}{4\eta ^4}\left\{ \psi '\left( \frac{1}{2\eta }\right) - \psi ' \left( \frac{1+p\eta }{2\eta }\right) + 2pc(\eta )^2\left( \frac{4(p+2) \eta ^2 - p\eta -1}{(1+p\eta )(1+(p+2)\eta )}\right) \right\} . \end{aligned}$$
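The closed form for \({{\,\mathrm{var}\,}}\{U_i(\eta )\}\) can be compared with a direct numerical integration. The sketch below uses the fact that \(B = (1+Q_{i\eta })^{-1}\) follows a Beta distribution with parameters \(1/(2\eta )\) and \(p/2\) when \(Q_{i\eta }\sim \chi _p^2/\chi _{1/\eta }^2\), so that \(U_1 = -\log B\) and \(U_2 \propto 1 - B\). It assumes \(c(\eta ) = \eta /(1-2\eta )\), and the function names are illustrative.

```python
import numpy as np
from scipy import integrate, special, stats

def c(eta):
    # Assumed form of c(eta), consistent with the expansions in this appendix.
    return eta / (1.0 - 2.0 * eta)

def var_score_closed_form(eta, p):
    """Final expression for var{U_i(eta)} displayed above."""
    tail = (
        2 * p * c(eta) ** 2
        * (4 * (p + 2) * eta**2 - p * eta - 1)
        / ((1 + p * eta) * (1 + (p + 2) * eta))
    )
    trigamma_diff = special.polygamma(1, 1 / (2 * eta)) - special.polygamma(
        1, (1 + p * eta) / (2 * eta)
    )
    return (trigamma_diff + tail) / (4 * eta**4)

def var_score_quadrature(eta, p):
    """var{U_i(eta)} by integrating U_1 - U_2 against the Beta density of B."""
    dist = stats.beta(1 / (2 * eta), p / 2)
    k = (1 + p * eta) / (1 - 2 * eta)

    def g(x):
        return -np.log(x) - k * (1 - x)  # U_1 - U_2 written in terms of B = x

    m1 = integrate.quad(lambda x: g(x) * dist.pdf(x), 0, 1)[0]
    m2 = integrate.quad(lambda x: g(x) ** 2 * dist.pdf(x), 0, 1)[0]
    return (m2 - m1**2) / (4 * eta**4)

eta, p = 0.1, 3
closed = var_score_closed_form(eta, p)
numeric = var_score_quadrature(eta, p)
print(closed, numeric)
```

Agreement between the two values supports the algebra leading from the moments of \(U_1\) and \(U_2\) to the final expression.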

From the expansions (see Abramowitz and Stegun 1970, Sec. 6.4.12, for the first),

$$\begin{aligned} \psi '(x)&= \frac{1}{x} + \frac{1}{2x^2} + \frac{1}{6x^3} + O\left( \frac{1}{x^5}\right) \quad \text {as} \quad x\rightarrow \infty , \\ (1+ax)^{-k}&= 1-k a x + \frac{k(k+1)}{2}a^2x^2 -\frac{k(k+1)(k+2)}{6}a^3x^3 + O(x^4) \quad \text {as} \quad x\rightarrow 0, \end{aligned}$$

we find as \(\eta \rightarrow 0\) that,

$$\begin{aligned} \psi '\left( \frac{1}{2\eta }\right) - \psi '\left( \frac{1+p\eta }{2\eta }\right)&= 2\eta +2\eta ^2 + \frac{4}{3}\eta ^3 - 2\eta (1+p\eta )^{-1} - 2\eta ^2(1+p\eta )^{-2} \\&\quad - \frac{4}{3}\eta ^3(1+p\eta )^{-3} + O(\eta ^5) \\&= 2p\eta ^2 - (2p^2-4p)\eta ^3 + (2p^3-6p^2+4p)\eta ^4 + O(\eta ^5). \end{aligned}$$

Similarly,

$$\begin{aligned} 2pc(\eta )^2&\left( \frac{4(p+2)\eta ^2 - p\eta -1}{(1+p\eta )(1 + (p + 2)\eta )}\right) = \frac{2p\eta ^2(4(p+2)\eta ^2 - p\eta - 1)}{(1 - 2\eta )^2(1 + p\eta )(1 + (p+2)\eta )} \\&= (8p(p+2)\eta ^4 - 2p^2\eta ^3 - 2p\eta ^2)(1 + 4\eta + 12\eta ^2 + O(\eta ^3)) \\&\quad \times (1 - p\eta + p^2\eta ^2 + O(\eta ^3))(1 - (p+2)\eta + (p+2)^2\eta ^2 + O(\eta ^3)) \\&= -2p\eta ^2 + (2p^2 - 4p)\eta ^3 + (8p^2 - 2p^3)\eta ^4 + O(\eta ^5). \end{aligned}$$

Adding the two expansions, the terms of order \(\eta ^2\) and \(\eta ^3\) cancel and the coefficient of \(\eta ^4\) is \((2p^3-6p^2+4p) + (8p^2-2p^3) = 2p^2+4p\). Hence,

$$\begin{aligned} {{\,\mathrm{var}\,}}\{U_i(\eta )\} = \frac{1}{4}(2p^2+4p) + O(\eta ) =\frac{p(p+2)}{2} + O(\eta ). \end{aligned}$$
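The limiting behaviour as \(\eta \rightarrow 0\) can also be inspected numerically by evaluating the exact expression for \({{\,\mathrm{var}\,}}\{U_i(\eta )\}\) at decreasing values of \(\eta\). This is only a sanity check under the assumption \(c(\eta ) = \eta /(1-2\eta )\); the leading terms cancel, so very small values of \(\eta\) are avoided to limit round-off error.

```python
from scipy.special import polygamma

def var_score_eta(eta, p):
    """Exact var{U_i(eta)} from the closed form derived earlier in this appendix."""
    c = eta / (1.0 - 2.0 * eta)  # assumed form of c(eta)
    trigamma_diff = polygamma(1, 1 / (2 * eta)) - polygamma(
        1, (1 + p * eta) / (2 * eta)
    )
    tail = (
        2 * p * c**2
        * (4 * (p + 2) * eta**2 - p * eta - 1)
        / ((1 + p * eta) * (1 + (p + 2) * eta))
    )
    return float(trigamma_diff + tail) / (4 * eta**4)

for p in (1, 2, 5):
    values = [var_score_eta(eta, p) for eta in (1e-1, 1e-2, 1e-3)]
    print(p, values)
```

The printed sequences stabilise as \(\eta\) decreases, which gives a numerical value to compare against the limiting constant.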

Appendix B: Non-normality due to asymmetry

Another source of non-normality is possible asymmetry in the observations. Shannon entropy, Kullback–Leibler divergence and mutual information for multivariate skew-elliptical distributions have been considered in the literature; see, for instance, Arellano-Valle et al. (2012) and Contreras-Reyes and Arellano-Valle (2012). We summarize some of these results for the multivariate skew-normal and skew-t distributions below.

Following Arellano-Valle et al. (2012), we say that a random vector \(\varvec{Z}\in {\mathbb {R}}^p\) has a skew-normal distribution with location vector \(\varvec{\xi }\in {\mathbb {R}}^p\), dispersion matrix \(\varvec{\Omega } > 0\) and shape/skewness parameter \(\varvec{\gamma }\in {\mathbb {R}}^p\), denoted by \(\varvec{Z}\sim \mathsf {SN}_p(\varvec{\xi },\varvec{\Omega },\varvec{\gamma })\), if its probability density function is

$$\begin{aligned} f(\varvec{z}) = 2\phi _p(\varvec{z};\varvec{\xi },\varvec{\Omega })\,\Phi \{\varvec{\gamma }^\top (\varvec{z}-\varvec{\xi })\}, \qquad \varvec{z} \in {\mathbb {R}}^p, \end{aligned}$$
(B.1)

where

$$\begin{aligned} \phi _p(\varvec{z};\varvec{\xi },\varvec{\Omega }) = (2\pi )^{-p/2}\vert \varvec{\Omega }\vert ^{-1/2}\exp (-\delta _{\mathsf{skew}}^2/2), \end{aligned}$$

is the probability density function of the p-variate \(\mathsf {N}_p(\varvec{\xi },\varvec{\Omega })\) distribution, \(\Phi (\cdot )\) is the univariate \(\mathsf {N}(0,1)\) cumulative distribution function and \(\delta _{\mathsf{skew}}^2 =(\varvec{z}-\varvec{\xi })^\top \varvec{\Omega }^{-1} (\varvec{z}-\varvec{\xi }) \sim \chi ^2(p)\). The vector of means and the covariance matrix of \(\varvec{Z}\) are given, respectively, by

$$\begin{aligned} \varvec{\mu }_{\mathsf{SN}} = \varvec{\xi } + \sqrt{\frac{2}{\pi }}\varvec{\delta }, \qquad \text {and} \qquad \varvec{\Sigma }_{\mathsf{SN}} = \varvec{\Omega } - \frac{2}{\pi }\varvec{\delta }\varvec{\delta }^\top , \end{aligned}$$

where \(\varvec{\delta } = \varvec{\Omega \gamma }/\sqrt{1 + \tau ^2}\) and \(\tau ^2=\varvec{\gamma }^\top \varvec{\Omega \gamma }\).
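The mean and covariance formulas for the skew-normal distribution can be checked by simulation. The sketch below draws from the density (B.1) with a standard sign-flip construction: take \(\varvec{X}\sim \mathsf {N}_p(\varvec{0},\varvec{\Omega })\) and an independent \(U\sim \mathsf {N}(0,1)\), and return \(\varvec{\xi }+\varvec{X}\) when \(U < \varvec{\gamma }^\top \varvec{X}\) and \(\varvec{\xi }-\varvec{X}\) otherwise. The sampler, function names and parameter values are illustrative and are not taken from the paper.

```python
import numpy as np

def sn_moments(xi, Omega, gamma):
    """Mean vector and covariance matrix of SN_p(xi, Omega, gamma)."""
    tau2 = gamma @ Omega @ gamma
    delta = Omega @ gamma / np.sqrt(1.0 + tau2)
    mean = xi + np.sqrt(2.0 / np.pi) * delta
    cov = Omega - (2.0 / np.pi) * np.outer(delta, delta)
    return mean, cov

def sample_sn(xi, Omega, gamma, n, rng):
    """Draw n vectors with density 2 phi_p(z; xi, Omega) Phi(gamma^T (z - xi))."""
    x = rng.multivariate_normal(np.zeros(len(xi)), Omega, size=n)
    u = rng.standard_normal(n)
    sign = np.where(u < x @ gamma, 1.0, -1.0)  # flip X when U >= gamma^T X
    return xi + sign[:, None] * x

rng = np.random.default_rng(1)
xi = np.array([1.0, -2.0])
Omega = np.array([[1.0, 0.3], [0.3, 2.0]])
gamma = np.array([2.0, -1.0])

z = sample_sn(xi, Omega, gamma, 200_000, rng)
mean, cov = sn_moments(xi, Omega, gamma)
sample_mean = z.mean(axis=0)
sample_cov = np.cov(z, rowvar=False)
print(mean, sample_mean)
print(cov, sample_cov, sep="\n")
```

The sample moments should agree with the closed forms up to Monte Carlo error.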

We say that a random vector \(\varvec{Z}\in {\mathbb {R}}^p\) has a skew-t distribution with location vector \(\varvec{\xi }\in {\mathbb {R}}^p\), dispersion matrix \(\varvec{\Omega }\in {\mathbb {R}}^{p \times p}\), shape/skewness parameter \(\varvec{\gamma }\in {\mathbb {R}}^p\) and \(\nu > 0\) degrees of freedom, denoted by \(\varvec{Z} \sim \mathsf {St}_p(\varvec{\xi },\varvec{\Omega },\varvec{\gamma },\nu )\), if its probability density function is given by

$$\begin{aligned} f(\varvec{z}) = 2t_p(\varvec{z};\varvec{\xi },\varvec{\Omega },\nu )\,T \left( \sqrt{\frac{\nu + p}{\nu + \delta _{\mathsf{skew}}^2}} \, \varvec{\gamma }^\top (\varvec{z} - \varvec{\xi }); \nu + p\right) , \end{aligned}$$
(B.2)

where

$$\begin{aligned} t_p(\varvec{z};\varvec{\xi },\varvec{\Omega },\nu ) = \frac{\Gamma \left( \frac{\nu + p}{2}\right) }{\Gamma \left( \frac{\nu }{2}\right) (\nu \pi )^{p/2}} \vert \varvec{\Omega }\vert ^{-1/2}\left( 1 + \frac{1}{\nu } \delta _{\mathsf{skew}}^2\right) ^{-(\nu +p)/2},\qquad \varvec{z}\in {\mathbb {R}}^p, \end{aligned}$$

is the probability density function of the p-variate \(t_p(\varvec{\xi },\varvec{\Omega },\nu )\) distribution, \(\delta _{\mathsf{skew}}^2 = (\varvec{z} - \varvec{\xi })^\top \varvec{\Omega }^{-1}(\varvec{z} - \varvec{\xi })\) with \(\delta _{\mathsf{skew}}^2/p \sim F(p,\nu )\), and \(T(x;\nu + p)\) is the \(T_1(0,1,\nu +p)\) cumulative distribution function (see, for instance, Azzalini and Genton 2008; Arellano-Valle et al. 2012, for details).

If \(\varvec{Z}\sim \mathsf {St}_p(\varvec{\xi },\varvec{\Omega },\varvec{\gamma },\nu )\), then the vector of means and the covariance matrix of \(\varvec{Z}\) are given by

$$\begin{aligned} \varvec{\mu }_{\mathsf{St}}&= \varvec{\xi } + \alpha (\nu )\varvec{\delta }, \quad \nu> 1 \\ \varvec{\Sigma }_{\mathsf{St}}&= \frac{\nu }{\nu -2}\varvec{\Omega } - \{\alpha (\nu )\}^2 \varvec{\delta }\varvec{\delta }^\top , \quad \nu > 2, \end{aligned}$$

where \(\alpha (\nu ) =\{\Gamma ((\nu -1)/2)/\Gamma (\nu /2)\} \sqrt{\nu /\pi }\). Note that \(\alpha (\nu )\rightarrow \sqrt{2/\pi }\) as \(\nu \rightarrow \infty\), and we obtain the results for the skew-normal distribution given above.
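The convergence \(\alpha (\nu )\rightarrow \sqrt{2/\pi }\) and the resulting agreement between the skew-t and skew-normal moments can be illustrated numerically. This is a small sketch that evaluates \(\alpha (\nu )\) on the log scale to avoid overflow of the gamma function; the function names and parameter values are illustrative.

```python
import numpy as np
from scipy.special import gammaln

def alpha(nu):
    """alpha(nu) = Gamma((nu - 1)/2) / Gamma(nu/2) * sqrt(nu / pi), for nu > 1."""
    return np.exp(gammaln((nu - 1) / 2) - gammaln(nu / 2)) * np.sqrt(nu / np.pi)

def st_moments(xi, Omega, gamma, nu):
    """Mean and covariance of St_p(xi, Omega, gamma, nu); requires nu > 2."""
    if nu <= 2:
        raise ValueError("the covariance matrix requires nu > 2")
    tau2 = gamma @ Omega @ gamma
    delta = Omega @ gamma / np.sqrt(1.0 + tau2)
    a = alpha(nu)
    mean = xi + a * delta
    cov = nu / (nu - 2) * Omega - a**2 * np.outer(delta, delta)
    return mean, cov

xi = np.array([0.0, 1.0])
Omega = np.array([[1.0, 0.3], [0.3, 2.0]])
gamma = np.array([2.0, -1.0])

for nu in (3.0, 10.0, 100.0, 1e6):
    print(nu, alpha(nu), np.sqrt(2 / np.pi))

mean_large_nu, cov_large_nu = st_moments(xi, Omega, gamma, 1e6)
print(mean_large_nu)
print(cov_large_nu)
```

For small \(\nu\) the factor \(\alpha (\nu )\) is noticeably larger than \(\sqrt{2/\pi }\), so heavy tails inflate both the shift in the mean and the rank-one correction to the covariance.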

Arellano-Valle et al. (2012) show that the Shannon entropy of the skew-normal and skew-t distributions has an explicit form, as stated in the following lemma.

Lemma 4

If \(\varvec{X}\sim \mathsf {SN}_p(\varvec{\xi },\varvec{\Omega },\varvec{\gamma })\) and \(\varvec{Y}\sim \mathsf {St}_p(\varvec{\xi }, \varvec{\Omega },\varvec{\gamma },\nu )\), then their Shannon entropies are given by

  (i) \(H(\varvec{X}) = \frac{1}{2}\log \vert \varvec{\Omega }\vert +\frac{p}{2}(1+ \log 2\pi ) - {{\,\mathrm{E}\,}}[\log \{2\Phi (\tau W)\}]\),

  (ii) \(H(\varvec{Y}) = \frac{1}{2}\log \vert \varvec{\Omega }\vert -\log \Gamma \left( \frac{\nu + p}{2}\right) + \log \Gamma \left( \frac{\nu }{2}\right) + \frac{p}{2}\log (\nu \pi ) +\frac{\nu + p}{2}\left\{ \psi \left( \frac{\nu + p}{2}\right) -\psi \left( \frac{\nu }{2}\right) \right\} - {{\,\mathrm{E}\,}}[\log \{2T(\tau W^*; \nu +p)\}]\),

with \(W \sim \mathsf {SN}(0,1,\tau )\), \(\tau ^2 =\varvec{\gamma }^\top \varvec{\Omega \gamma }\); \(W^* = \sqrt{\nu + p} \,W_{\mathsf{St}}/\sqrt{\nu + p-1 + W_{\mathsf{St}}^2}\) where \(W_{\mathsf{St}} \sim \mathsf {St}(0,1,\tau ,\nu +p-1)\).

Lemma 5

Let \(\varvec{Z}\sim \mathsf {N}_p(\varvec{\mu },\varvec{\Sigma })\). If \(\varvec{X}\sim \mathsf {SN}_p(\varvec{\xi }, \varvec{\Omega },\varvec{\gamma })\) and \(\varvec{Y}\sim \mathsf {St}_p(\varvec{\xi },\varvec{\Omega },\varvec{\gamma },\nu )\), then the negentropies of \(\varvec{X}\) and \(\varvec{Y}\) are given, respectively, by

  (i) \(H_N(\varvec{X}) = \frac{1}{2}\log \vert \varvec{\Sigma }\vert -\frac{1}{2}\log \vert \varvec{\Omega }\vert +{{\,\mathrm{E}\,}}[\log \{2\Phi (\tau W)\}]\), and

  (ii) \(H_N(\varvec{Y}) = \frac{1}{2}\log \vert \varvec{\Sigma }\vert +\frac{p}{2}(1+\log 2\pi ) - \frac{1}{2}\log \vert \varvec{\Omega }\vert +\log \Gamma \left( \frac{\nu + p}{2}\right) -\log \Gamma \left( \frac{\nu }{2}\right) - \frac{p}{2}\log (\nu \pi )- \frac{\nu + p}{2} \left\{ \psi \left( \frac{\nu + p}{2}\right) - \psi \left( \frac{\nu }{2}\right) \right\} + {{\,\mathrm{E}\,}}[\log \{2T(\tau W^*; \nu +p)\}]\).
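The skew-normal negentropy in part (i) only requires a one-dimensional expectation, which can be evaluated by quadrature. The sketch below assumes that \(\varvec{\Sigma }\) is taken to be the covariance matrix \(\varvec{\Sigma }_{\mathsf{SN}}\) of \(\varvec{X}\), so that the negentropy compares \(\varvec{X}\) with the normal distribution having the same first two moments; that reading, the function names and the parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

def expected_log_2Phi(tau):
    """E[log{2 Phi(tau W)}] for W ~ SN(0, 1, tau), density 2 phi(w) Phi(tau w)."""

    def integrand(w):
        return 2 * norm.pdf(w) * norm.cdf(tau * w) * (np.log(2) + norm.logcdf(tau * w))

    return integrate.quad(integrand, -np.inf, np.inf)[0]

def sn_negentropy(Omega, gamma):
    """H_N(X) from part (i), with Sigma set to the covariance of X (an assumption)."""
    tau2 = gamma @ Omega @ gamma
    delta = Omega @ gamma / np.sqrt(1.0 + tau2)
    Sigma = Omega - (2.0 / np.pi) * np.outer(delta, delta)
    logdet_Sigma = np.linalg.slogdet(Sigma)[1]
    logdet_Omega = np.linalg.slogdet(Omega)[1]
    return 0.5 * (logdet_Sigma - logdet_Omega) + expected_log_2Phi(np.sqrt(tau2))

Omega = np.eye(2)
print(sn_negentropy(Omega, np.zeros(2)))           # symmetric case
print(sn_negentropy(Omega, np.array([2.0, 0.0])))  # skewed case
```

With \(\varvec{\gamma } = \varvec{0}\) both terms vanish, and the negentropy grows as the skewness parameter moves away from zero.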

Mardia (1970) introduced one of the most commonly used measures of multivariate skewness for an arbitrary p-dimensional random vector \(\varvec{Z}\) with mean vector \(\varvec{\mu }\) and covariance matrix \(\varvec{\Sigma }\). Mardia’s skewness coefficient is defined as

$$\begin{aligned} \beta _{1,p} = {{\,\mathrm{E}\,}}[\{(\varvec{Z}_1 - \varvec{\mu })^\top \varvec{\Sigma }^{-1}(\varvec{Z}_2 - \varvec{\mu })\}^3], \end{aligned}$$

where \(\varvec{Z}_1\) and \(\varvec{Z}_2\) are independent copies of \(\varvec{Z}\). It can be expressed as \(\beta _{1,p} ={{\,\mathrm{tr}\,}}\{\varvec{S}^\top (\varvec{Y})\varvec{S}(\varvec{Y})\}\), where \(\varvec{S}(\varvec{Y}) ={{\,\mathrm{E}\,}}(\varvec{Y}\otimes \varvec{Y}^\top \otimes \varvec{Y})\), with \(\varvec{Y} =\varvec{\Sigma }^{-1/2} (\varvec{Z} - \varvec{\mu })\), and \(\otimes\) denotes the Kronecker product. The following lemmas, extracted from Kim and Mallick (2003), allow us to obtain explicit formulas for \(\varvec{S}(\varvec{Y})\). In particular, Fig. 11 illustrates the interaction between the degrees of freedom and the skewness parameter in the coefficient \(\beta _{1,p}\) proposed by Mardia (1970). In fact, as \(\nu\) grows, it has less impact on \(\beta _{1,p}\).

Lemma 6

If \(\varvec{Y} \sim \mathsf {SN}_p(\varvec{0},\varvec{\Omega },\varvec{\gamma })\), then

$$\begin{aligned} \varvec{S}(\varvec{Y}) = \sqrt{2/\pi }[\varvec{\delta }\otimes \varvec{\Omega } +{{\,\mathrm{vec}\,}}(\varvec{\Omega })\varvec{\delta }^\top + (\varvec{I}_p\otimes \varvec{\delta })\varvec{\Omega } - \varvec{\delta }\otimes \varvec{\delta \delta }^\top ], \end{aligned}$$

where \(\varvec{\delta } = \varvec{\Omega \gamma }/\sqrt{1 + \tau ^2}\). In addition, if \(\varvec{\gamma } = \varvec{0}\), that is \(\varvec{Y} \sim \mathsf {N}_p(\varvec{0},\varvec{\Omega })\), then \(\beta _{1,p} = 0\).
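The Kronecker-product expression in Lemma 6 can be assembled directly and compared with simulated third moments. The sketch below assumes the column-major \({{\,\mathrm{vec}\,}}\) convention and reuses a sign-flip sampler for the skew-normal distribution; the sampler, function names and parameter values are illustrative and are not taken from the paper.

```python
import numpy as np

def third_moment_matrix_sn(Omega, gamma):
    """S(Y) = E(Y kron Y^T kron Y) for Y ~ SN_p(0, Omega, gamma), a p^2 x p matrix."""
    p = Omega.shape[0]
    tau2 = gamma @ Omega @ gamma
    delta = (Omega @ gamma / np.sqrt(1.0 + tau2)).reshape(p, 1)
    vec_Omega = Omega.reshape(-1, 1, order="F")
    bracket = (
        np.kron(delta, Omega)
        + vec_Omega @ delta.T
        + np.kron(np.eye(p), delta) @ Omega
        - np.kron(delta, delta @ delta.T)
    )
    return np.sqrt(2.0 / np.pi) * bracket

def sample_sn_centered(Omega, gamma, n, rng):
    """Draw from SN_p(0, Omega, gamma) by flipping the sign of a normal vector."""
    x = rng.multivariate_normal(np.zeros(Omega.shape[0]), Omega, size=n)
    u = rng.standard_normal(n)
    return np.where(u < x @ gamma, 1.0, -1.0)[:, None] * x

rng = np.random.default_rng(2)
Omega = np.array([[1.0, 0.3], [0.3, 1.0]])
gamma = np.array([1.5, -0.5])
p = Omega.shape[0]

y = sample_sn_centered(Omega, gamma, 200_000, rng)
# T[i, j, k] estimates E(Y_i Y_j Y_k); row (i, k) and column j of S hold that moment.
T = np.einsum("ni,nj,nk->ijk", y, y, y) / y.shape[0]
S_hat = T.transpose(0, 2, 1).reshape(p * p, p)
S = third_moment_matrix_sn(Omega, gamma)
print(S)
print(S_hat)
```

Note that Lemma 6 gives raw third moments of \(\varvec{Y}\); to obtain \(\beta _{1,p}\) the vector must first be centred and standardized as in the definition of Mardia's coefficient.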

Lemma 7

If \(\varvec{Y} \sim \mathsf {St}_p(\varvec{0},\varvec{\Omega },\varvec{\gamma },\nu )\), then

$$\begin{aligned} \varvec{S}(\varvec{Y}) = \frac{\alpha (\nu )\nu }{\nu - 3}[\varvec{\delta }\otimes \varvec{\Omega } + {{\,\mathrm{vec}\,}}(\varvec{\Omega })\varvec{\delta }^\top + (\varvec{I}_p\otimes \varvec{\delta })\varvec{\Omega } - \varvec{\delta }\otimes \varvec{\delta \delta }^\top ], \end{aligned}$$

where \(\alpha (\nu ) = \sqrt{\nu /\pi }\Gamma ((\nu -1)/2)/\Gamma (\nu /2)\) and \(\varvec{\delta } = \varvec{\Omega \gamma }/\sqrt{1 + \tau ^2}\). In addition, if \(\varvec{\gamma } = \varvec{0}\), that is \(\varvec{Y} \sim t_p(\varvec{0},\varvec{\Omega },\nu )\), then \(\beta _{1,p} = 0\).
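For \(p = 1\), \(\xi = 0\) and \(\Omega = 1\), the moments above give Mardia's coefficient in closed form, since in the univariate case \(\beta _{1,1}\) is the squared standardized third central moment. The sketch below combines the mean and variance of the skew-t distribution with the third raw moment from Lemma 7, which in this case reduces to \(\alpha (\nu )\nu (3\delta - \delta ^3)/(\nu -3)\); the function names and the grid of values are illustrative.

```python
import numpy as np
from scipy.special import gammaln

def alpha(nu):
    return np.exp(gammaln((nu - 1) / 2) - gammaln(nu / 2)) * np.sqrt(nu / np.pi)

def mardia_skewness_univariate_st(gamma, nu):
    """beta_{1,1} for the univariate skew-t St(0, 1, gamma, nu); requires nu > 3."""
    if nu <= 3:
        raise ValueError("the third moment requires nu > 3")
    delta = gamma / np.sqrt(1.0 + gamma**2)
    a = alpha(nu)
    m1 = a * delta                                   # mean
    m2 = nu / (nu - 2)                               # second raw moment
    m3 = a * nu / (nu - 3) * (3 * delta - delta**3)  # third raw moment (Lemma 7, p = 1)
    var = m2 - m1**2
    kappa3 = m3 - 3 * m1 * m2 + 2 * m1**3            # third central moment
    return kappa3**2 / var**3

def mardia_skewness_univariate_sn(gamma):
    """Squared skewness of the univariate skew-normal SN(0, 1, gamma)."""
    delta = gamma / np.sqrt(1.0 + gamma**2)
    b = np.sqrt(2.0 / np.pi) * delta
    return ((4 - np.pi) / 2) ** 2 * b**6 / (1 - b**2) ** 3

for nu in (4.0, 6.0, 10.0, 30.0, 1e6):
    print(nu, [mardia_skewness_univariate_st(g, nu) for g in (0.5, 1.0, 3.0)])
print("skew-normal", [mardia_skewness_univariate_sn(g) for g in (0.5, 1.0, 3.0)])
```

Evaluating the function over a grid of \(\gamma\) and \(\nu\) values gives the kind of surface summarized in Fig. 11, with the skew-normal values recovered for large \(\nu\).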

Fig. 11 Plot of Mardia’s skewness coefficient \(\beta _{1,p}\) for the univariate skew t-distribution with \(\xi = 0\) and \(\Omega = 1\)


Cite this article

Osorio, F., Galea, M., Henríquez, C. et al. Addressing non-normality in multivariate analysis using the t-distribution. AStA Adv Stat Anal 107, 785–813 (2023). https://doi.org/10.1007/s10182-022-00468-2
