Published by De Gruyter, July 11, 2022

Modeling sign concordance of quantile regression residuals with multiple outcomes

  • Silvia Columbu, Paolo Frumento and Matteo Bottai

Abstract

Quantile regression permits describing how quantiles of a scalar response variable depend on a set of predictors. Because a unique definition of multivariate quantiles is lacking, extending quantile regression to multivariate responses is somewhat complicated. In this paper, we describe a simple approach based on a two-step procedure: in the first step, quantile regression is applied to each response separately; in the second step, the joint distribution of the signs of the residuals is modeled through multinomial regression. The described approach does not require a multidimensional definition of quantiles, and can be used to capture important features of a multivariate response and assess the effects of covariates on the correlation structure. We apply the proposed method to analyze two different datasets.
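The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single categorical covariate and saturated models, so that step one reduces to group-wise sample quantiles and step two (the multinomial fit) reduces to group-wise proportions of sign patterns. All function names, the sign convention (1 when the outcome exceeds its fitted quantile), and the simulated data are hypothetical choices made for illustration. With continuous covariates one would instead use dedicated quantile regression and multinomial regression software.

```python
import math
import random
from collections import Counter, defaultdict


def sample_quantile(values, tau):
    """Smallest order statistic whose empirical CDF is >= tau.

    This value minimizes the check loss, so it matches an
    intercept-only quantile regression fit.
    """
    ordered = sorted(values)
    k = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[k]


def residual_signs(data, tau):
    """Step 1 (saturated case): fit each outcome's quantile within each group.

    data is a list of (x, ys): x is a group label, ys a tuple of outcomes.
    Returns (x, pattern) pairs, where pattern[j] is 1 if outcome j lies
    above its fitted quantile (assumed sign convention) and 0 otherwise.
    """
    by_group = defaultdict(list)
    for x, ys in data:
        by_group[x].append(ys)
    fitted = {
        x: [sample_quantile(col, tau) for col in zip(*ys_list)]
        for x, ys_list in by_group.items()
    }
    return [(x, tuple(int(y > q) for y, q in zip(ys, fitted[x]))) for x, ys in data]


def pattern_probabilities(signs):
    """Step 2 (saturated case): multinomial MLE = within-group proportions."""
    counts = defaultdict(Counter)
    for x, pattern in signs:
        counts[x][pattern] += 1
    return {
        x: {p: c / sum(cnt.values()) for p, c in cnt.items()}
        for x, cnt in counts.items()
    }


def concordance(probs):
    """Probability that all residuals of a unit share the same sign."""
    return sum(pr for pattern, pr in probs.items() if len(set(pattern)) == 1)


# Simulated example (hypothetical data): the correlation between the two
# outcomes is 0 in group 0 and 0.8 in group 1.
random.seed(0)
data = []
for x, rho in [(0, 0.0), (1, 0.8)]:
    for _ in range(500):
        z = random.gauss(0, 1)
        y2 = rho * z + math.sqrt(1 - rho**2) * random.gauss(0, 1)
        data.append((x, (z, y2)))

probs = pattern_probabilities(residual_signs(data, tau=0.5))
for x in sorted(probs):
    print("group", x, "sign concordance:", round(concordance(probs[x]), 3))
```

In this sketch, a higher sign concordance in one group suggests that the covariate is associated with a stronger dependence between the outcomes at the chosen quantile, which is the kind of effect on the correlation structure that the abstract refers to.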


Corresponding author: Silvia Columbu, University of Cagliari, Cagliari, Italy

Funding source: Regione Autonoma della Sardegna

Award Identifier / Grant number: Operational Programme P.O.R. Sardegna F.S.E. (European Social Fund 2014-2020 - Axis III Education and Formation, Objective 10.5, Line of Activity 10.5.12)

Acknowledgments

We thank Dr. Giovanni Viegi for allowing use of a subset of the data from the Po river delta epidemiological study.

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: Silvia Columbu gratefully acknowledges Regione Autonoma della Sardegna for the financial support provided under the Operational Programme P.O.R. Sardegna F.S.E. (European Social Fund 2014-2020 - Axis III Education and Formation, Objective 10.5, Line of Activity 10.5.12).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.


Received: 2020-12-17
Accepted: 2022-06-15
Published Online: 2022-07-11

© 2022 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 7.5.2024 from https://www.degruyter.com/document/doi/10.1515/ijb-2022-0020/html