Published by De Gruyter October 24, 2022

A copula-based set-variant association test for bivariate continuous, binary or mixed phenotypes

  • Julien St-Pierre and Karim Oualkacha

Abstract

In genome-wide association studies (GWAS), researchers often deal with dichotomous and non-normally distributed traits, or with a mixture of discrete and continuous traits. However, most current region-based methods rely on multivariate linear mixed models (mvLMMs) and assume a multivariate normal distribution for the phenotypes of interest. Hence, these methods are not applicable to disease or non-normally distributed traits, and there is a need for unified and flexible methods to study the association between a set of (possibly rare) genetic variants and non-normal multivariate phenotypes. Copulas are multivariate distribution functions with uniform margins on the [0, 1] interval, and they provide suitable models to deal with non-normality of errors in multivariate association studies. We propose a novel unified and flexible copula-based multivariate association test (CBMAT) for discovering association between a genetic region and a bivariate continuous, binary or mixed phenotype. We also derive a data-driven analytic p-value procedure for the proposed region-based score-type test. Through simulation studies, we demonstrate that CBMAT has well-controlled type I error rates and higher power to detect associations than existing methods for discrete and non-normally distributed traits. Finally, we apply CBMAT to detect the association between two genes located on chromosome 11 and several lipid levels measured on 1477 subjects from the ALSPAC study.
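To make the copula idea concrete, the following is a minimal sketch, not the CBMAT procedure itself, of how a copula with uniform margins can tie together a skewed continuous trait and a binary trait. It assumes a Gaussian copula, an exponential margin for the continuous trait and a Bernoulli margin for the binary trait. The function names, the correlation `rho`, the `rate` and the `prevalence` are all hypothetical choices made for illustration and do not come from the article.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its CDF


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u, v) from a bivariate Gaussian copula with correlation rho.

    Each coordinate is uniform on [0, 1]; the dependence comes from rho.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal correlated with z1 at level rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The probability integral transform gives uniform margins.
        pairs.append((_STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)))
    return pairs


def to_mixed_phenotype(pairs, rate=1.0, prevalence=0.3):
    """Map copula draws to (continuous, binary) traits via inverse marginal CDFs.

    Assumed margins (illustrative only): exponential(rate) and Bernoulli(prevalence).
    """
    traits = []
    for u, v in pairs:
        u = min(u, 1.0 - 1e-12)  # guard against log(0) in the inverse CDF
        y_cont = -math.log(1.0 - u) / rate          # skewed, non-normal trait
        y_bin = 1 if v > 1.0 - prevalence else 0    # disease-like binary trait
        traits.append((y_cont, y_bin))
    return traits


pairs = sample_gaussian_copula(n=5000, rho=0.6, seed=1)
traits = to_mixed_phenotype(pairs)
```

Because the dependence structure lives entirely in the copula, the two margins can be swapped for other distributions without touching `sample_gaussian_copula`, which is the flexibility the abstract refers to for continuous, binary or mixed phenotypes.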


Corresponding author: Julien St-Pierre, Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada

Funding source: Wellcome Trust

Award Identifier / Grant number: WT091310

Funding source: Fonds de recherche Québec-Santé

Award Identifier / Grant number: 267074

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This research was supported by the Fonds de recherche Québec-Santé [267074 to K.O.]; and the Natural Sciences and Engineering Research Council of Canada [to K.O.]. This study makes use of data generated by the UK10K Consortium, derived from samples from ALSPAC, under data access agreement ID2250. A full list of the investigators who contributed to the generation of the data is available from www.UK10K.org. Funding for UK10K was provided by the Wellcome Trust award [WT091310].

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.



Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2022-0010).


Received: 2022-01-20
Revised: 2022-05-26
Accepted: 2022-08-23
Published Online: 2022-10-24

© 2022 Walter de Gruyter GmbH, Berlin/Boston
