Abstract
In genome-wide association studies (GWAS), researchers often deal with dichotomous traits, non-normally distributed traits, or a mixture of discrete and continuous traits. However, most current region-based methods rely on multivariate linear mixed models (mvLMMs) and assume a multivariate normal distribution for the phenotypes of interest, so they are not applicable to disease traits or other non-normally distributed traits. There is therefore a need for unified and flexible methods to study the association between a set of (possibly rare) genetic variants and non-normal multivariate phenotypes. Copulas are multivariate distribution functions with uniform margins on the [0, 1] interval, and they provide suitable models for handling non-normality of errors in multivariate association studies. We propose a novel, unified and flexible copula-based multivariate association test (CBMAT) for discovering association between a genetic region and a bivariate continuous, binary or mixed phenotype. We also derive a data-driven analytic p-value procedure for the proposed region-based score-type test. Through simulation studies, we demonstrate that CBMAT has well-controlled type I error rates and higher power to detect associations than existing methods for discrete and non-normally distributed traits. Finally, we apply CBMAT to detect the association between two genes located on chromosome 11 and several lipid levels measured on 1477 subjects from the ALSPAC study.
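To make the copula idea concrete, the following minimal sketch draws pairs with uniform margins from a bivariate Gaussian copula and then maps them to a mixed phenotype (one continuous, one binary). It is an illustration only and not the CBMAT procedure: the choice of a Gaussian copula, the exponential margin, the prevalence of 0.3, and the function names `sample_gaussian_copula` and `to_mixed_phenotype` are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u1, u2) from a bivariate Gaussian copula.

    Each coordinate is uniform on [0, 1]; rho controls the dependence.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The probability integral transform gives uniform margins.
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def to_mixed_phenotype(pairs, rate=1.0, prevalence=0.3):
    """Map uniform margins to an exponential trait and a binary trait.

    The margins are illustrative assumptions, not those of the paper.
    """
    phenotypes = []
    for u1, u2 in pairs:
        u1 = min(u1, 1.0 - 1e-12)  # guard against log(0)
        y_cont = -math.log(1.0 - u1) / rate  # inverse exponential CDF
        y_bin = 1 if u2 > 1.0 - prevalence else 0  # Bernoulli margin
        phenotypes.append((y_cont, y_bin))
    return phenotypes


pairs = sample_gaussian_copula(5000, rho=0.6, seed=1)
phenos = to_mixed_phenotype(pairs)
```

Because the dependence is carried entirely by the copula, the two margins can be swapped for other distributions without changing how the pairs are coupled, which is the property that makes copulas attractive for mixed discrete-continuous traits.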
Funding source: Wellcome Trust
Award Identifier / Grant number: WT091310
Funding source: Fonds de recherche Québec-Santé
Award Identifier / Grant number: 267074
Funding source: Natural Sciences and Engineering Research Council of Canada
- Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: This research was supported by the Fonds de recherche Québec-Santé [267074 to K.O.]; and the Natural Sciences and Engineering Research Council of Canada [to K.O.]. This study makes use of data generated by the UK10K Consortium, derived from samples from ALSPAC, under data access agreement ID2250. A full list of the investigators who contributed to the generation of the data is available from www.UK10K.org. Funding for UK10K was provided by the Wellcome Trust award [WT091310].
- Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2022-0010).
© 2022 Walter de Gruyter GmbH, Berlin/Boston