Abstract
Most human traits are influenced by the interplay between genetic and environmental factors. Many statistical methods have been proposed to screen for gene-environment interaction (GxE) in the post-genome-wide association study era. However, most existing methods assume that genetic and environmental factors interact linearly in their effect on phenotypic variation, which reduces statistical power when GxE is nonlinear. In this paper, we present a flexible statistical procedure that detects GxE regardless of whether the underlying relationship is linear. By modeling the joint genetic and GxE effects as a varying-coefficient function of the environmental factor, the proposed model can capture dynamic trajectories of GxE. We employ a likelihood ratio test with a fast Monte Carlo algorithm for hypothesis testing. Simulations were conducted to evaluate the validity and power of the proposed model in various settings. A real data analysis was performed to illustrate its power, particularly in the case of nonlinear GxE.
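To make the idea of a varying-coefficient joint test concrete, the following is a minimal sketch and not the GEVACO implementation. It assumes a quantitative trait, an unpenalized linear truncated-power spline with hand-picked knots for the coefficient function, and a simple permutation of genotypes as a stand-in for the fast Monte Carlo algorithm mentioned above. All function names, the simulated effect sizes, and the knot positions are hypothetical choices for illustration.

```python
import numpy as np

def spline_basis(e, knots):
    """Linear truncated-power basis: [1, e, (e-k1)+, ..., (e-kK)+]."""
    cols = [np.ones_like(e), e] + [np.clip(e - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def lr_stat(X_null, X_alt, y):
    """Gaussian likelihood ratio statistic n*log(RSS0/RSS1) for nested models."""
    return len(y) * np.log(rss(X_null, y) / rss(X_alt, y))

def designs(e, g, knots):
    """Null (no genetic effect), linear GxE, and varying-coefficient designs."""
    X_null = np.column_stack([np.ones_like(e), e])
    X_lin = np.column_stack([X_null, g, g * e])
    # Genetic effect beta(E) expanded in the spline basis: G * B(E).
    X_vc = np.column_stack([X_null, g[:, None] * spline_basis(e, knots)])
    return X_null, X_lin, X_vc

def perm_pvalue(e, g, y, knots, n_perm=200, seed=1):
    """Permutation p-value for the varying-coefficient joint test.

    Assumption: genotype is independent of the environment under the null,
    so shuffling genotypes approximates the null distribution.
    """
    rng = np.random.default_rng(seed)
    X_null, _, X_vc = designs(e, g, knots)
    observed = lr_stat(X_null, X_vc, y)
    exceed = 0
    for _ in range(n_perm):
        _, _, X_perm = designs(e, rng.permutation(g), knots)
        exceed += lr_stat(X_null, X_perm, y) >= observed
    return (1 + exceed) / (n_perm + 1)

# Simulated data (hypothetical): the genetic effect follows a centered
# quadratic in E, so it has no average main effect and no linear GxE trend.
rng = np.random.default_rng(0)
n = 2000
e = rng.uniform(-2.0, 2.0, n)                # environmental factor
g = rng.binomial(2, 0.3, n).astype(float)    # genotype coded 0/1/2
y = 0.5 * e + 0.3 * (e**2 - 4.0 / 3.0) * g + rng.normal(size=n)

knots = [-1.0, 0.0, 1.0]
X_null, X_lin, X_vc = designs(e, g, knots)
lr_lin = lr_stat(X_null, X_lin, y)   # joint test assuming linear GxE
lr_vc = lr_stat(X_null, X_vc, y)     # joint test with varying coefficient
p_vc = perm_pvalue(e, g, y, knots)
print(f"linear LR = {lr_lin:.2f}, varying-coefficient LR = {lr_vc:.2f}, p = {p_vc:.4f}")
```

Because the linear GxE design is nested in the spline design, the varying-coefficient statistic can never be smaller than the linear one. In this constructed example the gap is large because the simulated effect is invisible to a linear interaction term. A penalized spline with a mixed-model representation, as the abstract's wording suggests, would need a different null distribution than the permutation used here.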
Data availability
Data sharing is not applicable to this article as no new data were created in this study. The program code of the proposed method is available from the R package GEVACO.
Acknowledgements
The authors thank Dr. Helen Hobbs for granting permission to use the DHS data, and acknowledge the Texas Advanced Computing Center (https://www.tacc.utexas.edu) at The University of Texas at Austin for providing high performance computing resources that have contributed to the research results reported within this paper.
Funding
This work is supported by the National Institute of Environmental Health Sciences grant R03ES034138 to C.X. and Z.Z. Z.Z. is also supported in part by the National Institute on Minority Health and Health Disparities grant 5U54MD013376-8281 and the National Institute on Aging grant U19AG078109. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Ethics declarations
Conflict of interest
Zhengyang Zhou, Hung‑Chih Ku, Sydney E. Manning, Ming Zhang and Chao Xing declare that they have no conflict of interest.
Ethical approval
This study is purely methodological research and does not involve any data collection or recruitment of human subjects.
Informed consent
Not applicable.
Additional information
Handling Editor: Stacey S. Cherny, PhD.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhou, Z., Ku, HC., Manning, S.E. et al. A Varying Coefficient Model to Jointly Test Genetic and Gene–Environment Interaction Effects. Behav Genet 53, 374–382 (2023). https://doi.org/10.1007/s10519-022-10131-w
DOI: https://doi.org/10.1007/s10519-022-10131-w