The sequence kernel association test for multicategorical outcomes
Genetic Epidemiology (IF 2.1), Pub Date: 2023-04-19, DOI: 10.1002/gepi.22527
Zhiwen Jiang 1 , Haoyu Zhang 2 , Thomas U Ahearn 2 , Montserrat Garcia-Closas 2 , Nilanjan Chatterjee 3 , Hongtu Zhu 1 , Xiang Zhan 4 , Ni Zhao 3

Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinnings of disease subtypes. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient for handling such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, SKAT-MC, the sequence kernel association test for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS) and identified that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER− breast cancer subtypes. We also investigated educational attainment using UK Biobank data ($N = 127{,}127$) with SKAT-MC and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package, SKAT-MC, can be accessed at https://github.com/Zhiwen-Owen-Jiang/SKATMC.

Updated: 2023-04-19