inMTSCCA: An Integrated Multi-task Sparse Canonical Correlation Analysis for Multi-omic Brain Imaging Genetics
Genomics, Proteomics & Bioinformatics ( IF 9.5 ) Pub Date : 2023-07-11 , DOI: 10.1016/j.gpb.2023.03.005
Lei Du 1, Jin Zhang 1, Ying Zhao 1, Muheng Shang 1, Lei Guo 1, Junwei Han 1

Identifying genetic risk factors for Alzheimer’s disease (AD) is an important research topic. To date, different endophenotypes, such as imaging-derived endophenotypes and proteomic expression-derived endophenotypes, have shown great value in uncovering risk genes compared to case–control studies. Biologically, a co-varying pattern of different omics-derived endophenotypes could result from a shared genetic basis. However, existing methods mainly focus on the effect of endophenotypes alone; the effect of cross-endophenotype (CEP) associations remains largely unexploited. In this study, we used both the endophenotypes and their CEP associations in multi-omic data to identify genetic risk factors, and proposed two integrated multi-task sparse canonical correlation analysis (inMTSCCA) methods: pairwise endophenotype correlation-guided MTSCCA (pcMTSCCA) and high-order endophenotype correlation-guided MTSCCA (hocMTSCCA). pcMTSCCA employs pairwise correlations between magnetic resonance imaging (MRI)-derived, plasma-derived, and cerebrospinal fluid (CSF)-derived endophenotypes as an additional penalty. hocMTSCCA uses high-order correlations among these multi-omic data for regularization. To identify genetic risk factors at the individual and group levels, as well as altered endophenotypic markers, we introduced sparsity-inducing penalties for both models. We compared pcMTSCCA and hocMTSCCA with three related methods on both simulated datasets and a real dataset consisting of neuroimaging data, proteomic analytes, and genetic data. The results showed that our methods obtained better or comparable canonical correlation coefficients (CCCs) and better feature subsets than the benchmarks. Most importantly, the identified genetic loci and heterogeneous endophenotypic markers showed high relevance. Therefore, jointly using multi-omic endophenotypes and their CEP associations is a promising way to reveal genetic risk factors.
The source code and manual of inMTSCCA are available at https://ngdc.cncb.ac.cn/biocode/tools/BT007330.
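
For readers unfamiliar with sparse canonical correlation analysis, the following is a minimal sketch of the general idea only. It is not the inMTSCCA, pcMTSCCA, or hocMTSCCA algorithm: it handles a single genotype-endophenotype pair, uses only an individual-level L1 penalty, and omits the multi-task structure, the group-level penalty, and the correlation-guided regularizers described in the abstract. The alternating soft-thresholding scheme is a generic substitute, and the function names, penalty values, and toy data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry toward zero by lam (proximal step of the L1 penalty)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def normalize(v):
    """Scale v to unit L2 norm; leave an all-zero vector unchanged."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Single-component sparse CCA by alternating soft-thresholded updates.

    Hypothetical helper, not part of the inMTSCCA package.
    X: (n, p) matrix, e.g. genetic markers; Y: (n, q) matrix, e.g. endophenotypes.
    Returns sparse weight vectors u, v and the correlation of X @ u with Y @ v.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]  # (p, q) cross-correlation matrix
    # Initialise v with the leading right singular vector of C.
    _, _, vt = np.linalg.svd(C, full_matrices=False)
    v = vt[0]
    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u = normalize(soft_threshold(C @ v, lam_u))
        v = normalize(soft_threshold(C.T @ u, lam_v))
    ccc = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, ccc


def make_toy_data(n=200, p=30, q=20, seed=0):
    """Toy data (an assumption): one latent factor drives 3 X columns and 2 Y columns."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :3] = z[:, None] + 0.5 * rng.standard_normal((n, 3))
    Y[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))
    return X, Y


X, Y = make_toy_data()
u, v, ccc = sparse_cca(X, Y)
print("selected X features:", np.flatnonzero(u))
print("selected Y features:", np.flatnonzero(v))
print("canonical correlation:", round(float(ccc), 3))
```

In this toy setting the nonzero entries of `u` and `v` play the role of the selected genetic loci and endophenotypic markers, and `ccc` corresponds to the canonical correlation coefficient used as an evaluation metric in the abstract. The authors' actual implementation is the one distributed at the URL above.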




Updated: 2023-07-11