Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease
medRxiv - Genetic and Genomic Medicine, Pub Date: 2024-03-26, DOI: 10.1101/2024.03.22.24304565
Tanner D Jensen, Bohan Ni, Chloe Reuter, John E Gorzynski, Sarah Fazal, Devon E Bonner, Rachel Allison Ungar, Pagé C Goddard, Archana Natarajan Raja, Euan A Ashley, Jonathan A Bernstein, Stephan Zuchner, Michael D Greicius, Stephen B Montgomery, Michael C Schatz, Matthew T Wheeler, Alexis Battle

Rare structural variants (SVs) – insertions, deletions, and complex rearrangements – can cause Mendelian disease, yet they remain difficult to accurately detect and interpret. We sequenced and analyzed Oxford Nanopore long-read genomes of 68 individuals from the Undiagnosed Disease Network (UDN) in whom short-read sequencing had not previously identified a diagnostic mutation. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected on average 716 rare (minor allele frequency, MAF < 0.01) SV alleles per genome from long reads, a 2.4x increase over short reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression in blood or fibroblasts from the same individuals, and found that rare SVs overlapping enhancers were enriched (log odds ratio, LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably, these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations and significantly outperforms baseline models that do not incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. Notably, these included compound heterozygous deletions in FAM177A1 shared by two siblings, which were likely causal for a rare neurodevelopmental disorder. Our observations demonstrate the promise of integrating long-read sequencing with gene expression to improve the prioritization of functional SVs and TREs in rare disease patients.
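
For readers unfamiliar with the enrichment statistic quoted above, the following is a minimal sketch of how a log odds ratio can be computed from a 2x2 contingency table. It is not the authors' pipeline: the function name `log_odds_ratio`, the use of the natural logarithm, and the counts in the usage lines are all assumptions made for illustration, and the abstract does not say how its LOR was estimated.

```python
import math


def log_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Natural-log odds ratio for a 2x2 table (hypothetical helper).

    a: outlier genes with a rare SV nearby
    b: outlier genes without a rare SV nearby
    c: non-outlier genes with a rare SV nearby
    d: non-outlier genes without a rare SV nearby
    """
    if min(a, b, c, d) <= 0:
        # The plain estimator is undefined when any cell is zero.
        raise ValueError("all four cell counts must be positive")
    return math.log((a * d) / (b * c))


# Illustrative counts only; these are NOT data from the study.
lor = log_odds_ratio(30, 970, 200, 9800)
print(f"log odds ratio = {lor:.3f}, odds ratio = {math.exp(lor):.3f}")
```

Under the natural-log assumption, a LOR of 0.46 would correspond to an odds ratio of about 1.58; a value of 0 would mean no enrichment.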

Updated: 2024-03-26