Differences in molecular sampling and data processing explain variation among single-cell and single-nucleus RNA-seq experiments
Genome Research (IF 7), Pub Date: 2024-02-01, DOI: 10.1101/gr.278253.123
John T Chamberlin, Younghee Lee, Gabor Marth, Aaron R Quinlan

A mechanistic understanding of the biological and technical factors that impact transcript measurements is essential to designing and analyzing single-cell and single-nucleus RNA sequencing experiments. Nuclei contain the same pre-mRNA population as cells, but they contain a small subset of the mRNAs. Nonetheless, early studies argued that single-nucleus analysis yielded results comparable to cellular samples if pre-mRNA measurements were included. However, typical workflows do not distinguish between pre-mRNA and mRNA when estimating gene expression, and variation in their relative abundances across cell types has received limited attention. These gaps are especially important given that incorporating pre-mRNA has become commonplace for both assays, despite known gene length bias in pre-mRNA capture. Here, we reanalyze public data sets from mouse and human to describe the mechanisms and contrasting effects of mRNA and pre-mRNA sampling on gene expression and marker gene selection in single-cell and single-nucleus RNA-seq. We show that pre-mRNA levels vary considerably among cell types, which mediates the degree of gene length bias and limits the generalizability of a recently published normalization method intended to correct for this bias. As an alternative, we repurpose an existing post hoc gene length–based correction method from conventional RNA-seq gene set enrichment analysis. Finally, we show that inclusion of pre-mRNA in bioinformatic processing can impart a larger effect than assay choice itself, which is pivotal to the effective reuse of existing data. These analyses advance our understanding of the sources of variation in single-cell and single-nucleus RNA-seq experiments and provide useful guidance for future studies.

Updated: 2024-02-01