Identification of biomarker genes from multiple studies for abiotic stress in maize through machine learning
Journal of Biosciences (IF 2.9), Pub Date: 2023-12-15, DOI: 10.1007/s12038-023-00392-w
Leyla Nazari, Zahra Zinati, Paolo Bagnaresi

Abiotic stresses are major limiting factors for maize growth, so the mechanisms underlying the maize response to abiotic stress are of great interest. Toward this end, we integrated a feature selection method into a meta-analysis of microarray gene expression data. Following extraction of raw data, normalization, and batch effect removal, the data were merged into one expression profile. Differentially expressed genes (DEGs) between control and abiotic stress conditions were passed to the feature selection algorithm to find the minimum set of features for high-performance classification. Feature selection was performed using a correlation-based feature selection (CFS) algorithm, considering features with a coefficient of 0.7 to 1. Different algorithms from the Bayes, Functions, Lazy, Meta, Rules, and Trees groups were then tested to classify the samples and find the best-performing classifier in each group. Moreover, the biological pathways and promoter motifs of the selected genes were analyzed. Using all features (DEGs), the best and overall classification performance were 98.86% (Multilayer Perceptron) and 81.25%, respectively. Classification based on feature selection resulted in an average accuracy of 94.69% with 33 features and 93.56% with 12 features. Subsequently, gene ontology and promoter analysis were performed for the 12 selected biomarker genes, of which five were downregulated and seven were upregulated. ABRE, unnamed-1, G-box, and G-Box are motifs related to genes involved in several abiotic stress responses and are located upstream of at least nine probes in our study. This study revealed key genes associated with tolerance to abiotic stress in maize.
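
To make the feature selection step concrete, here is a minimal sketch of correlation-based feature selection, assuming the standard CFS merit score (k * r_cf / sqrt(k + k(k-1) * r_ff)) with Pearson correlation and a greedy forward search. The abstract does not say which implementation, correlation measure, or search strategy the authors used, and the 0.7 to 1 coefficient range it mentions is not modelled here. The function names cfs_merit and cfs_forward_select are hypothetical.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of the feature columns listed in `subset`.

    Rewards features that correlate with the class label and penalises
    features that correlate with each other. Assumes no constant columns
    (their correlation would be undefined).
    """
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all off-diagonal pairs.
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(X, y, max_features=None):
    """Greedy forward search: add the feature that most improves the merit,
    stopping when no remaining feature improves it."""
    selected, best_merit = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and (max_features is None or len(selected) < max_features):
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best_merit:
            break
        selected.append(j)
        remaining.remove(j)
        best_merit = merit
    return selected, best_merit
```

In this sketch X would be a samples-by-genes expression matrix restricted to the DEGs, and y a 0/1 vector marking control versus stress samples. The selected column indices could then be passed to any classifier to compare accuracy against the full DEG set.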

Updated: 2023-12-17