Multi-omics assists genomic prediction of maize yield with machine learning approaches

Molecular Breeding

Abstract

High-throughput technologies have improved in recent years, producing large multi-dimensional plant omics datasets, and big-data-driven yield prediction has received increasing attention. Machine learning offers promising computational and analytical tools for interpreting the biological meaning of large crop datasets. In this study, we used multi-omics datasets from 156 maize recombinant inbred lines, comprising 2496 single nucleotide polymorphisms (SNPs), 46 image traits (i-traits) from 16 developmental stages obtained through an automatic phenotyping platform, and 133 primary metabolites. In benchmark tests across different types of prediction models, some machine learning methods, such as Partial Least Squares (PLS), Random Forest (RF), and Gaussian process with radial basis function kernel (GaussprRadial), predicted maize yield better, although the preferred method differed slightly among i-trait, genomic, and metabolic data. We found that better yield prediction may stem from the methods' differing capabilities in ranking and filtering data features, which we found to be linked to biological meaning such as photosynthesis-related or kernel development-related regulation. Finally, by integrating multiple omics data with the RF machine learning approach, we further improved the prediction accuracy of grain yield from 0.32 to 0.43. Our research provides new ideas for applying plant omics data and artificial intelligence approaches to facilitate crop genetic improvement.
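The integration and benchmarking steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: that the omics blocks are combined by simple column-wise concatenation per line, that prediction accuracy is the Pearson correlation between observed and held-out predicted yield, and that evaluation uses k-fold cross-validation. The model is a deliberately simple k-nearest-neighbour stand-in, not RF, PLS, or GaussprRadial, and all function names (`concat_omics`, `pearson_r`, `knn_predict`, `cv_accuracy`) are hypothetical.

```python
import math
import random
from statistics import mean


def concat_omics(*blocks):
    """Join per-line feature rows from several omics blocks (assumed fusion scheme).

    Each block is a list of feature rows, one row per line, in the same line order.
    """
    n = len(blocks[0])
    if any(len(b) != n for b in blocks):
        raise ValueError("all blocks must describe the same lines")
    return [[x for block in blocks for x in block[i]] for i in range(n)]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def knn_predict(train_X, train_y, test_X, k=5):
    """Stand-in model (NOT Random Forest): mean yield of the k nearest training lines."""
    preds = []
    for row in test_X:
        order = sorted(range(len(train_X)), key=lambda j: math.dist(row, train_X[j]))
        preds.append(mean(train_y[j] for j in order[:k]))
    return preds


def cv_accuracy(X, y, predict=knn_predict, folds=5, seed=0):
    """k-fold cross-validated accuracy: r between observed and held-out predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    held_out = [None] * len(X)
    for f in range(folds):
        test = idx[f::folds]          # every folds-th shuffled line forms one fold
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        preds = predict([X[i] for i in train], [y[i] for i in train],
                        [X[i] for i in test])
        for i, p in zip(test, preds):
            held_out[i] = p
    return pearson_r(y, held_out)
```

Under these assumptions, a comparison like the one reported would call `cv_accuracy` once on a single block (for example the SNP rows) and once on `concat_omics(snps, i_traits, metabolites)`, with the same `seed` so both runs share identical folds. Any model with the same `predict(train_X, train_y, test_X)` signature can be swapped in for the stand-in.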

Data availability

Data available within the article or its supplementary materials.

Acknowledgements

The authors are grateful to Dr. Xuehai Zhang for his help with the preparation of data in this paper.

Funding

This research was supported by the National Natural Science Foundation of China (32101773, 32122066) and the China Postdoctoral Science Foundation (2022M711280).

Author information

Contributions

Jingyun Luo and Yingjie Xiao designed and supervised this study. Chengxiu Wu performed most of the data analysis. Jingyun Luo prepared the manuscript. All authors critically read, revised, and approved the manuscript.

Corresponding authors

Correspondence to Jingyun Luo or Yingjie Xiao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, C., Luo, J. & Xiao, Y. Multi-omics assists genomic prediction of maize yield with machine learning approaches. Mol Breeding 44, 14 (2024). https://doi.org/10.1007/s11032-024-01454-z

  • DOI: https://doi.org/10.1007/s11032-024-01454-z
