Abstract
The phenotypic color of seeds is a complex agronomic trait with economic and biological significance. Its genetic control and molecular regulation mechanisms have been extensively studied. Here, we used a multi-omics strategy to explore color formation in soybean seeds at a big-data scale. We identified 13 large quantitative trait loci (QTL) for color by bulked segregant analysis (BSA) in recombinant inbred lines. A genome-wide association study (GWAS) of colors and their decomposed attributes in 763 germplasm accessions revealed associated SNP sites falling within five major QTL, suggesting inherited regulation of color during natural selection. Further transcriptomic analysis before and after color accumulation revealed 182 differentially expressed genes (DEGs) in the five QTL, including the known genes CHS, MYB, and F3′H involved in pigment accumulation. More DEGs with consistent upregulation or downregulation were identified as shared regulatory genes for the formation of two or more colors, while some DEGs were specific to a single color. For example, five upregulated DEGs in QTL qSC-3 belonged to flavonoid biosynthesis, which is responsible for black and brown seeds. One DEG (Glyma.08G085400), identified in purple seeds only, encodes gibberellin 2-beta-dioxygenase, which acts in the metabolism of colorful terpenoids. The candidate genes are involved in flavonoid biosynthesis, transcription factor regulation, gibberellin and terpenoid metabolism, photosynthesis, ascorbate and aldarate metabolism, and lipid metabolism. Seven differentially expressed transcription factors, including a known MYB, were also speculated to regulate color formation. These findings expand the set of QTL and candidate genes for color formation and could guide the breeding of better cultivars with designed colors.
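The step of checking whether GWAS-associated SNPs fall within mapped QTL can be illustrated with a minimal sketch. It is not the authors' pipeline: the interval names, chromosomes, and coordinates below are hypothetical placeholders, and the function `snps_in_qtl` is introduced here only for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A QTL interval on one chromosome (inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


def snps_in_qtl(snps, qtls):
    """Return a dict mapping each QTL name to the SNPs located inside it.

    snps: iterable of (chrom, position) tuples
    qtls: iterable of Interval
    """
    hits = {q.name: [] for q in qtls}
    for chrom, pos in snps:
        for q in qtls:
            if q.chrom == chrom and q.start <= pos <= q.end:
                hits[q.name].append((chrom, pos))
    return hits


# Hypothetical placeholder data, not values from the study.
qtls = [
    Interval("qtl_A", "Chr01", 1000, 5000),
    Interval("qtl_B", "Chr08", 200, 900),
]
snps = [("Chr01", 1500), ("Chr01", 6000), ("Chr08", 900), ("Chr03", 300)]

hits = snps_in_qtl(snps, qtls)
for name, found in hits.items():
    print(name, found)
```

The same interval test could be applied to gene coordinates to restrict a list of DEGs to those located inside the QTL of interest.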
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 32072016).
Author information
Contributions
All authors contributed to the study’s conception and design. Material preparation, data collection, and analysis were performed by Jian Song, Qingyuan Guo, Ruixin Xu, and Xuewen Wang. The first draft of the manuscript was written by Jian Song and Xuewen Wang. Caiyu Wu helped with the experimental treatment, and Yinghui Li conceived the mutant and provided data support. Li-Juan Qiu supervised the project and reviewed the manuscript. Jun Wang revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Key message
The genetic basis of soybean seed coat color was dissected by BSA mapping of a segregating population and GWAS of 763 germplasm accessions.
About this article
Cite this article
Song, J., Xu, R., Guo, Q. et al. An omics strategy increasingly improves the discovery of genetic loci and genes for seed-coat color formation in soybean. Mol Breeding 43, 71 (2023). https://doi.org/10.1007/s11032-023-01414-z
DOI: https://doi.org/10.1007/s11032-023-01414-z