Endophytic fungi are microorganisms that colonize plant tissues without causing damage [1]. Plant endophytes are reservoirs of novel bioactive metabolites, and hundreds of endophytes from many plants have the potential to synthesize diverse bioactive secondary metabolites. The Natural Products Atlas database lists 20 006 fungal natural products with diverse chemical structures and biological activities [2], which may be applied directly or indirectly as therapeutic agents for various diseases. However, less than 10% of plant endophytic fungi have been studied for their metabolites [3]. Without a complete picture of endophytic fungal biosynthesis, it is difficult to obtain natural products of interest rapidly by conventional chemical extraction and separation. A rapid overview of the metabolite types and biosynthetic capacity of endophytic fungi would therefore speed up the discovery of target natural products.

Danshen is the dried root and rhizome of Salvia miltiorrhiza Bge. and is commonly used to treat cardiovascular and cerebrovascular disorders [4]. The roots of S. miltiorrhiza harbor abundant endophytic fungi, some of which can synthesize active components such as salvianolic acid C [5], tanshinone I [6], and tanshinone IIA [6]. Although S. miltiorrhiza hosts hundreds of endophytes, their metabolic types and biosynthetic potential have not been studied systematically, which limits the development and utilization of these endophytes. Therefore, untargeted metabolomics and metagenomic binning were used in this study to analyze the metabolites and functional genes of the endophytic fungi of S. miltiorrhiza (24 genera, 47 species, and 166 strains) and to systematically assess their metabolic types and biosynthetic potential. At the genus level, the capacity of the endophytic fungi to synthesize phenolic acids, alkaloids, terpenoids, polyketides and other compound classes was characterized, providing a basis for rapidly finding specific active substances in the endophytic fungi of S. miltiorrhiza and for their further development and utilization.

MATERIALS AND METHODS

Untargeted Metabolomics

Strains and culture conditions. Endophytic fungi (24 genera, 47 species, and 166 strains) used in the current study were previously isolated from healthy S. miltiorrhiza roots (Table S1) and stored at 4°C on potato dextrose agar (PDA) medium containing (g/L): potato, 200 and glucose, 20. Each endophyte was inoculated separately into a 250 mL Erlenmeyer flask containing 100 mL of medium of the following composition (g/L): maltobiose, 20.0; sodium glutamate, 10.0; KH2PO4, 5.0; MgSO4·7H2O, 0.3; glucose, 10.0; yeast extract, 3.0; corn steep liquor, 1.0; mannitol, 20.0; CaCO3, 20.0; and 5-azacytidine, 0.01. Fermentation was carried out for 10 days at 30°C and 180 rpm.

Secondary metabolite extraction and detection. After fermentation, the culture broth was removed by centrifugation at 4000 g for 5 min. The mycelial biomass was collected, washed 2–3 times with deionized water, transferred to a pre-chilled centrifuge tube and centrifuged at 5000 g and 4°C for 5 min. The resulting mycelial pellets were stored at -80°C for further analysis.

Mycelial pellets were ground individually in liquid nitrogen and mixed in equal amounts according to fungal genus (Table S1). Mixed samples (100 mg) were resuspended in pre-chilled 80% methanol containing 0.1% formic acid by thorough vortexing. The samples were incubated on ice for 5 min and then centrifuged at 15 000 g and 4°C for 20 min. A portion of the supernatant was diluted with deionized water to a final methanol concentration of 53%. The samples were transferred to a fresh Eppendorf tube and centrifuged under the same conditions. Finally, the supernatant was injected into the UHPLC-MS/MS system for analysis.

UHPLC-MS/MS analysis. UHPLC-MS/MS analyses were performed using a Vanquish UHPLC system (Thermo Fisher Scientific, USA) coupled with an Orbitrap Q Exactive™ HF mass spectrometer (Thermo Fisher Scientific, USA) at Novogene Co., Ltd. (China). Samples were injected onto a Hypersil Gold column (100 × 2.1 mm, 1.9 μm; Thermo Fisher Scientific, USA) and separated with a 17-min linear gradient at a flow rate of 0.2 mL/min. Eluent A (0.1% formic acid in water) and eluent B (methanol) were used for the positive polarity mode. The eluents for the negative polarity mode were eluent A (5 mM ammonium acetate, pH 9.0) and eluent B (methanol). The solvent gradient was set as follows: 2% B, 1.5 min; 2–100% B, 12.0 min; 100% B, 14.0 min; 100–2% B, 14.1 min; 2% B, 17 min. The Q Exactive™ HF mass spectrometer was operated in positive/negative polarity mode with a spray voltage of 3.2 kV, a capillary temperature of 320°C, a sheath gas flow rate of 40 arb and an auxiliary gas flow rate of 10 arb.
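The gradient programme can be read as a piecewise-linear function of run time. The following Python sketch is illustrative only: it assumes that each listed time is a breakpoint at which the stated composition is reached and that the composition changes linearly between breakpoints, which the text does not state explicitly. The names `GRADIENT` and `percent_b` are hypothetical.

```python
# Breakpoints (time in min, % eluent B) read from the gradient description.
# Linear interpolation between breakpoints is an assumption.
GRADIENT = [
    (0.0, 2.0),
    (1.5, 2.0),
    (12.0, 100.0),
    (14.0, 100.0),
    (14.1, 2.0),
    (17.0, 2.0),
]


def percent_b(t_min, gradient=GRADIENT):
    """Return the percentage of eluent B at time t_min (minutes)."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time outside the 17-min run")
    # Find the segment containing t_min and interpolate linearly within it.
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

Under these assumptions, `percent_b(6.75)` falls halfway along the 1.5 to 12.0 min ramp.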

Data processing and metabolite identification. The raw data files generated by UHPLC-MS were processed using Compound Discoverer 3.1 (CD3.1, Thermo Fisher Scientific, USA) to perform peak alignment, peak picking, and quantitation for each metabolite. The main parameters were set as follows: retention time tolerance, 0.2 min; actual mass tolerance, 5 ppm; signal intensity tolerance, 30%; signal/noise ratio, 3; and minimum intensity, 100 000. Peak intensities were then normalized to the total spectral intensity. The normalized data were used to predict molecular formulas based on adduct ions, molecular ion peaks and fragment ions. Peaks were then matched against the mzCloud (https://www.mzcloud.org/), mzVault and MassList databases to obtain accurate qualitative and relative quantitative results. Statistical analyses were performed using R (version 3.4.3) and Python (version 2.7.6) on CentOS (release 6.6); when data were not normally distributed, normal transformation was attempted using the area normalization method.
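Two of the quantities named above, the 5 ppm mass tolerance and normalization to total spectral intensity, can be illustrated with a short Python 3 sketch. This is not the Compound Discoverer implementation; the function names are hypothetical, and applying the minimum-intensity cut-off before normalization is an assumption.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def normalize_to_total(intensities, min_intensity=100_000):
    """Drop peaks below min_intensity, then scale the rest to sum to 1.

    intensities maps a peak identifier to its raw intensity.
    Filtering before normalizing is an assumption of this sketch.
    """
    kept = {k: v for k, v in intensities.items() if v >= min_intensity}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {k: v / total for k, v in kept.items()}
```

For example, three peaks with raw intensities 300 000, 100 000 and 50 000 would leave two peaks after the cut-off, with relative intensities 0.75 and 0.25.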

Analysis of molecular network data. Metabolic pathway analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.genome.jp/kegg/pathway.html) [7], Human Metabolome Database (HMDB, https://hmdb.ca/metabolites) and LIPID Metabolites and Pathways Strategy (LIPID MAPS, http://www.lipidmaps.org/).

Metagenomics

DNA extraction. The stored strains were inoculated onto PDA medium and incubated at 28°C for 5 days. Mycelia were placed in sterilized collection tubes, liquid nitrogen was added, and the frozen mycelia were ground into a powder. Genomic DNA was extracted from the culturable fungi using the OMEGA Fungal DNA Kit (D3390-02, OMEGA BioTek, GA) according to the manufacturer’s protocol. After extraction, DNA quality was checked by 1% agarose gel electrophoresis. DNA concentration was measured with a micro-ultraviolet spectrophotometer (NanoPhotometer P-Class P330, Implen GmbH, Germany). After each DNA sample was adjusted to 50 ng/μL, the samples were pooled into a total DNA sample.
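Adjusting each extract to 50 ng/μL is a standard C1V1 = C2V2 dilution. A minimal Python sketch of that arithmetic follows; the function name and the example concentrations are hypothetical and not taken from the study.

```python
def diluent_volume_ul(stock_ng_per_ul, stock_volume_ul, target_ng_per_ul=50.0):
    """Volume of diluent (in microlitres) to add so the sample reaches the target.

    Uses C1 * V1 = C2 * V2; a sample already below the target concentration
    cannot be adjusted by dilution.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    final_volume_ul = stock_ng_per_ul * stock_volume_ul / target_ng_per_ul
    return final_volume_ul - stock_volume_ul
```

For instance, 10 μL of a hypothetical 200 ng/μL extract would need 30 μL of diluent to reach 50 ng/μL.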

Experimental procedures of metagenomic sequencing. The total DNA of the rhizosphere fungi isolated from S. miltiorrhiza was extracted for metagenomic sequencing. DNA degradation and potential contamination were monitored on 1% agarose gels. A total of 1 μg DNA per sample was used as input material for DNA sample preparation. The DNA sample was fragmented by sonication to 350 bp, and the DNA fragments were then end-polished, A-tailed, and ligated with the full-length adaptor for Illumina sequencing, followed by PCR amplification. Finally, PCR products were purified (AMPure XP system, Beckman, USA), and libraries were analyzed for size distribution on an Agilent 2100 Bioanalyzer and quantified using real-time PCR. The index-coded samples were clustered on a cBot Cluster Generation System according to the manufacturer’s instructions. After cluster generation, the library preparations were sequenced on an Illumina HiSeq platform (USA), and paired-end reads were generated.

Information analysis of metagenomics. The raw data obtained from the Illumina HiSeq sequencing platform were processed using Readfq (version 8, https://github.com/cjfields/readfq) to obtain clean data for subsequent analysis. Specifically, the following reads were removed during preprocessing: (a) reads containing low-quality bases (default quality threshold value <38) above a certain length (default length of 40 bp); (b) reads in which N bases reached a certain length (default length of 10 bp); (c) reads that overlapped with the adapter above a certain length (default length of 15 bp). Considering the possibility of contamination with host DNA, the clean data were searched against the S. miltiorrhiza genome database using Bowtie (version 2.2.4, http://bowtiebio.sourceforge.net/bowtie2/index.shtml). The parameters for filtering the host reads were as follows: --end-to-end, --sensitive, -I 200, -X 400.
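The three removal rules can be summarized as a single predicate. The Python sketch below is not Readfq code; it assumes that a read is removed as soon as a count reaches the stated default, and the function and parameter names are hypothetical.

```python
def passes_qc(seq, quals, adapter_overlap_bp,
              q_threshold=38, max_low_q_bp=40, max_n_bp=10, max_adapter_bp=15):
    """Return True if a read survives rules (a), (b) and (c).

    seq: read sequence; quals: per-base Phred scores;
    adapter_overlap_bp: length of the overlap with the adapter.
    Treating each limit as inclusive (reaching it removes the read)
    is an assumption of this sketch.
    """
    low_q_bases = sum(1 for q in quals if q < q_threshold)  # rule (a)
    n_bases = seq.upper().count("N")                        # rule (b)
    return (low_q_bases < max_low_q_bp
            and n_bases < max_n_bp
            and adapter_overlap_bp < max_adapter_bp)        # rule (c)
```

A read would be kept only if it passes all three checks; failing any one of them removes it.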

All reads not used in the previous step were combined across samples and assembled together (mixed assembly) with SOAPdenovo (version 2.04)/MEGAHIT (version 1.0.4-beta) using the same parameters as for the single-sample assembly. The assembled Scaffolds were then broken at N connections to obtain Scaftigs. Scaftigs shorter than 500 bp were removed before subsequent analysis.
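Breaking scaffolds at N connections and applying the 500 bp length cut-off can be sketched in a few lines of Python. This is an illustration of the step rather than the pipeline code actually used; the function name is hypothetical.

```python
import re


def scaffolds_to_scaftigs(scaffolds, min_len=500):
    """Split each scaffold at runs of N and keep pieces of at least min_len bp."""
    scaftigs = []
    for scaffold in scaffolds:
        # Any run of one or more N (gap) characters is treated as a break point.
        for piece in re.split(r"[Nn]+", scaffold):
            if len(piece) >= min_len:
                scaftigs.append(piece)
    return scaftigs
```

A scaffold made of a 600 bp block, a gap, a 100 bp block, a gap and a 500 bp block would yield two Scaftigs, since the 100 bp piece falls below the cut-off.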

ORFs were predicted for the Scaftigs (≥500 bp) from both the single and mixed assemblies using MetaGeneMark (version 2.10, http://topaz.gatech.edu/GeneMark/) with default parameters, and predicted ORFs shorter than 100 nt were discarded. The predicted ORFs were dereplicated with CD-HIT (version 4.5.8, http://www.bioinformatics.org/cd-hit ) to obtain the unique initial gene catalog (the genes here refer to the nucleotide sequences coded by unique and continuous genes) using the options -c 0.95, -G 0, -aS 0.9, -g 1, -d 0. The Clean Data of each sample were mapped to the initial gene catalog using Bowtie 2.2.4 with the parameters --end-to-end, --sensitive, -I 200, -X 400 to obtain the number of reads mapped to each gene in each sample. Genes supported by ≤2 reads in each sample were filtered out, and the remaining genes formed the gene catalog (Unigenes) used for subsequent analysis.
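The read-support criterion can be expressed as a small filter. The Python sketch below reads the ≤2-reads rule as a low-support exclusion filter, which is an assumption about the intended step; the function name and data layout are hypothetical.

```python
def filter_gene_catalog(read_counts, max_low_support=2):
    """Keep genes that have more than max_low_support reads in at least one sample.

    read_counts maps gene id -> {sample id: number of mapped reads}.
    Genes whose count is <= max_low_support in every sample are dropped
    (assumed interpretation of the criterion).
    """
    return {
        gene: counts
        for gene, counts in read_counts.items()
        if any(n > max_low_support for n in counts.values())
    }
```

With this reading, a gene with 0 and 2 mapped reads in two samples would be dropped, while a gene with 3 reads in either sample would be retained.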

DIAMOND software (version 0.9.9, https://github.com/bbuchfink/diamond/) was used to search the Unigenes against the sequences of bacteria, fungi, archaea and viruses extracted from the NR database (version 2018-01-02, https://www.ncbi.nlm.nih.gov/) of NCBI with the parameter setting of blastp, -e 1e-5. Because each sequence may have multiple alignments, we retained for each sequence the alignments with an e value ≤ the smallest e value × 10 and applied the LCA algorithm, as used for systematic classification in the MEGAN software, to determine the species annotation of the sequence.
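The hit selection and lowest-common-ancestor (LCA) assignment can be sketched as follows. This Python example shows a naive LCA over root-to-leaf lineages and is not MEGAN's implementation, which exposes further parameters; the function names, hit tuple layout and example hits are hypothetical.

```python
def keep_near_best_hits(hits, factor=10.0):
    """Keep hits whose e value is <= (smallest e value * factor).

    hits is a list of (subject_id, evalue, lineage) tuples, where lineage
    is a tuple of taxon names ordered from root to leaf.
    """
    if not hits:
        return []
    best = min(evalue for _, evalue, _ in hits)
    return [hit for hit in hits if hit[1] <= best * factor]


def lowest_common_ancestor(lineages):
    """Return the longest shared prefix of the given root-to-leaf lineages."""
    lca = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            lca.append(ranks[0])
        else:
            break
    return tuple(lca)


def annotate(hits):
    """Taxonomic assignment of one query from its retained hits."""
    kept = keep_near_best_hits(hits)
    return lowest_common_ancestor([lineage for _, _, lineage in kept])
```

If the retained hits of a query pointed to two different genera of the same phylum, the query would be assigned at the phylum level rather than to either genus.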

We used DIAMOND software (version 0.9.9) to align the Unigenes against the functional databases with the parameter setting of blastp, -e 1e-5. The functional databases included the KEGG database (version 2018-01-01, http://www.kegg.jp/kegg/), the eggNOG database (version 4.5, http://eggnogdb.embl.de/#/app/home), and the CAZy database (version 201801, http://www.cazy.org/). For each sequence, the best BLAST hit was used for subsequent analysis. We then retained only the results for the endophytic fungi that we were able to isolate (47 species) (Table S2).
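Selecting the best hit per query can be sketched as below. The text does not say how "best" is ranked; this Python example assumes ranking by bit score, and the function name, row layout and example identifiers are hypothetical.

```python
def best_hits(rows):
    """Return the top-scoring hit for each query.

    rows is an iterable of (query_id, subject_id, bitscore) tuples.
    Ranking by bit score is an assumption; ties keep the first hit seen.
    """
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best
```

The resulting dictionary holds one functional annotation per Unigene, which could then be restricted to the isolated taxa.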

Binning analysis and functional annotation. MetaWRAP (https://github.com/bxlab/metaWRAP) was used for binning analysis. The reads were co-assembled within each community type with the metaWRAP-Assembly module (default parameters). Contigs shorter than 500 bp were discarded. The contigs from the co-assemblies were binned with the metaWRAP-Binning module (--metabat2 --maxbin2 --concoct parameters). The completeness and contamination of the bins were assessed using CheckM version 1.1.2 with the options lineage_wf --reduced_tree. The bins from the metaWRAP-Binning module were run through the Classify_bins module (default parameters), which makes initial taxonomy predictions for individual scaffolds. Bins were functionally annotated with the metaWRAP-Annotate_bins module (default parameters), which uses PROKKA to annotate each bin. EggNOG-mapper relies on the eggNOG database of orthologous groups (OGs), covering thousands of bacterial, archaeal, and eukaryotic organisms. KAAS (https://www.genome.jp/tools/kaas/) and eggNOG-mapper were used to analyze the functions of the microorganisms. We predicted secondary metabolite biosynthetic gene clusters (BGCs) with antiSMASH (https://antismash.secondarymetabolites.org/) and then used Clinker (https://github.com/gamcil/clinker) to align similar BGCs.

Heatmaps, bubble plots and barplot analysis were performed on the Tutools platform (https://www.cloudtutu.com), a free online data analysis website. The flower plot was constructed with EVenn (http://www.ehbio.com/test/venn/) [8].

RESULTS

Untargeted Metabolomics

The metabolomic profiles of the different genera are shown in Fig. S1. A total of 3016 metabolites were identified (Table S3), of which 697 could be mapped to KEGG pathways (Fig. 1a). The results showed that 316 primary metabolites were produced, including 191 amino acids and their derivatives. There were 335 secondary metabolites, including 32 terpenoids, 41 phenolic acids, 58 polyketides, and 54 alkaloids (Fig. 1b). The distribution of these compounds varied widely among genera.

Fig. 1.
Secondary metabolites produced by endophytic fungi of Salvia miltiorrhiza. (a) KEGG pathway annotation statistics; (b) secondary metabolites classification.

Special amino acids and sugars. A total of 191 amino acids were identified in the metabolome. Among the 17 common amino acids, 8 essential amino acids were distributed in all genera. γ-Aminobutyric acid (GABA) was mainly found in Aspergillus, Beauveria, Candida, Fusarium, Penicillium, Trichoderma, and Talaromyces. Ergothioneine was detected in most fungi, whereas theanine was detected only in Beauveria and Purpureocillium.

In total, 29 sugars were identified. Dulcitol was distributed in Aspergillus, Beauveria, Cladosporium, Penicillium, Candida, Fusarium, and Purpureocillium. 1-Kestose was found in Gongronella, Rhodotorula, Trichoderma, Hypocreales, Moesziomyces, Chaetomium, Thielaviopsis, Exophiala, Hormonema, Pyrenochaeta, Acrostalagmus, and Candida.

Alkaloids. There were 54 alkaloids identified in the metabolome of S. miltiorrhiza endophytes, including pyrrolizidine alkaloids, tropane alkaloids, quinoline alkaloids, piperidine alkaloids, pyridine alkaloids, isoquinoline alkaloids, indole alkaloids and other 17 classes of alkaloids. Alkaloids were abundant in Candida, Cladosporium, and Penicillium, with 17, 17, and 16 alkaloids, respectively (Figs. 2a and 2b). The distribution of alkaloids in endophytic fungi of S. miltiorrhiza was uneven (Figs. 3a and 3b). Thirteen indole alkaloids were distributed in 21 genera, including Penicillium, Candida, Cladosporium, Exophiala, Gongronella, Moesziomyces, and Rhodotorula, among which vincristine was found in Cladosporium. In addition, vincristine’s precursor, vindoline, was mainly found in Acremonium and Acrostalagmus. Another indole alkaloid, harmaline, was mainly distributed in Exophiala, Beauveria, Candida, and Cladosporium. Seven isoquinoline alkaloids were distributed in 16 genera, including Candida, Fusarium, and Hormonema. Atropine, a tropane alkaloid, was mainly distributed in Chaetomium, Thielaviopsis, and Candida. Among the quinolizidine alkaloids, oxymatrine was found in Gongronella, and cytisine was found in Trichoderma. Camptothecin (CPT), a quinoline alkaloid, was found in Cladosporium, and its water-soluble derivative irinotecan was found in Beauveria and Talaromyces.

Fig. 2.
Secondary metabolites produced by different genera of S. miltiorrhiza endophytes. (a) Heatmap of secondary metabolites; (b) bubble plot of secondary metabolites; (c) bubble plot of related enzymes; (d) heatmap of related enzymes.

Fig. 3.
Different alkaloids (a, b), terpenoids (c, d) and phenolic acids (e, f) produced by different genera of endophyte fungi isolated from S. miltiorrhiza. Distribution (a, c, e) and the number (b, d, f) of alkaloids, terpenoids and phenolic acids, respectively, in endophytic fungi.

Terpenoids. There were 32 terpenoids detected, including monoterpenes, sesquiterpenes, diterpenes, triterpenes, steroids, and iridoid glycosides. Among them, diterpenoids were the most abundant, with ten compounds. The three genera Candida, Cladosporium, and Penicillium had the most detected terpenoids (Figs. 2a and 2b); the most sesquiterpenes were found in Aspergillus and Penicillium, the most diterpenes in Talaromyces, and the most triterpenes in Hypoxylon, Penicillium, and Purpureocillium. Isophorone, carvone, and ursolic acid were widely distributed in the endophytic fungi of S. miltiorrhiza (Figs. 3c and 3d). Cryptotanshinone, an active ingredient of the host plant S. miltiorrhiza, was found in Trichoderma, and salvinorin, the active ingredient of Salvia divinorum, a species of the same genus as the host, was found in Penicillium. Gibberellin A3 was found in Pyrenochaeta and Candida, gibberellin A4 in Rhodotorula and Talaromyces, and gibberellin A7 in Hypoxylon. Among other terpenoids, artemisinin was found in Moesziomyces, Aspergillus and Fusarium; andrographolide was found in Gongronella, Aspergillus, Candida, Penicillium, Purpureocillium, and Talaromyces, and its natural derivative 14-deoxy-11,12-didehydroandrographolide was found in Gongronella. Ginsenoside Rg3 was detected in Cladosporium.

Phenolic acids. There were 41 phenolic acid compounds detected. Penicillium, Aspergillus, Candida, and Cladosporium had the most phenolic acids, with 17 compounds detected in Penicillium (Figs. 2a and 2b). 2-Hydroxycinnamic acid, mycophenolic acid, 2,3,4-trihydroxybenzoic acid, 3-hydroxyanthranilic acid, 3-(4-hydroxy-3-methoxyphenyl)propanoic acid, 2,6-dihydroxybenzoic acid, and 3-hydroxybenzoic acid were widely distributed in the endophytes (Figs. 3e and 3f). Mycophenolic acid, an immunosuppressive agent, was found in all genera except Beauveria, Candida, Fusarium, and Purpureocillium. 3,5-Dihydroxybenzoic acid, 2-hydroxyphenylacetic acid, 3,5-dimethoxybenzoic acid, 4-hydroxy-3-methylbenzoic acid, caffeic acid, emamectin B1a, esculin, rosmarinic acid, and trans-clovamide had a narrow distribution (Figs. 3e and 3f). Among these, rosmarinic acid, an active ingredient of S. miltiorrhiza, was found in Fusarium, and the insecticide emamectin B1a was detected in Aspergillus.

Polyketides. Polyketides are a large group of complex natural products synthesized by polyketide synthases (PKS), including flavonoids, tetracyclines, macrolides, polyethers, etc. There were 58 polyketides detected, including 41 flavonoids, 4 mycotoxins, 3 tetracycline antibiotics, 3 macrolide antibiotics, and 1 polyether. The distribution of these compounds is described below for flavonoids, macrolides and polyethers.

S. miltiorrhiza endophytic fungi produced a variety of flavonoids, including isoflavones, flavonols, flavanones, flavan-3-ols, chalcones, etc. Hormonema and Candida had the most flavonoids, whereas Acrostalagmus, Rhodotorula and Hypoxylon had the fewest (Figs. 4a and 4b). There were 11 flavonols, mainly distributed in 15 genera such as Hormonema, Cladosporium, Penicillium, and Talaromyces, among which rutin was present in Talaromyces and kaempferol was found in Hormonema, Beauveria, and Talaromyces. There were 9 flavones, mainly distributed in Gongronella, Hormonema, Thielaviopsis, and Penicillium, of which icaritin was found in Gongronella and Hormonema. There were 8 isoflavones, mainly in Thielaviopsis, Hormonema, Aspergillus, and Candida, and 5 flavanones, mainly in Hormonema, Thielaviopsis, and Candida. Among other polyketides, resveratrol was found in Purpureocillium, and its natural analog, piceatannol, was found in Hormonema, Candida, and Talaromyces. The mycotoxin sterigmatocystin was detected in Candida, aflatoxin B1 and aflatoxin M1 were mainly detected in Aspergillus, and aflatoxin G1 was mainly found in Thielaviopsis and Hormonema.

Fig. 4.
Different polyketides produced by different genera of endophyte fungi isolated from S. miltiorrhiza. Distribution (a, c) and the number (b, d) of flavonoids and other polyketides (except flavonoids), respectively, in endophytic fungi.

Macrolide immunosuppressants are a class of natural compounds sharing a macrolide-like structure, and many clinically used antibacterial drugs are assembled and synthesized by PKS I. Among them, tylosin was found in Hypocreales, tacrolimus in Cladosporium, and oleandomycin in Gongronella, all of which are antibiotics. Polyethers are also assembled and synthesized by PKS I; among them, monensin was mainly distributed in Hypoxylon and Penicillium.

There were 6 polyketide antibiotics, including oxytetracycline, minocycline, tetracycline, tylosin, tacrolimus, and oleandomycin. Tylosin, tacrolimus, and oleandomycin belong to macrolides, mainly present in Hypoxylon, Gongronella, Cladosporium, Acremonium, and Pyrenochaeta. Oxytetracycline, minocycline, and tetracycline are tetracycline antibiotics. Oxytetracycline was found in Hormonema, Acremonium, Aspergillus, Beauveria, and Fusarium; minocycline was detected in Thielaviopsis; tetracycline was found in Gongronella, Moesziomyces, Cladosporium, and Penicillium. Other antibiotics were detected beyond polyketide antibiotics, such as β-lactam antibiotics penicillin G, amoxicillin and cefradine, mainly distributed in Aspergillus, Candida and Cladosporium.

Metagenomics

The sequences obtained by the metagenomic analysis were annotated with species and functions to obtain functional genes at various taxonomic levels. Annotation results that did not belong to endophytic fungi were deleted according to taxonomy, and we only retained results belonging to endophytic fungi, including 11 endophytic fungi (Table S3). Then, we obtained 217 bins by metagenomic binning (Concoct: 153, Maxbin2: 29, Metabat2: 35). The characteristics of all bins obtained by the three binning tools applied in this study are detailed in the Supporting Information Table S4. According to species annotations at the genus level, 8 genera were combined (Table S5). Finally, two results were combined to obtain 12 endophytic fungi (Rhodotorula, Aspergillus, Beauveria, Candida, Chaetomium, Exophiala, Fusarium, Hypoxylon, Penicillium, Purpureocillium, Talaromyces, and Trichoderma). They are involved in a total of 443 tertiary KEGG pathways, including the biosynthesis of amino acids, sugars, alkaloids, terpenoids, phenolic acids and polyketides (Table S6).

Special amino acids and sugars. The metagenome annotated 104 amino acid biosynthesis enzyme genes (Table S5). Most endophytic fungi genera can be annotated with more than 70 kinds of amino acid biosynthetic enzyme genes (Fig. S2). The enzyme genes for GABA synthesis were mainly found in Aspergillus, Beauveria, Candida, Fusarium, Penicillium, Trichoderma, and Talaromyces. Almost complete biosynthetic pathways of ergothioneine were detected in Beauveria, Talaromyces, Fusarium, Penicillium, and Aspergillus (Fig. S3). Theanine’s biosynthetic enzyme genes were annotated in Beauveria, Purpureocillium, Fusarium, Penicillium, and Aspergillus, indicating that these genera have the potential to synthesize theanine.

More than 6 sugar-related pathways were annotated in the KEGG pathway database, including galactose, fructose, mannose, carbon, starch and sucrose metabolism, and the glycolytic and pentose phosphate pathways. Genes for aldehyde reductase, an enzyme related to dulcitol synthesis, were found in Beauveria and Talaromyces. Genes for the synthesis of 1-kestose were also annotated: the β-fructofuranosidase (EC 3.2.1.26) gene was annotated in Talaromyces, Beauveria, Fusarium, Penicillium, and Aspergillus, and the endo-inulinase (EC 3.2.1.7) gene was annotated in Talaromyces.

Alkaloids. Six KEGG pathways were detected in the metagenome with 21 enzyme genes related to alkaloids synthesis (Fig. 5a). More enzyme genes were annotated in Aspergillus, Beauveria, Fusarium and Penicillium (Figs. 2c and 2d). For Aspergillus, there are 4 enzyme genes of paspaline biosynthesis (map00403), 8 of isoquinoline alkaloid biosynthesis (map00950), and 6 of tropane, piperidine and pyridine alkaloid biosynthesis (map00960). For Beauveria, there are 8 enzyme genes of map00950 and 5 of map00960. For Fusarium, 7 enzyme genes of map00950, 5 of map00960, and 1 of map00403 were annotated. For Penicillium, there are 7 enzyme genes of map00950 and 4 of map00960.

Fig. 5.
Secondary metabolite biosynthesis pathways involved in different genera of endophytic fungi isolated from S. miltiorrhiza. (a) Alkaloids. (b) Terpenoids. (c) Phenolic acids. (d) Polyketides. map00901: indole alkaloid biosynthesis, map00403: indole diterpene alkaloid biosynthesis, map00950: isoquinoline alkaloid biosynthesis, map00960: tropane, piperidine and pyridine alkaloid biosynthesis, map00232: caffeine metabolism, map00965: betalain biosynthesis, map00900: Terpenoid backbone biosynthesis, map00902: monoterpenoid biosynthesis, map00904: diterpenoid biosynthesis, map00905: brassinosteroid biosynthesis, map00908: zeatin biosynthesis, map00281: geraniol degradation, map00940: phenylpropanoid biosynthesis, map00400: Phe, Tyr and Trp biosynthesis, map01051: biosynthesis of ansamycins, map01059: biosynthesis of enediyne antibiotics, map01054: nonribosomal peptide structures, map01053: biosynthesis of siderophore group nonribosomal peptides, map01055: biosynthesis of vancomycin group antibiotics, map00523: polyketide sugar unit biosynthesis, map00945: stilbenoid, diarylheptanoid and gingerol biosynthesis, map00941: flavonoid biosynthesis, map00944: flavone and flavonol biosynthesis.

Terpenoids. Fifty-two related enzyme genes were involved in 6 KEGG pathways (Fig. 5b), and the main annotated pathways were the terpenoid backbone biosynthesis and diterpenoid biosynthesis pathways (Figs. S4 and S5). Penicillium, Talaromyces, Beauveria, Fusarium, and Aspergillus had the most annotated terpene-related enzymes. Talaromyces had the most monoterpene- and diterpene-related enzymes (Figs. 2c and 2d). In addition, fungal antiSMASH analysis showed that Talaromyces could produce the diterpene compound pimara-8(14),15-diene (Fig. 6a), indicating that Talaromyces has great potential to produce diterpenes. In Talaromyces, the genes of ent-copalyl diphosphate synthase, gibberellin 2β-dioxygenase, and gibberellin-44 dioxygenase were annotated. In Fusarium, the genes of the tanshinone biosynthesis-related enzymes hydroxymethylglutaryl-CoA reductase (NADPH), ent-copalyl diphosphate synthase, ent-kaurene oxidase, and geranylgeranyl diphosphate synthase were annotated.

Fig. 6.
Comparative analyses of the S. miltiorrhiza endophytes BGC and the known BGCs bearing the highest similarity using Clinker & clustermap.js. Genes were color-coded based on homology. (a) Talaromyces; (b) Fusarium; (c) Beauveria.

Phenolic acids. There were 33 related enzyme genes annotated (Table S5), such as phenylalanine ammonia-lyase (PAL), 4-coumarate-CoA ligase (4CL), and caffeic acid 3-O-methyltransferase (COMT). The KEGG annotation indicated that S. miltiorrhiza endophytes were involved in two pathways for synthesizing phenolic acids, namely the Phe, Tyr and Trp biosynthesis pathway and the phenylpropanoid biosynthesis pathway (Fig. 5c). More enzymes related to phenolic acid synthesis were annotated in Talaromyces, Beauveria, Fusarium, Penicillium, and Aspergillus (Figs. 2c and 2d). PAL and 4CL were annotated in Aspergillus, Beauveria, Fusarium, and Talaromyces. Moreover, COMT was annotated in Talaromyces.

Polyketides. Twenty-five polyketide synthesis-related enzyme genes were annotated in the metagenome, including 11 PKS I genes and 8 nonribosomal peptide synthetase (NRPS) genes, and were involved in 10 KEGG pathways (Fig. 5d). Polyketide synthase-related genes were more widely distributed in Talaromyces, Beauveria, Fusarium, and Aspergillus (Figs. 2c and 2d) and mainly included cis-AT PKS, trans-AT PKS, and iterative type I PKS. Eleven genes for flavonoid biosynthesis-related enzymes were annotated.

Fungal antiSMASH analysis also predicted that Talaromyces had the ability to synthesize the natural tetracycline TAN-1612 (Fig. 6a). Beauveria and Fusarium were predicted to synthesize the siderophores ferrichrome and dimethylcoprogen (Figs. 6b and 6c), and Beauveria was predicted to synthesize the anti-HIV compound phomasetin and the mycotoxins tenellin, enniatin, and beauvericin (Fig. 6c).

Our study showed that the endophytic fungi isolated from S. miltiorrhiza could produce many metabolites and carry a variety of metabolite-related enzyme genes. Through untargeted metabolomics, it was found that the types of compounds produced by each fungal genus were different, and metagenomics binning technology found that the genes carried by fungi were also inconsistent. Finally, alkaloids were mainly found in Aspergillus, Candida, Beauveria, Penicillium, etc. The terpenoids were mainly found in Fusarium, Talaromyces, Candida, Beauveria, Aspergillus, and Penicillium. Talaromyces, Beauveria, Fusarium, Penicillium, Aspergillus, etc., mainly produced phenolic acids. Fusarium, Talaromyces, Candida, Beauveria, Hormonema, and Thielaviopsis mainly produced flavonoids. Antibiotics were mainly found in Aspergillus, Penicillium, Talaromyces, etc. Therefore, combining untargeted metabolomics and metagenomics can rapidly screen specific active substances of endophytic fungi.

DISCUSSION

Untargeted metabolomics combined with metagenomic analysis showed that the endophytic fungi of S. miltiorrhiza had the ability to synthesize various natural products such as amino acids, sugars, lipids, alkaloids, terpenoids, phenolic acids, and polyketides.

Plant endophytic fungi are reservoirs of novel bioactive metabolites, including typical secondary metabolites and specialized primary metabolites. Some active special amino acids such as ergothioneine, theanine, and GABA were annotated among the metabolites of the endophytic fungi of S. miltiorrhiza. The antioxidant compound ergothioneine [9] was distributed in the non-Saccharomyces genera Beauveria, Talaromyces, Fusarium, Penicillium, and Aspergillus, of which the first three are fungal groups newly added in this study [10]. Theanine has relaxing, neuroprotective, blood pressure regulating, and antitumor properties [11]. It was detected for the first time in Beauveria and Purpureocillium. In addition, Aspergillus, Penicillium, and Fusarium also possess the ability to synthesize it. GABA is a central node in balancing metabolic fluxes between carbon and nitrogen metabolism [12] and could also be involved in fungal-plant and fungal-microbe interactions as a signaling substance.

The endophytic fungi of S. miltiorrhiza can synthesize some special functional sugars with biological activities, such as dulcitol and 1-kestose. Dulcitol is a compound isolated from Euonymus alatus that has been reported to suppress the proliferation and migration of hepatocellular carcinoma by regulating the SIRT1/p53 pathway [13]. Dulcitol was detected in microbial metabolites for the first time, in the genera Aspergillus, Beauveria, Cladosporium, Penicillium, Candida, Fusarium, and Purpureocillium. 1-Kestose is notable for its sweet taste and biological activity and is particularly effective in promoting the growth of probiotics, including Faecalibacterium prausnitzii and Bifidobacterium. Aspergillus, Scopulariopsis, Aureobasidium, and Penicillium have been reported to produce 1-kestose [14]. In addition to these fungi, we found that Gongronella, Exophiala, Acrostalagmus, Candida, Talaromyces, Beauveria, and Fusarium had the potential to produce 1-kestose. Studies suggest that sugars from microorganisms may play a key role in plant-microbe interactions; for example, trehalose and its intermediate trehalose 6-phosphate act as signal metabolites in plants [15], and the results of this experiment showed that most endophytic fungi could produce trehalose. The abundant primary metabolites of the endophytic fungi of S. miltiorrhiza are not only energy resources for the fungi but also have medical significance and may play a role in promoting plant-fungal symbiosis.

Secondary metabolites synthesized by endophytic fungi exhibit high biological activity. In this study, many secondary metabolites, such as alkaloids, terpenoids, phenolic acids, and polyketides, were detected among the metabolites of the endophytic fungi of S. miltiorrhiza, and these fungi carried abundant genes for the related synthetases.

The endophytic fungi of S. miltiorrhiza can synthesize some alkaloids with important biological activities and are therefore important biological sources of these compounds. Camptothecin (CPT), a natural monoterpene-quinoline alkaloid, exhibits anticancer activity by inhibiting topoisomerase I [16]. Talaromyces, Alternaria, Aspergillus, Fusarium, and other genera have been reported to produce CPT [17]. In this experiment, CPT was annotated in Cladosporium, and irinotecan, a water-soluble derivative of CPT, was found in Beauveria and Talaromyces. Studies have also shown that Cladosporium can produce alkaloids [18]. Together, these findings suggest that Cladosporium, Beauveria, and Talaromyces can biosynthesize CPT. Glaucine, an alkaloid isolated from the tuber of Corydalis turtschaninovii, is used as a cough suppressant in some countries and has also been shown to inhibit the proliferation of tumor cells, including leukemia and cervical, bladder, breast, and colon cancer cells [19]. Glaucine has not previously been reported from fungi, but this study found that Fusarium may produce glaucine, and the first rate-limiting enzyme, Trp dehydrogenase, was annotated in Fusarium. Vincristine is a potent antitumor drug that has been isolated from the metabolites of Aspergillus, Alternaria, Cladosporium, and Talaromyces [20], which is consistent with the results of this experiment.

Many studies have shown that terpenoids have significant biological activities, such as artemisinin with antimalarial activity and ginsenoside Rg3 with antitumor activity [21]. In this experiment, artemisinin was detected in Moesziomyces, Aspergillus, and Fusarium for the first time, ginsenoside Rg3 was detected in Cladosporium for the first time, and cryptotanshinone was detected in Trichoderma. Cryptotanshinone is one of the main bioactive components of the medicinal plant S. miltiorrhiza and has many biological effects, including anti-inflammatory, anti-oxidant, and anticancer activities [22]. Metagenome results showed that Aspergillus, Trichoderma, and Fusarium carried the gene encoding hydroxymethylglutaryl-CoA synthase, a key enzyme in the upstream terpenoid pathway, and terpenoids have been isolated from these three genera [23–25]. Andrographolide and its derivative 14-deoxy-11,12-didehydroandrographolide, which have anticancer effects, were also found in the endophytic fungi of S. miltiorrhiza.

Phenolic acids are important active plant substances that protect against ultraviolet light, have anti-insect, antibacterial, and allelopathic activities, and act as fungal attractants. In this experiment, rosmarinic acid was identified in Fusarium. Although only Alternaria tenuissima has previously been reported to produce rosmarinic acid [26], the metagenomic annotation results also showed that Fusarium could produce phenolic acids (Fig. 5b). Mycophenolic acid is the active ingredient of the immunosuppressant mycophenolate mofetil, which is widely used in transplantation medicine and autoimmune diseases [27]. Penicillium, Aspergillus, and Eurotium have been reported to produce mycophenolic acid [28], which is consistent with the results of this study, and this experiment identified additional endophytes that can produce mycophenolic acid. Psoralidin has anti-oxidant, anti-apoptotic, anti-inflammatory, and antitumor effects [29]. This experiment showed for the first time that endophytic fungi from S. miltiorrhiza can produce psoralidin.

Polyketides have important pharmacological activities, including antibacterial, anticancer, anti-oxidant, antiparasitic, and anti-inflammatory properties. Phellamurin suppresses the viability of OS cells and induces their apoptosis by inhibiting the PI3K/AKT/mTOR pathway [30], and we detected phellamurin in fungi for the first time, in Thielaviopsis and Fusarium. Resveratrol and its analog piceatannol exert cardioprotective, neuroprotective, antidiabetic, anti-inflammatory, cancer-preventive, and therapeutic effects. Several genera, such as Penicillium, Aspergillus, Fusarium, and Alternaria, have been shown to produce resveratrol [31]. For the first time, we found that Purpureocillium can also produce resveratrol. Monensin is one of the most widely studied ionophore antibiotics. Penicillium has been reported to produce monensin [32], which is consistent with the results of this experiment. Aspergillus has likewise been found to produce β-lactam antibiotics [33], which is also consistent with our results. In addition to Aspergillus, this experiment found that Candida and Cladosporium have the ability to produce β-lactam antibiotics.

Compared with whole-genome sequencing, metagenomics has the advantage of being inexpensive. High-throughput sequencing of metagenomes is amenable to downstream analysis: it allows precise description of strains of biotechnological importance, discovery of novel genes, and further analysis of the genes involved in biochemical pathways and of the interactions among microbes in their natural environment or in bioreactors [34]. MetaWRAP is a flexible pipeline that handles common tasks in metagenomic data analysis and uses bin refinement and reassembly to improve draft genome recovery from shotgun metagenomic data. Because of its high sensitivity, throughput, and ability to detect residual compounds, LC-MS/MS makes sample processing easier, faster, and less complex, improves the specificity of analysis, and allows metabolites that are otherwise hard to identify to be identified and separated [35]. By combining untargeted metabolomics and metagenomic binning, we successfully and quickly grasped the biosynthesis and metabolism of the endophytic fungi of S. miltiorrhiza.