Introduction

Cytoplasmic male sterility (CMS) is a widespread phenomenon in the plant kingdom that results in indehiscent anthers, nonfunctional pollen, or no pollen. In agricultural production, CMS provides an effective tool for exploiting heterosis and has been used extensively in hybrid production of rice, maize, and sorghum (Budar and Pelletier 2001). Cotton is an important economic crop and shows marked heterosis in specific hybrid combinations (Wu et al. 2019). However, the utilization of heterosis in cotton has been limited by the few available types of cotton CMS and by the negative impact of the cytoplasm on cotton yield (Li et al. 2002). It is therefore important to study the mechanism of pollen abortion in cotton CMS and to develop new cotton CMS germplasm. The cotton CMS line H276A was identified by our group; its cytoplasm is derived from the cultivated species H276B (maintainer line). A study of microspore development in H276A showed that microspores degenerate from the tetrad stage onward and that no pollen is produced (Kong et al. 2017), indicating that the line has important economic value. However, the molecular mechanism of pollen abortion in H276A is unknown.

The molecular mechanism of plant CMS has been studied for many years, and CMS genes have been shown to result from rearrangements of the mitochondrial genome (Dong et al. 2007). To date, more than 28 CMS genes have been reported, for instance, orf224 of pol Brassica (Wang et al. 1995), orf522 of sunflower (Köhler et al. 1991), urf13 of T-CMS maize (Dewey et al. 1986), and orf79 of Boro II rice (Itabashi et al. 2009). In previous studies, CMS-associated loci were identified mainly by comparing mitochondrial DNA sequences between a CMS line and its maintainer line. Unfortunately, mitochondrial DNA differences between two materials often reflect evolutionary divergence and have no correlation with CMS. Sequence analysis has indicated that most reported CMS genes share one feature: they are co-transcribed with a mitochondrial protein-coding gene and can therefore be detected by northern blot with a probe for that gene. Accordingly, comparing mitochondrial gene transcript profiles between a CMS line and its maintainer line is an effective strategy for detecting CMS loci and can be used widely in the search for CMS genes. In the cotton CMS system, we first analyzed the transcripts of five ATP synthase genes (atp1, atp4, atp6, atp8, and atp9) of the mitochondrial genome in H276A (Kong et al. 2019). The lengths of their transcripts, however, were similar in CMS line H276A and its maintainer line H276B. The transcript profiles of the remaining mitochondrial genes have not yet been reported.

Pollen development is an energy-consuming process, and an undersupply of energy can cause CMS (Peng et al. 2014). The cotton mitochondrial genome includes approximately 35–40 protein-coding genes that are mainly related to energy metabolism (Liu et al. 2013; Tang et al. 2015; Bi et al. 2016). Their mature mRNAs are generated through several processing steps, such as transcription from different initiation sites, 3′ end trimming, and RNA editing. These processes play an important role in the regulation of mitochondrial gene expression (Joachim et al. 2007). The relationship between RNA editing and CMS has been examined in other cotton CMS systems, and no differences between male-sterile and male-fertile cytoplasms were found that could account for cotton CMS (Suzuki et al. 2013). However, no studies have been conducted on the 5′ and 3′ termini of mitochondrial gene transcripts in cotton.

In this study, the transcripts of mitochondrial genes were compared between CMS line H276A and its maintainer line H276B by northern blot, and a differentially expressed gene, cox3, was identified. Subsequently, the expression of cox3 was examined at the RNA and protein levels. Finally, to explore the cause of the abnormal expression of cox3, the 5′ and 3′ ends of the cox3 transcript in the two materials were determined by circularized RNA reverse transcription PCR (CR-RT-PCR). Our data provide a foundation for a better understanding of the molecular mechanism of CMS and of mitochondrial gene expression in cotton.

Materials and methods

Plant materials

Cotton CMS line H276A and its maintainer line H276B were used in this study. They were grown in the experimental field of Guangxi University, Nanning, China, under normal field management. Floral buds at the tetrad stage (the pollen abortion stage) were collected from both materials for northern blot, western blot, quantitative reverse-transcribed PCR (qRT-PCR), and CR-RT-PCR.

DNA, RNA, and protein isolation

Total DNA was extracted from young leaves with the cetyltrimethylammonium bromide (CTAB) extraction method, as described by Allen et al. (2006). Total RNA was extracted from floral buds at the tetrad stage using the Quick RNA Isolation Kit (Huayueyang, Beijing, China). The integrity and concentration of DNA and RNA were checked on 1% agarose gels and with a NanoDrop 2000 UV spectrometer (Thermo, Waltham, MA, USA). Total protein was isolated with 10% TCA/acetone, as described by Wang et al. (2010). Protein concentration was determined using the Bradford Protein Assay Kit (Tiangen, Beijing, China).

Primers and northern blot

Primers used in this study were designed from the cotton mitochondrial genome (Liu et al. 2013) with Primer 5.0 software and are listed in Supplementary information 1. A total of 30 μg of RNA was used for each northern blot, and the procedure followed the manufacturer's instructions for the DIG-High Prime DNA Labeling and Detection Starter Kit (Roche, Mannheim, Germany).

Reverse transcribed PCR and quantitative reverse-transcribed PCR

For each sample, 1 μg of total RNA was reverse transcribed into cDNA using TransScript II One-Step gDNA Removal and cDNA Synthesis SuperMix (Trans, Beijing, China). The elimination of gDNA was verified with cox2 primers, which amplify an 1887 bp fragment from gDNA but a 393 bp fragment from cDNA. The expression level of cox3 in H276A was measured by qRT-PCR with the maintainer line H276B as the control. The cotton 18S gene was used as the internal control. The standard curves and melt curves of both primer pairs are provided in Supplementary information 2. The qRT-PCR was performed in a total volume of 15 μL containing 7.5 μL of 2 × TransStart Tip Green qPCR SuperMix (Trans, Beijing, China), 4.9 μL of double-distilled water, 0.3 μL of each primer, and 100 ng of cDNA, using the C1000 Touch™ Thermal Cycler (Bio-Rad, USA). The qRT-PCR conditions followed the manufacturer's procedure: 30 s at 95℃, followed by 42 cycles of 95℃ for 5 s and 60℃ for 30 s, then heating to 95℃ in increments of 0.5℃ for 5 s to generate the melt curve. The relative expression level was calculated with the 2^(−ΔΔCt) method (Livak and Schmittgen 2001).
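
The 2^(−ΔΔCt) calculation cited above can be sketched in a few lines of Python. This is only an illustration of the published formula, not the authors' analysis script: the function name `relative_expression` and all Ct values are hypothetical, and the method assumes that the target and reference amplicons amplify with similar, near-100% efficiency.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control line,
    using the 2^(-ddCt) method (Livak and Schmittgen 2001)."""
    # dCt: target Ct normalized to the internal control within each line
    dct_sample = ct_target_sample - ct_ref_sample
    dct_control = ct_target_control - ct_ref_control
    # ddCt: dCt of the sample relative to the dCt of the calibrator line
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Made-up Ct values for illustration only (not data from this study).
fold = relative_expression(
    ct_target_sample=21.5,   # cox3 in H276A (hypothetical)
    ct_ref_sample=10.0,      # 18S in H276A (hypothetical)
    ct_target_control=20.0,  # cox3 in H276B (hypothetical)
    ct_ref_control=10.0,     # 18S in H276B (hypothetical)
)
print(f"relative expression = {fold:.3f}")
```

With these invented values the ΔΔCt is 1.5, which corresponds to roughly 0.35-fold expression; a ΔΔCt of 0 gives exactly 1.0, meaning no difference from the control line.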

Immunoblot analysis

Two synthetic polypeptides, ESQRHSYHLVDPSPWPISG-C and HSSLAPTVEIGGIWPPKGIGVL-C, corresponding to residues 3–21 and 106–127 of COX3 and selected on the basis of antigenic epitope analysis, were used to immunize rabbits to obtain antiserum (ABclonal, Wuhan, China). Protein samples were mixed with an equal volume of 4 × loading buffer, separated by 5% and 10% SDS-PAGE, and transferred onto an Immobilon-PPSQP transfer membrane (polyvinylidene fluoride (PVDF)) using a Bio-Rad mini transfer (Bio-Rad, USA). The membrane blots were incubated in blocking buffer (5% milk, 0.05% Tween-20, 0.1%, 20 mM Tris–HCl and 150 mM NaCl, pH 7.5) for 2 h at room temperature, washed twice with TBST buffer (0.05% Tween-20, 20 mM Tris–HCl and 150 mM NaCl, pH 7.5), and incubated with the mixed COX3 antibodies (1:3000 dilution) for 20 h at 4℃. After two rinses with TBST, the blots were incubated in 1:3000-diluted secondary antibody solution (HRP Goat Anti-Rabbit IgG (H + L)) (ABclonal, Wuhan, China) for 1 h at room temperature and washed twice with TBST. The membrane blots were then incubated in ECL substrate (Solarbio, Beijing, China) for 2 min. An anti-β-Actin mouse monoclonal antibody (CWBIO, Beijing, China) was used as the internal reference, with HRP Goat Anti-Mouse IgG (H + L) (ABclonal, Wuhan, China) as its secondary antibody. The β-Actin immunoblot was performed in the same way as the COX3 immunoblot.

Mapping of cox3 mRNA termini in H276A and H276B

The 5′ and 3′ ends of cox3 mRNA were determined by CR-RT-PCR according to Kuhn and Binder (2002) with minor modifications (Fig. 1). RNA self-ligation was performed using the T4 RNA Ligase Kit (Thermo, Waltham, MA, USA). The 10 μL self-ligation reaction contained 2 μg of RNA, 1 μL of T4 RNA Ligase, 1 μL of bovine serum albumin (BSA), and 1 μL of 10 × reaction buffer. After ligation, first-strand cDNA was synthesized with the gene-specific primer cox3-AP using TransScript II One-Step gDNA Removal and cDNA Synthesis SuperMix (Trans, Beijing, China). A 1 μL cDNA sample was used as the template in the primary amplification. The primary amplification products were diluted 20-fold with double-distilled water and used as templates for the second amplification. The products of the second amplification were recovered using a DNA purification kit (TIANGEN, Beijing, China), cloned into the pMD19-T vector, and sequenced. DNAMAN software (https://www.lynnon.com/dnaman.html, version 6.0399) was used for sequence assembly and alignment.

Fig. 1

The schematic of CR-RT-PCR analysis of cox3

Results

Transcript polymorphism analysis of mitochondrial genes

Our previous study revealed that five ATP synthase genes (atp1, atp4, atp6, atp8, and atp9) of the mitochondrial genome showed similar transcripts in CMS line H276A and its maintainer line H276B (Kong et al. 2019). To explore the transcript polymorphisms of mitochondrial protein-coding genes more comprehensively, 30 further genes (cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7, nad9, rpl2, rpl5, rpl10, rpl16, rps3, rps4, rps7, rps10, rps12, rps14, ccmB, ccmC, ccmFN, mttB, ccmFC, cob, matR, sdh4) were examined by northern blot in CMS line H276A and its maintainer line H276B. No transcript length polymorphism was detected in any of the 30 genes (Supplementary information 3), indicating that no novel orf is co-transcribed with these mitochondrial protein-coding genes in CMS line H276A. However, the expression of cox3 was markedly decreased in CMS line H276A compared with its maintainer line H276B. The amount of cox3 transcript in H276A was approximately 40.73% of that in H276B (quantified with ImageJ software), as shown in Fig. 2a. cox3 encodes one subunit of the mitochondrial electron transport chain, and its decreased expression would be expected to impair subsequent ATP synthesis. Flowering is an energy-consuming process, and a deficient energy supply can cause pollen abortion. Hence, we speculated that the abnormal expression of cox3 might be associated with CMS in cotton H276A.

Fig. 2

(a) Northern blot of cox3 in CMS line H276A and its maintainer line H276B. Lane 1: H276A; lane 2: H276B. (b) Relative expression analysis of cox3 by qRT-PCR. The housekeeping gene 18S was used as the internal control. H276A, CMS line; H276B, maintainer line. Error bars represent standard deviation (n = 3). **Highly significant difference at the p < 0.01 level (t-test). (c) Immunoblot analysis of COX3 and the internal reference β-Actin using proteins prepared from tetrad-stage floral buds of H276A and H276B. Lane 1: H276A; lane 2: H276B

mRNA and protein expression of cox3 in H276A and H276B

To verify the northern blot result, the relative expression level of cox3 in H276A and H276B was measured by qRT-PCR. The expression level of cox3 in the CMS line H276A was 0.39-fold that of the control H276B (P < 0.01) (Fig. 2b). Expression of cox3 in H276A was therefore substantially lower than in H276B, consistent with the northern blot. Proteins are the executors of biological function, and mRNA and protein levels do not always correlate directly. To investigate variation in COX3 protein between the two materials, an immunoblot was performed (Fig. 2c). A band of about 29 kDa was detected in both materials. However, the amount of COX3 protein in CMS line H276A was decreased to 59.38% of that in H276B (quantified with ImageJ software).
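
A percentage of this kind is typically obtained by dividing each target band intensity by the intensity of its loading-control band and then taking the ratio between lines. The text does not state how the ImageJ values were normalized, so the sketch below assumes normalization to β-Actin; the function name `normalized_band_ratio` and all intensities are hypothetical.

```python
def normalized_band_ratio(target_sample, loading_sample,
                          target_control, loading_control):
    """Target band intensity in the sample relative to the control,
    after dividing each target band by its loading-control band."""
    if min(loading_sample, loading_control, target_control) <= 0:
        raise ValueError("reference intensities must be positive")
    sample_norm = target_sample / loading_sample
    control_norm = target_control / loading_control
    return sample_norm / control_norm


# Hypothetical integrated densities (arbitrary units), not measured values.
ratio = normalized_band_ratio(
    target_sample=3000.0,    # COX3 band, H276A (hypothetical)
    loading_sample=5000.0,   # beta-Actin band, H276A (hypothetical)
    target_control=6000.0,   # COX3 band, H276B (hypothetical)
    loading_control=5000.0,  # beta-Actin band, H276B (hypothetical)
)
print(f"COX3 in H276A relative to H276B: {ratio:.1%}")
```

Normalizing to the loading control first means that unequal sample loading between lanes does not inflate or mask the apparent difference in the target band.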

Transcript sequence analysis of cox3

To explain the decreased expression of cox3 in CMS line H276A, we investigated the transcript sequences in both materials. First, the coding sequences of cox3 in H276A and H276B were obtained by homology-based cloning (Fig. 3a). The coding sequence was 798 bp long and encoded 265 amino acids. Sequence analysis indicated no difference between the two materials except for one base substitution (C > A) at position 157, which led to an amino acid change at residue 53 (L > I). Furthermore, five RNA editing sites were identified at positions 311, 313, 413, 754, and 764 (relative to the ATG) in H276A and H276B. Editing frequency analysis indicated that all five sites were fully edited and that there was no difference between the two materials. The amino acid changes caused by RNA editing are shown in Supplementary information 4.
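
The coordinates in this paragraph can be cross-checked with simple codon arithmetic. The sketch below assumes that the 798 bp coding sequence includes the stop codon and that positions are counted from the A of the ATG as position 1; the helper `codon_position` is a hypothetical name introduced here for illustration.

```python
def codon_position(nt_pos):
    """Map a 1-based nucleotide position in a coding sequence (A of ATG = 1)
    to (1-based codon number, 1-based position within that codon)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


cds_length = 798                  # bp, as reported in the text
codons = cds_length // 3          # number of codons, stop codon included
amino_acids = codons - 1          # residues encoded, stop codon excluded
print(codons, amino_acids)

# Substitution site and RNA editing sites reported in the text
for pos in (157, 311, 313, 413, 754, 764):
    codon, offset = codon_position(pos)
    print(f"nt {pos}: codon {codon}, position {offset} in codon")
```

Under these assumptions the arithmetic agrees with the text: 798 bp gives 266 codons and therefore 265 residues, and nucleotide 157 is the first base of codon 53, where a C > A change is compatible with a leucine-to-isoleucine substitution.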

Fig. 3

Cloning of cox3 and its full-length transcripts in CMS line H276A and its maintainer line H276B. (a) Amplification of the coding sequences of cox3. Marker: BM2000, lane 1: cox3 gDNA of H276A, lane 2: cox3 gDNA of H276B, lane 3: cox3 cDNA of H276A, lane 4: cox3 cDNA of H276B. (b) Marker: DL2000, lanes 1 and 2: the primary and secondary CR-RT-PCR products from H276A, lanes 3 and 4: the primary and secondary CR-RT-PCR products from H276B. (c) Amplification of the 5′ flanking sequence of cox3. Marker: DL2000, lane 1: H276A, lane 2: H276B

CR-RT-PCR was performed to obtain the sequence of the untranslated regions (UTRs) of the cox3 transcript. With this method, the 5′ and 3′ ends can be determined simultaneously (Fig. 3b). Based on the coding sequence of cox3, two primer combinations, cox3-AP with cox3-SP1 and cox3-AP with cox3-SP2, were designed for CR-RT-PCR. The combination cox3-AP and cox3-SP1 was used for the primary amplification and the combination cox3-AP and cox3-SP2 for the secondary amplification. The distance between cox3-SP1 and cox3-SP2 was 62 bp, and the products were of the expected sizes (Fig. 3b). Twelve and 13 single clones from H276A and H276B, respectively, were selected randomly for sequencing. Sequence analysis revealed that the 3′ and 5′ ends of cox3 in H276B are uniform (+ 310/ + 314 relative to the stop codon, -411/-412 relative to the start codon) and that the 3′ ends in H276A matched those in H276B. In H276A, however, multiple 5′ ends were mapped at -451, -464, -465, -467, -471, -472, and -508 relative to the start codon (Fig. 4).
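
Because the RNA is circularized before amplification, each clone reads from the 3′ part of the transcript across the ligation site into the 5′ part, and both termini follow from where these two parts align to the genomic sequence. The sketch below illustrates that logic on a toy sequence. It is not the authors' procedure (they used DNAMAN for assembly and alignment); the function `map_termini`, the toy reference, and the coordinates are all hypothetical, and a real analysis would need a proper aligner to tolerate sequencing errors and non-templated nucleotides.

```python
def map_termini(clone, reference, start_codon, stop_codon_end):
    """Locate the ligation junction in a CR-RT-PCR clone read.

    clone          : read oriented as [3' part of transcript][5' part]
    reference      : genomic sequence on the transcript strand
    start_codon    : 0-based index of the A of the ATG in `reference`
    stop_codon_end : 0-based index of the last base of the stop codon
    Returns (5' end relative to the start codon, 3' end relative to the
    stop codon), or None if no split aligns exactly.
    """
    # Try the longest possible 3' part first, so the 3' end is extended as
    # far as the reference allows before the read jumps to the 5' region.
    for split in range(len(clone) - 1, 0, -1):
        three_part, five_part = clone[:split], clone[split:]
        i3 = reference.find(three_part)
        i5 = reference.find(five_part)
        if i3 != -1 and i5 != -1:
            three_end = i3 + len(three_part) - 1   # last transcribed base
            five_end = i5                          # first transcribed base
            return five_end - start_codon, three_end - stop_codon_end
    return None


# Toy reference: 12 nt upstream + 15 nt "gene" (ATG...TGA) + 10 nt downstream.
reference = "GGATCCTTAGCA" + "ATGGCTAAAGTCTGA" + "CCGTTGACAT"
start_codon = 12       # index of the A of the ATG
stop_codon_end = 26    # index of the last base of the TGA

# Simulated clone: transcript ends at index 31 and begins at index 4.
clone = reference[18:32] + reference[4:16]
print(map_termini(clone, reference, start_codon, stop_codon_end))
```

For this toy clone the function returns a 5′ end at -8 and a 3′ end at +5. If the first base of the 5′ part happens to match the genomic base that follows the true 3′ end, the junction is ambiguous by that base, which is one reason termini are sometimes reported as a pair of adjacent positions.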

Fig. 4

5′ and 3′ ends detected by CR-RT-PCR. Sequences of the cDNA covering the 5′–3′ ligation sites of cotton cox3 transcripts. The numbering of the 5′ end refers to the translation start codon, and the 3′ numbers are relative to the stop codon. The ligation site is indicated by a horizontal dash. Sequences corresponding to the 5′ terminal part are given in white letters in black boxes. (a) H276B; (b) H276A

Subsequently, two primer combinations, Yzcox3-1F with Yzcox3-R and Yzcox3-2F with Yzcox3-R, were designed to verify the accuracy of the CR-RT-PCR results. Yzcox3-1F was located in the transcript sequence specific to cox3 of H276A, whereas Yzcox3-2F and Yzcox3-R were positioned in transcript sequence shared by both materials. With the combination Yzcox3-2F and Yzcox3-R, a 374 bp band was obtained from the gDNA and cDNA of both materials. With the combination Yzcox3-1F and Yzcox3-R, a 486 bp band was obtained from the cDNA of H276A and the gDNA of both materials but was absent from the cDNA of H276B (Fig. 5). These data indicated that the CR-RT-PCR results were reliable.

Fig. 5

Verification of cox3 cDNA. Lanes 1 and 2: H276A cDNA; lanes 3 and 4: H276B cDNA; lanes 5 and 6: H276A gDNA; lanes 7 and 8: H276B gDNA. Lanes 1, 3, 5, and 7: primer combination Yzcox3-1F, Yzcox3-R; lanes 2, 4, 6, and 8: primer combination Yzcox3-2F, Yzcox3-R

To explore why cox3 of H276A has multiple transcription initiation sites that differ from those of H276B, primers for cloning the upstream untranslated region of cox3 were designed based on the cotton mitochondrial genome (Fig. 3c). After cloning and sequencing, 15 single nucleotide polymorphisms (SNPs) were observed upstream of the cox3 coding sequence. Interestingly, seven of these SNPs were located around the transcription initiation sites of cox3 in H276A (Fig. 6). We inferred that these seven base substitutions might be related to the upstream shift of cox3 transcription initiation and that the novel transcripts are associated with the decreased expression of cox3 in H276A.

Fig. 6

Sequence analysis of cox3 between H276A and H276B. Red frames represent the initiation codon and stop codon. Black arrows represent the 5′ and 3′ ends of the cox3 transcript

Discussion

Plant CMS provides an important breeding tool for harnessing heterosis in hybrid crops and supplies critical material for studying cytoplasmic-nuclear genomic interactions. Scientists have therefore long been interested in the molecular mechanisms of CMS in plants and have made great progress (Chen and Liu 2014). At the molecular level, numerous studies have demonstrated that CMS is caused by rearrangement of the mitochondrial genome and have identified many CMS genes (Hanson and Bentolila 2004). These CMS genes can be divided into three categories according to their sequence structure. The first comprises orfs that are co-transcribed with a mitochondrial protein-coding gene and can be detected with its probe, such as WA352 of wild abortive rice, which is co-transcribed with rpl5 and was identified with an rpl5 probe (Luo et al. 2013), and orf456 of chili pepper, which is co-transcribed with cox2 and was identified with a cox2 probe (Dong et al. 2007). The second comprises orfs transcribed independently, such as orf182 of D1-CMS rice, whose 3′ end was identified 22 bp upstream of the nad6 stop codon, with neither of the co-transcripts containing the complete coding region of nad6 (Xie et al. 2018). The third involves dysfunction of a mitochondrial protein-coding gene, such as cox2 of CMS-G in sugar beet (Ducos et al. 2010). In the present study, our results indicated that no orf is co-transcribed with the 30 cotton mitochondrial protein-coding genes examined in CMS line H276A. In addition, the transcripts of five ATP synthase genes were previously analyzed by our group with the same method, and they were also the same in H276A and H276B (Kong et al. 2019). This shows that the CMS locus of cotton H276A is not co-transcribed with a mitochondrial protein-coding gene. We therefore inferred that the CMS gene of H276A might be transcribed independently, or that another CMS mechanism operates in CMS line H276A, such as dysfunction of a mitochondrial protein-coding gene.

The classical electron transport chain is located on the inner mitochondrial membrane and consists of four large protein complexes (I, II, III, IV). The cytochrome c oxidase complex (IV) is the terminal oxidase of the classical electron transport chain (Millar et al. 2011). The plant cytochrome c oxidase complex consists of many protein subunits. For instance, 14 protein bands were separated from Arabidopsis (Millar et al. 2011). However, the plant mitochondrial genome encodes only three subunits (COX1, COX2, and COX3). CMS genes usually cause mitochondrial dysfunction by hindering ATP synthesis or by disrupting the mitochondrial membrane and other proteins (Hanson and Bentolila 2004). An abnormal cytochrome c oxidase complex would decrease subsequent ATP synthesis and could therefore be related to CMS. In CMS-G sugar beet, the COX2 subunit has a truncated C-terminus and cytochrome c oxidase activity was reduced by 50% (Ducos et al. 2010), implicating changes in the respiratory chain and a putative link with CMS. In CMS-AP wheat, the N terminus of cox1 forms part of orf256 (Song and Hedgcoth 1994). Our study showed that the expression of cox3 in H276A is 0.39-fold that in H276B, which might be correlated with cotton CMS. In addition, the immunoblot analysis revealed that the amount of COX3 protein was decreased in CMS line H276A. We therefore speculated that decreased expression of COX3 is associated with CMS in cotton H276A.

Previous studies showed that the nature of a transcript's 5′ end can influence its expression (Joachim et al. 2007). In this study, we found that the cox3 transcript in H276A has a uniform 3′ end but seven different 5′ ends. This result is consistent with Arabidopsis thaliana, in which all mitochondrial gene transcripts have a single major 3′ end, while multiple 5′ ends were detected for several genes (Joachim et al. 2007). Analysis of the cox3 termini showed that the 3′ end in H276B was similar to that in H276A. The 5′ ends in H276B, however, were uniform and differed from those in H276A. In Arabidopsis thaliana, formation of the 5′ end of cox3 transcripts is influenced by mitochondrial DNA sequences and by RNA processing factor 2, which encodes a pentatricopeptide repeat (PPR) protein (Christian et al. 2010). Many PPR proteins have been identified as restorer-of-fertility genes in CMS systems, for instance, Rfo of CMS-Ogu in Brassica (Uyttewaal et al. 2008) and Rfk1 of CMS-Kos in radish (Koizuka et al. 2010); they interrupt the function of CMS genes by modifying transcripts or regulating the expression of CMS genes. However, the two materials used in this study were near-isogenic lines that had been backcrossed for more than eight generations. Hence, a difference in nuclear genes of this kind could be excluded. In addition, seven SNPs were identified around the transcript initiation sites of cox3 in CMS line H276A. We therefore predicted that these SNPs might influence 5′ end formation. Given the lower expression level and the abnormal initiation sites in the CMS line, the seven SNPs might be associated with the novel transcription of cox3.

Conclusion

This study found no novel orf co-transcribed with mitochondrial protein-coding genes in cotton CMS line H276A. However, the differentially expressed gene cox3 was identified by northern blot. In addition, immunoblot analysis revealed that the amount of COX3 protein was substantially decreased in H276A. Sequencing of the cox3 transcript revealed multiple 5′ ends in H276A, whereas the 3′ ends were similar in H276A and H276B. DNA sequence analysis indicated that seven SNPs around the 5′ ends might be responsible for the novel transcripts, might influence the expression of cox3, and might be associated with CMS. Our data will help to explore the mechanism of sterility and provide an important molecular basis for studying mitochondrial gene transcription in cotton.