1 Introduction

Inequalities across gender and racial lines are ubiquitous and persistent. In academia, while recent changes in the composition of student bodies have seen increasing involvement of women and people of color in many countries (Cloete and Mouton 2015, Snyder et al. 2019), at the faculty level these groups remain under-represented, particularly in STEM (Science Technology Engineering and Mathematics) fields and at higher levels of seniority (Herman 2017, Snyder et al. 2019).

Acknowledging the numerous and complex reasons behind the persistence of these imbalances, past research on scholars’ scientific productivity has looked at the productivity differences between male and female scientists. The gender gap in publishing has been well documented. While details depend on the context, discipline, geography or era, female scientists have been found to produce fewer papers per year than their male colleagues (Allison and Stewart 1974, Cole and Zuckerman 1984, Fox and Faver 1985, Holman et al. 2018, Lerchenmueller and Sorenson 2018, Mairesse and Pezzoni 2015, Mairesse et al. 2019, Pezzoni et al. 2016). Given the nature of promotion and advance in an academic career, those productivity differences are among the main constraints on the careers of female and African scholars (Evans and Cokley 2008). So far, however, little research has focused on doctoral student publication productivity: whether gender productivity gaps are present right at the start of a possible academic career, and if so, how these gaps relate to advisor characteristics. Advisors act in fact both as role models and as gatekeepers to the profession for the students they supervise (Evans and Cokley 2008). Past research in education has shown that role models of the same gender and race are more effective for students (Aguinis et al. 2018, Bettinger and Long 2005, Lockwood 2006). Thus, we can expect that the relative scarcity of women and black professors to supervise might affect the productivity of students from those same groups.

In this paper, we study whether Ph.D. students’ publication productivity is affected by the gender pairing of the student with her/his advisor, and further, we explore the intersectionality of gender and race.

Our contribution is unique in four ways. First, we have a unique dataset of South African register data. We cover the complete sample of STEM Ph.D. students in an African country for more than a decade and, importantly, our data also contain information on their advisors. The literature on the science of science has focused on North America and Europe; studies on developing and emerging economies are rare, and tend not to be systematic.

Second, a significant improvement on past research is that our estimates do not suffer from the well-known publication-bias and language-bias of bibliometric data. While past research has generally been based on publication records in Web of Science (WOS) or SCOPUS, we rely on register data of Ph.D. students’ enrollment, and thus we account for students who have not published, or have only published in outlets not listed in WOS or SCOPUS.

Third, we investigate the intersectionality of gender and race, where past research has focused on the two separately. In this respect, the South African case is particularly interesting since black and female scientists are present in the system in similar proportions.

Fourth, we provide new insights for the literature on scientific productivity by complementing the OLS results with a quantile regression, which allows a more nuanced understanding of productivity gaps in science and helps identify where these gaps arise across the productivity distribution.

In studies of the gender gap, it has been common to observe that age plays a role in publishing productivity and that the gap can change with age (Kelchtermans and Veugelers 2011). This observation, combined with the well-known Matthew Effect (Merton 1988), suggests that productivity gaps might originate very early in the career and grow over time. An important open issue, then, is whether we observe publishing productivity gaps early in the career (Conti and Visentin 2015, David 1993), and if so how to understand them. Are under-represented groups disadvantaged by the scarcity of women and black professors (who can act as supervisors and mentors) during their Ph.D. training? We can get at this issue by examining publication of scientists during the course of their doctorates, looking at productivity of students while controlling for the gender or racial composition of student-advisor couples.

While the gender gap is defined in terms of single scientists, it must be acknowledged that much publishing involves more than the focal author (Chuang and Ho 2014, Larivière 2012, Wager et al. 2015). Not only co-authors, but research assistants, co-workers, technicians, conference participants, and many others contribute with work, ideas, and suggestions. Of course, when we are considering Ph.D. students as (co-)authors, the thesis advisor is very likely to provide important inputs.

Often the thesis advisor is the first person with whom a student co-authors, but additionally, supervisors play a key role in introducing students into the profession. It seems very likely that the properties of the supervisor matter for a student’s early success (Li et al. 2019). A priori, there are some obvious traits of the supervisor that will matter: extent of supervision, publishing record, status in the profession, quality, and so on. But other literature suggests gender (race) might also matter. For example, subtle gender and racial biases can distort the meritocratic evaluation of students. An experiment with a sample of 127 biology, chemistry, and physics professors in the USA asked them to evaluate the CVs of students for a laboratory manager position, where gender was randomly assigned to CVs. It found that both male and female faculty judged female students as less competent. Consequently, a female student would be less likely to be hired than an identical male student, and would be offered a smaller salary and less mentoring (Moss-Racusin et al. 2016). Such biases can also reduce a student’s access to relevant information. A similar randomization experiment found that black students were less likely than white students to receive warning information from academic advisors when race was randomly assigned to student academic records (Crosby and Monin 2007). Along the same lines, past research has found that supervisors provide more psychological support to protégés of the same gender (Aguinis et al. 2018, Koberg et al. 1998).

Gender (or racial) bias can also play a role through the student side of the relationship (Rossello and Cowan 2019). In education and learning the gender of the advisor can affect performance and beliefs (Breda et al. 2018, Gaule and Piacentini 2018, Rossello and Cowan 2019). For example, female students are often more inspired by female than by male role models (Aguinis et al. 2018, Bettinger and Long 2005, Lockwood 2006). A recent French experiment among senior high school students found a reduction of stereotypes associated with jobs in science after students were exposed to a female scientist (Breda et al. 2018). In the same study, enrolment in a selective science program increased by 30% among the higher achieving students; in particular, the share of female (male) students in STEM programs was 38% (28%) higher than in classes that did not receive the intervention. Thus, we might expect to see female students performing better with female advisors.

Exploring the relationship between gender (race) and productivity in the student-supervisor pair is a step towards understanding productivity differences among different groups within academia. Past research, generally focusing on STEM, is available only for the US, covering either a single first-tier institution (Pezzoni et al. 2016) or a single field (Gaule and Piacentini 2018).

Pezzoni et al. (2016) studied all fields in STEM with data based on 933 Ph.D. graduates and 204 advisors at the California Institute of Technology (Caltech), an elite institution in the US, between 2004 and 2009. In terms of student publication productivity, they found no difference between the female-female and the male-male couples, but they did find differences in cross-gender couples. In particular, they found that, compared to the male-male student-advisor couples, female students working with male advisors published 8.5% less, and male students working with female advisors published 10% more.

In contrast, Gaule and Piacentini (2018) explored a single field, looking at 20,000 Ph.D. graduates between 1999 and 2008 in US chemistry departments. They found that same-gender couples tended to be more productive during the Ph.D., and that female students working with female advisors were more likely to become faculty members compared with female students working with male advisors.

In contrast to past research, our data are drawn from an emerging economy, namely South Africa, where resource constraints in the science system in general, and universities in particular, are much more severe than they are in developed countries.

We observe that the academic science system in South Africa is relatively small: in 2020 there were only 2318 full professors, and 3453 Ph.D.s granted.Footnote 1 And the production of Ph.D.s is concentrated in a relatively small number of institutions (Cowan and Rossello 2018).Footnote 2 These features are typical of many developing countries (González-Sauri and Rossello 2022, Nchinda 2002). While the concern with gender inequality in science is very important today in most countries, in South Africa, in view of the history of apartheid and its on-going legacy, race inequalities in science are also a crucial matter. Given that academic appointments are heavily based on performance during graduate studies, understanding gender and race effects on Ph.D. students’ publication productivity is of great importance.

Our analysis involves the entire South African academic science system (though restricting attention to STEM disciplines), and can be considered as representative of national trends and effects. Our statistical analysis is different from previous studies in three important respects. First, we consider early career productivity of students not in isolation but controlling for advisor characteristics. Second, our original data allows us to examine and decompose the intersection of gender and race, running regressions for separate sub-samples of the data: the sub-samples of (1) white-white; (2) black-black; and (3) black-white student-advisor couples. Third, we control for students’ “ability” implementing a quantile regression analysis, in order to explicitly test whether gender differences vary depending on the “productivity-profile” (high, medium, or low) of the student.

In particular, this paper is the first that applies a quantile regression approach to study the gender publication productivity gap in science. Past research has either measured average differences between male and female scientists (Pezzoni et al. 2016) or has tried to address potential bias using a two stage estimation, correcting for promotion bias or a scientist’s non-publishing “idle periods” related, for example, to motherhood (Mairesse and Pezzoni 2015, Rivera León et al. 2017). Our contribution using a quantile regression approach helps quantify and identify where the gender productivity gap is larger and this might help the identification of targets aimed at closing the gap.

Our main findings are the following. First, there is a gender-based publishing productivity gap in South Africa. Second, on average the productivity gap is smaller (or absent) in inter-racial student-advisor couples. In same-race student-supervisor pairs there is no (statistically significant) productivity gap for female students working with female advisors. For female students working with male advisors, however, a gap exists. This suggests that the gap, as measured on the entire population, is driven largely by female students working with male advisors in white-white and black-black pairs. Finally, using quantile regressions to consider the productivity distribution underlying the average differences, we observe that gender publication productivity differences are U-shaped over student productivity. In particular, female students with a high (or low) productivity profile are as productive as male students (with similar productivity) working with male advisors.

Our work aims at quantifying and identifying the origins of some of the gender and/or race differences in the publication productivity of early career scientists. In spite of its specificity and limits, our study lends itself to some important science policy recommendations. Because of cumulative advantage in science, as well as the well-known Matthew effect of citations, we know that early career productivity differences tend to grow over time.

Our results underline the importance of supervision, suggesting that female role models and inter-racial couples might mitigate early career publication differences between female and male students in STEM. These findings are of particular importance in the South African context, where many students do not complete their university studies. Our sample comprises an “elite” part of the South African university system, representing students who complete their education and become independent scientists. Hopefully, addressing productivity gender differences at the Ph.D. level may increase the (success and) proportion of female professors in the population and create an inclusive and productive working environment for all. It may, in turn, create positive feedbacks for students of lower levels.

2 Material

2.1 The context

The context of our analysis is the South African university system. South Africa has a population of roughly 60 million people, of whom about 80% are black Africans. It is a country in which social, political, and economic conditions have been changing rapidly since 1994. South African society, including the university system, was segregated across racial lines by apartheid until 1994. Black universities were systematically underfunded, specialized in teaching and technical education, and not considered producers of new knowledge (Herman 2017). In 1994 the university system began a process of integration, aiming to make both the student body and the faculty reflect the ethnic composition of the overall population. Progress has definitely been made, but the goal has not yet been achieved. It remains the case, though, that today South Africa is an important hub for African science and indeed is at the world frontier in many disciplines (aspects of biology, medicine, astronomy, and agricultural science to name a few). However, the system is relatively small, comprising 24 public universities and around 2300 full-time permanent full professors in 2020.Footnote 3 Tuition costs can limit access to higher education for much of the population: the average annual tuition fee is high, between 30000 and 64000 ZARFootnote 4 (approximately 1500-3200 Euro), whereas GDP per capita in 2020 is estimated by the IMF at 6611 Euro.Footnote 5 Improvements notwithstanding, imbalances and racial inequalities are still present in the South African university system. This provoked two large student protests in 2015, Rhodes Must Fall and Fees Must Fall, demanding racial transformation and equal access and opportunities.Footnote 6Footnote 7 Rhodes Must Fall, in particular, spread to universities outside South Africa, notably in the UK and the US.Footnote 8

The international echo made many think of Rhodes Must Fall as a precursor of the social movement Black Lives Matter.Footnote 9 The South African case of discrimination at universities is important not only for South Africa but for the world at large.

2.2 Data

Our data originate from the National Research Foundation (NRF) database of South African Academia.Footnote 10 The NRF has a system in which academics at South African universities apply to be “rated”. This rating has (until recently) carried financial and prestige incentives, so most academics in South Africa who pursue a research career do apply. Overall, rated scholars comprise about 30% of South African scholars and produce roughly 90% of the country’s peer-reviewed output. The STEM fields have been part of the system longer than have SSH (social sciences and humanities) fields, and in the STEM fields coverage appears to be almost complete. Consequently, we restrict attention to this group, where the agency has a primary role in funding research of individuals (including early in the career) as well as of universities. We create a unique dataset using information supplied by researchers in the rating application process. The raw data include three datasets: the first contains information about STEM Ph.D. students and their advisors from 2000 to 2014, the second has all individual employment information, and the third collects all publication data supplied to the NRF from 1961 to 2014.

From student and supervisor characteristics (i.e., name, surname, graduation year, scientific field, and university) we match both students and advisors to their employment records and then match each individual to his or her NRF publication record. This method avoids the problems associated with methods dependent on name disambiguation. Additionally, to be confident that our publication data are complete, we include in the analysis only Ph.D. students in STEM who became active scholars in the NRF system. To construct the panel data we follow each student from the enrollment year up to 2 years after graduation, creating publication records for each student and advisor over time. In this way, we obtain a total of 6049 observations representing 924 Ph.D.s and 549 thesis supervisors. Our final sample represents Ph.D. students within the enrollment period 2000–2012 and with a graduation date up to 2014. In our sample, Ph.D. students graduate on average 3.8 years after enrollment.
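The student-year panel described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to build the dataset: the field names (`student_id`, `enroll_year`, `grad_year`) are hypothetical, the two example students are invented, and the sketch assumes one observation per calendar year from enrollment through two years after graduation, both ends inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Student:
    student_id: str   # hypothetical identifier
    enroll_year: int  # year of Ph.D. enrollment
    grad_year: int    # year of Ph.D. graduation


def panel_years(s: Student, follow_up: int = 2) -> List[int]:
    """Years observed for one student: enrollment through
    graduation + follow_up, inclusive (an assumption of this sketch)."""
    return list(range(s.enroll_year, s.grad_year + follow_up + 1))


def build_panel(students: List[Student]) -> List[Tuple[str, int]]:
    """One (student_id, year) row per student and observed year."""
    return [(s.student_id, y) for s in students for y in panel_years(s)]


# Two invented students, for illustration only.
students = [Student("s1", 2005, 2009), Student("s2", 2010, 2012)]
panel = build_panel(students)
```

Each row of `panel` would then be joined to the publication records of the student and the advisor for that year.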

Students in our sample are 58% (249 white and 282 black) male and 42% female (259 white and 134 black). Professors in our final sample are 73% male (298 white and 104 black) and 27% female (130 white and 17 black).Footnote 11 Table 1 shows the population composition in terms of student and advisor pairs. The majority of students are supervised by white male advisors (54%) followed by white female (23%), black male (19%), and black female advisors (3%).

Table 1 Students and advisors, by race and gender. A professor can supervise more than one student

2.3 Variable description

Looking at 3-year moving windows between enrollment and graduation plus 2 years, the annual average number of publications per student is close to one. The most productive are white male students, who produce on average 1.58 publications, followed by black males with 1.37 publications and white females with 1.21 publications; black females are the least productive, with an average of 0.75 publications. However, not surprisingly, publication data are skewed, and the median values are close to zero for all groups. Indeed, 44% (410 Ph.D.s) of the students do not publish at all between the enrollment year and 2 years after graduation. Our approach of using register data is particularly relevant here. In contrast to most previous studies, which look at publication records in Web of Science (WOS) or SCOPUS, we are able to include scholars who have never published.

Our outcome variable is student productivity in terms of publications. We define student productivity as \(\log (1+pub_{t})\), where \(pub_{t}\) is the number of student publications between year t and t + 2 inclusive divided by 3. Raw differences in student productivity between different populations and student-supervisor pairs are presented in Fig. 1. In panel A, looking at student (by columns) and advisor (by rows) types, we observe a large heterogeneity in productivity across supervision couples, suggesting a complex joint effect of gender and race. But overall, the figure shows that same-type supervision (2nd diagonal) correlates with higher average productivity (lighter colors).Footnote 12
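As a worked illustration of this definition, the sketch below computes the productivity measure from yearly publication counts. The function name and the dictionary input format are hypothetical, and the natural logarithm is assumed, since the text does not state the base.

```python
import math
from typing import Dict


def productivity(pubs_by_year: Dict[int, int], t: int) -> float:
    """log(1 + pub_t), where pub_t is the number of publications in
    years t, t+1 and t+2 divided by 3. Natural log is assumed."""
    window_total = sum(pubs_by_year.get(year, 0) for year in (t, t + 1, t + 2))
    return math.log(1 + window_total / 3)


# Invented record: 1 paper in 2005, none in 2006, 2 in 2007.
example = productivity({2005: 1, 2006: 0, 2007: 2}, 2005)  # log(1 + 3/3)
no_pubs = productivity({}, 2005)                            # log(1 + 0)
```

A student with no publications in the window has productivity zero, so the transformation keeps non-publishing students in the sample, which matters given how many students do not publish.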

Fig. 1

Heat-map of doctoral average annual productivity for student and advisor gender (racial) combinations. The color intensity of each entry represents the average annual productivity of each group. Darker (lighter) colors represent lower (higher) productivity values. Productivity is \(\log (1+pub_{t})\), where \(pub_{t}\) is the number of student publications between year t and t + 2 inclusive, divided by 3. Rows in sub-figure A are advisors' gender-race type while columns are students' gender-race type. In sub-figure B rows are student-advisor gender couples and columns are fields

Panel B of Fig. 1 shows average productivity across student-advisor gender differentiated by discipline. In 5 out of 13 fields the couple female student with female advisor has the highest average productivity.Footnote 13 In 2 fields, Mathematics and Medical: clinical, cross gender ties are those with the highest averages. In the remaining 6 fields the couple male student with male advisors has the highest average productivity. Finally, female students with male advisors have the lowest average productivity for 6 out of 13 fields.

3 Methods

The raw data indicate that female and black Ph.D. students publish less than male and white students. But these differences could be driven by many things. In the analysis that follows, we control for several factors that are likely to contribute to a scientist’s publication productivity, in order to isolate the effects of gender and race.

We ask whether average doctoral productivity differs with respect to the genders of the student-advisor pair. We estimate panel OLS regressions with robust clustered standard errors as follows.Footnote 14

$${Y}_{it}={\alpha }+{\gamma }{Z}_{i}+{\beta }{X}_{it-1}+{\delta }{W}_{it}+{\epsilon }_{it}$$
(1)

As described in the previous sections, our outcome variable (\(Y_{it}\)) is the student’s productivity, \(\log (1+pub_{t})\), where \(pub_{t}\) is the number of publications of which the student was a (co-)author between years t and t + 2 inclusive, taken as an annual average. \(Z_{i}\) are our main variables of interest, representing the student-advisor gender couples. In particular, we use dummies representing the different student-advisor couple combinations as follows. StudFemale_AdvFemale is equal to one if both student and advisor are female, StudFemale_AdvMale equals one when the student is female and the advisor is male, and StudMale_AdvFemale is one when the student is male and his advisor is female. In all regressions, our baseline category is StudMale_AdvMale. \(X_{it-1}\) and \(W_{it}\) are the student and advisor controls. We include as additional controls field and student enrollment year dummies. Lastly, \(\epsilon_{it}\) is the error term. In particular, as a student control we include Stud. Published Previously, a dummy variable equal to one when the student had published previously and zero otherwise.Footnote 15 To account for publication quality, we add as a control the Average Scimago Journal Ranking of a student’s previous publications.Footnote 16 Scimago Journal Ranking is a metric of journal quality based on average prestige per article. The metric is based on the idea that not all citations are equal, and it measures the scientific influence of journals. Our variable is equal to the average Scimago Journal Ranking of all the student’s previous publications, where the ranking of the journal is that of the year of the article’s publication.Footnote 17Footnote 18
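The dummy coding of the student-advisor gender couples can be sketched as follows. The dummy names are those used in the text; the helper functions and the string encoding of gender (`"female"`, `"male"`) are hypothetical choices made for this illustration.

```python
from typing import Dict

# Dummy names as used in the text; the male-male pair is the omitted baseline.
PAIR_DUMMIES = ("StudFemale_AdvFemale", "StudFemale_AdvMale", "StudMale_AdvFemale")
BASELINE = "StudMale_AdvMale"

_GENDER = {"female": "Female", "male": "Male"}  # hypothetical input encoding


def pair_label(student_gender: str, advisor_gender: str) -> str:
    """Name of the couple category for a student-advisor pair."""
    return f"Stud{_GENDER[student_gender]}_Adv{_GENDER[advisor_gender]}"


def pair_dummies(student_gender: str, advisor_gender: str) -> Dict[str, int]:
    """Three 0/1 dummies; all zeros identifies the baseline pair."""
    label = pair_label(student_gender, advisor_gender)
    return {name: int(name == label) for name in PAIR_DUMMIES}


female_male = pair_dummies("female", "male")
baseline = pair_dummies("male", "male")
```

Because the baseline pair is coded as all zeros, each estimated coefficient on a dummy is read as the difference between that couple type and male students with male advisors, holding the controls fixed.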

To control for advisor characteristics we use Adv. Log Average Previous Productivity: the logarithm of the advisor’s past cumulative average productivity.Footnote 19 Additionally, the variable Adv. NRF Ratings accounts for the advisor’s “quality” using her/his NRF Rating. The NRF Rating is the result of a peer-review assessment of researcher quality carried out by the South African National Research Foundation (NRF). The rating has 5 broad categories: A - Leading international researchers; B - Internationally acclaimed researchers; C - Established researchers; P - Prestigious Awards; and Y - Promising young researchers. The first three also have finer-grained sub-categories. The grade is assigned based on reports written by a group (usually 5 or 6) of international experts in the scholar’s field who have read the publications submitted by the candidate.Footnote 20 We further control for team size using More Adv., a dummy variable equal to one if the student has more than one advisor and zero otherwise.Footnote 21 Finally, we control for the number of students supervised by the advisor (Number of Adv. Stud) and for the time to the student’s graduation (Time to Graduation).

We run the regressions on the overall sample (ALL) and on same and cross racial sub-samples. This permits us to decompose the intersection between gender and race, and to account for the institutional segregation inherited from apartheid.Footnote 22 Specifically, we repeat the analysis on the sub-sample of white students working with white advisors (WW), black students working with black advisors (BB), and on the cross-race sample of black students working with white advisors (BW).Footnote 23
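The sample split can be illustrated with a small helper that lists the estimation samples a given student-advisor pair belongs to. The labels ALL, WW, BB, and BW come from the text; the function name and the string encoding of race are hypothetical, and the sketch assumes that pairs outside the three named sub-samples enter the overall sample only.

```python
from typing import List

# (student race, advisor race) -> sub-sample label used in the text
SUBSAMPLES = {
    ("white", "white"): "WW",
    ("black", "black"): "BB",
    ("black", "white"): "BW",
}


def samples_for(student_race: str, advisor_race: str) -> List[str]:
    """Estimation samples in which a student-advisor pair appears.
    Assumption: a pair not listed in SUBSAMPLES is used in ALL only."""
    labels = ["ALL"]
    code = SUBSAMPLES.get((student_race, advisor_race))
    if code is not None:
        labels.append(code)
    return labels


cross_race = samples_for("black", "white")
same_race = samples_for("white", "white")
```

The same regression specification is then estimated once per label, on the rows that carry it.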

One could imagine that supervisors spend different amounts of time and effort on supervision, depending on their perception of a student’s ability. Bearing in mind that this effect might intersect with gender aspects, we use a quantile regression to examine the effects of the student-supervisor pair, where the quantiles are defined over students’ productivity. This method has two advantages. First, it acts as a control for student ability, since students are grouped and compared across productivity percentiles. Second, it allows us to observe which student productivity profiles drive the average difference in publication productivity between male and female students. The quantile regression formulation is:

$${Q}_{\tau }({Y}_{it}| {Z}_{i},{X}_{it-1}, {W}_{it})={\alpha }_{\tau }+{\gamma }_{\tau }{Z}_{i}+{\beta }_{\tau }{X}_{it-1}+{\delta }_{\tau }{W}_{it}+{\epsilon }_{it}$$
(2)

We use the same specification as before, where \(Q_{\tau}(Y_{it}| Z_{i}, X_{it-1}, W_{it})\) is the τth quantile regression function; \(Z_{i}\) are the dummies representing the different student-advisor gender couples (StudFemale_AdvFemale, StudFemale_AdvMale, StudMale_AdvFemale), with male students with male advisors as the baseline category; \(X_{it-1}\) and \(W_{it}\) are the controls described before; and \(\epsilon_{it}\) is the error term.Footnote 24 We run 40 quantile regressions, each covering 2.5 percent of the student productivity distribution. To show these results, we plot the estimated coefficients of the dummies of interest (StudFemale_AdvFemale, StudFemale_AdvMale, StudMale_AdvFemale) with their 95% confidence intervals; each coefficient compares the corresponding couple with the baseline (StudMale_AdvMale).
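To make the idea behind quantile regression concrete, the sketch below uses the standard check (pinball) loss, whose minimizer over a constant is the τth sample quantile. It is a toy, intercept-only illustration with invented data, not the estimator used for the results reported here, which also includes regressors and clustered standard errors. All function names are hypothetical.

```python
from typing import List


def check_loss(u: float, tau: float) -> float:
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q: float, ys: List[float], tau: float) -> float:
    """Sum of check losses of the residuals y - q."""
    return sum(check_loss(y - q, tau) for y in ys)


def best_constant(ys: List[float], tau: float) -> float:
    """Intercept-only quantile 'regression': the sample value that
    minimizes the total check loss, i.e. a tau-th sample quantile."""
    return min(sorted(set(ys)), key=lambda q: total_loss(q, ys, tau))


# Invented, skewed productivity values with a mass at zero.
ys = [0.0, 0.0, 0.0, 0.3, 0.5, 0.7, 1.1, 1.4, 2.0]
lower, median, upper = (best_constant(ys, tau) for tau in (0.25, 0.5, 0.9))
```

With a large share of zeros, the low quantiles sit at zero while the upper quantiles capture the productive tail, which is why coefficients can differ across τ even when the mean gap is fixed.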

4 Results

In this section, first, we present the results of the OLS estimation done for the whole sample and in separate racial sub-samples to isolate gender and race effects. And second, in the following sub-section, we present the results of the quantile regressions.

4.1 OLS regression results

Table 2 explores average productivity differences between student-advisor gender pairs compared to the baseline StudMale_AdvMale. It gives the results of OLS estimations of the model for the whole sample (ALL, column 1) and for same- (WW column 2, BB column 3) and cross- racial sub-samples (BW column 4). Overall, column 1 shows that female students working with male advisors (StudFemale_AdvMale) have the largest gap compared with the male-male couple: they produce on average 14% fewer papers; while it is 13% fewer for female students working with female advisors (StudFemale_AdvFemale). Male students working with female advisors (StudMale_AdvFemale) do not differ in productivity compared to the baseline male–male pair (StudMale_AdvMale).

Table 2 Pooled OLS Panel Regression with robust clustered standard errors. The dependent variable is log of 1+ average productivity (number of publications between t and t + 2 inclusive divided by 3). Column (1) is the estimation on the whole sample; column (2) is the sub-sample White Students with White Professors; column (3) is the sub-sample Black Students with Black Professors; column (4) is the sub-sample Black Students with White Professors

However, disaggregating to take into account the racial composition of the pairs suggests a slightly different story. Here, in the same-race supervisions (WW and BB) we again observe a significant gap for female students with male advisors (StudFemale_AdvMale) (columns 2 and 3). However, in these racial groupings, we do not see a significant gap for female students with female advisors (StudFemale_AdvFemale) (again columns 2 and 3). In particular, when student and advisor are both white (column 2), female students working with male advisors (StudFemale_AdvMale) produce 14% fewer papers than male students working with male advisors. Similarly, among black-black supervision pairs (column 3), female students working with male advisors (StudFemale_AdvMale) produce on average 24% fewer papers than do male students working with male advisors. By contrast, looking at the cross-racial sub-sample (column 4), the group of black students (of either gender) working with white advisors (of either gender) displays no significant productivity gaps with male-male couples for any gender combination. These figures suggest that there is a strong interaction between race and gender in determining student productivity.Footnote 25

This is particularly relevant in the South African context where the formerly white institutions are becoming more racially inclusive and the participation of black students is increasing. Indeed, while the sub-samples of same-race couples (models 2 and 3) account for potential segregation effects within departments and fields, the sub-sample of black students working with white advisors (column 4) may describe the “intermediate future” of the South African university system envisioning black graduates entering as Ph.D. students the formerly white departments, which will still be largely populated by white faculty.

Besides our main regressors, the controls commonly used in the literature on scholars’ publication productivity give results consistent with those of others. The best predictor of future productivity is past productivity: students who had published in previous years (Stud. Published Previously) produce between 58% and 82% more papers than those who had not. Publication quality is also relevant in aggregate (though not significantly so for same-race pairs): a one-unit increase in Average Scimago Journal Ranking increases student productivity by between 5% and 7%. Team size also plays a role: student productivity increases by 6% for each additional student supervised by her/his advisor (Number of Adv. Stud). These results highlight the role of past productivity in determining the future productivity of scholars. This observation, coupled with gender imbalances in early productivity, might create a lock-in effect that systematically disadvantages female scholars. We return to this below.

In the next section, we look at whether the relation between productivity and student-advisor gender couples changes depending on the student’s relative productivity level.

4.2 Quantile regression results

To go beyond average differences, to control for students’ “ability”, and to accommodate the skewness and fat tails of the dependent variable, in this subsection we examine the quantile regression results. In this way we are able to look at the origins of gender productivity differences, and ask whether the discrepancy between groups is stronger or weaker for different sections of the population representing students with different “abilities”, where student “ability” is captured using productivity percentiles.

Figures 2, 3, and 4 plot the coefficients and 95% confidence intervals estimated respectively for the StudFemale_AdvFemale, StudFemale_AdvMale and StudMale_AdvFemale gender couples compared to the baseline (StudMale_AdvMale). The figures collect results from 40 quantile regressions, estimated at intervals of 2.5 percentiles of the student productivity distribution. The 40 quantile regressions use the same controls as before: More Adv., Adv. Log Average Previous Productivity, Adv. NRF Ratings, Student Published Previously, Average Scimago Journal Ranking, Time to Graduation, Number of Adv. Student, Field, Enrolment Year, and Year dummies. We report the corresponding regression table for selected percentiles in Table A2 in the appendix.
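To make the idea of a percentile-by-percentile comparison concrete, the sketch below uses a deliberately simplified setting. With a single group dummy and no other controls, the quantile regression coefficient on the dummy corresponds (up to the interpolation convention) to the difference between the two groups' quantiles. The regressions described above additionally include controls and clustered standard errors, which this sketch omits. The data are synthetic, the grid endpoints are an assumption, and all names are hypothetical.

```python
import random


def quantile(values, q):
    """Empirical quantile (0 <= q <= 1) by linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def quantile_gaps(group, baseline, grid):
    """Difference between group and baseline quantiles at each point of the grid."""
    return [(q, quantile(group, q) - quantile(baseline, q)) for q in grid]


# Assumed grid: every 2.5 percentiles from 2.5 to 97.5.
GRID = [k * 0.025 for k in range(1, 40)]

# Synthetic, skewed "productivity" values; not data from the study.
random.seed(0)
baseline = [random.expovariate(1.0) for _ in range(500)]
group = [0.9 * x for x in baseline]

for q, gap in quantile_gaps(group, baseline, GRID)[::8]:
    print(f"percentile {100 * q:5.1f}: gap = {gap:+.3f}")
```

Plotting the gap against the percentile gives the kind of curve shown in the figures, where a gap that varies with the percentile signals that an average comparison hides differences across the distribution.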

Fig. 2

Quantile regression coefficients and 95% confidence intervals of StudFemale_AdvFemale compared to the baseline StudMale_AdvMale. The dependent variable is student productivity: log(1 + pub_t), where pub_t is the number of student publications between year t and t + 2 inclusive, divided by 3. Quantile regressions are estimated at intervals of 2.5 percentiles using robust clustered standard errors according to Machado et al. (2011). The solid black line is zero; the dashed red line is the (non-quantile) panel OLS estimate of Model 1 from Table 2. Additional controls are as in Model 1 of Table 2: More Adv., Adv. Log Average Previous Productivity, Adv. NRF Ratings, Student Published Previously, Average Scimago Journal Ranking, Time to Graduation, Number of Adv. Student, Field, Enrolment Year, and Year dummies. A selection of the corresponding regressions is in Supplementary Table A2.
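The dependent variable defined in the caption can be computed as in the following minimal sketch. It assumes that "log" denotes the natural logarithm and that years without a recorded publication count as zero. The function name and the example counts are hypothetical.

```python
import math


def student_productivity(pubs_by_year: dict, t: int) -> float:
    """log(1 + pub_t), where pub_t averages the publication counts of years t, t+1 and t+2."""
    pub_t = sum(pubs_by_year.get(year, 0) for year in range(t, t + 3)) / 3
    return math.log(1 + pub_t)


# Hypothetical publication counts per calendar year for one student.
pubs = {2010: 1, 2011: 0, 2012: 2, 2013: 4}
print(student_productivity(pubs, 2010))  # pub_t = (1 + 0 + 2) / 3 = 1.0
print(student_productivity(pubs, 2011))  # pub_t = (0 + 2 + 4) / 3 = 2.0
```

Adding 1 inside the logarithm keeps the measure defined for students with no publications in the window, for whom it equals zero.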

In the figures, the black line at zero represents the baseline comparison with StudMale_AdvMale; when the confidence interval includes zero, the estimated coefficient of the gender couple is not statistically different from that of the baseline. Otherwise, the productivity gap with the baseline is different from zero at the 95% confidence level. Additionally, the dashed red line indicates the previous OLS estimate (model 1 of Table 2, that is, not disaggregated by racial combination).
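The reading rule for the figures can be expressed as a small check. The sketch assumes a normal approximation, so that the 95% interval is the estimate plus or minus 1.96 standard errors; the intervals in the figures come from clustered standard errors and may be constructed differently. The coefficient values below are hypothetical, not estimates from the study.

```python
def confidence_interval(coef: float, std_err: float, z: float = 1.96):
    """Two-sided interval around an estimate (normal approximation, assumed here)."""
    return coef - z * std_err, coef + z * std_err


def differs_from_zero(coef: float, std_err: float, z: float = 1.96) -> bool:
    """True when the interval excludes zero, i.e. the gap with the baseline is significant."""
    lower, upper = confidence_interval(coef, std_err, z)
    return lower > 0 or upper < 0


# Hypothetical (coefficient, standard error) pairs at two percentiles.
for label, coef, se in [("50th", -0.02, 0.03), ("85th", -0.10, 0.03)]:
    print(label, confidence_interval(coef, se), differs_from_zero(coef, se))
```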

Figure 2 shows that the productivity differences of female students working with a female advisor (StudFemale_AdvFemale) are U-shaped over student productivity. The gap is largest among students around the 85th productivity percentile. Indeed, female students with low (<70th percentile) or high (>90th percentile) productivity profiles working with female advisors are as productive as male students with similar profiles working with male advisors.

Similarly, Fig. 3 shows the estimated difference in productivity across percentiles between female students working with male advisors (StudFemale_AdvMale) and male-male student-advisor couples (StudMale_AdvMale). Again we observe a U-shaped tendency, but less pronounced. Between the 70th and 90th percentiles, female students working with male advisors have lower productivity than male students working with male advisors. A striking observation from these two figures is how differently the two gaps respond to the quantile. The gap observed for the StudFemale_AdvFemale group is very clearly U-shaped, with the gap entirely disappearing at the higher productivity levels. By contrast, the gap for the StudFemale_AdvMale group increases from the 60th percentile and then plateaus, with only a small recovery at the top percentile. For this group the gap is more persistent over productivity levels.

Fig. 3

Quantile regression coefficients and 95% confidence intervals of StudFemale_AdvMale compared to the baseline StudMale_AdvMale. The dependent variable is student productivity: log(1 + pub_t), where pub_t is the number of student publications between year t and t + 2 inclusive, divided by 3. Quantile regressions are estimated at intervals of 2.5 percentiles using robust clustered standard errors according to Machado et al. (2011). The solid black line is zero; the dashed red line is the (non-quantile) panel OLS estimate of Model 1 from Table 2. Additional controls are as in Model 1 of Table 2: More Adv., Adv. Log Average Previous Productivity, Adv. NRF Ratings, Student Published Previously, Average Scimago Journal Ranking, Time to Graduation, Number of Adv. Student, Field, Enrolment Year, and Year dummies. A selection of the corresponding regressions is in Supplementary Table A2.

Finally, Fig. 4 shows the quantile results for the difference in productivity between male students working with a female advisor (StudMale_AdvFemale) and those working with a male advisor (StudMale_AdvMale). In this case, as in the OLS results, we find no significant difference in productivity across student-advisor gender couples.

Fig. 4

Quantile regression coefficients and 95% confidence intervals of StudMale_AdvFemale compared to the baseline StudMale_AdvMale. The dependent variable is student productivity: log(1 + pub_t), where pub_t is the number of student publications between year t and t + 2 inclusive, divided by 3. Quantile regressions are estimated at intervals of 2.5 percentiles using robust clustered standard errors according to Machado et al. (2011). The solid black line is zero; the dashed red line is the (non-quantile) panel OLS estimate of Model 1 from Table 2. Additional controls are as in Model 1 of Table 2: More Adv., Adv. Log Average Previous Productivity, Adv. NRF Ratings, Student Published Previously, Average Scimago Journal Ranking, Time to Graduation, Number of Adv. Student, Field, Enrolment Year, and Year dummies. A selection of the corresponding regressions is in Supplementary Table A2.

In aggregate the gender productivity gap shows a U-shape over productivity levels. It is weak for the least and the most productive students, but strong for students between the 70th and 90th percentile. But the “recovery” as productivity increases is not uniform across sub-groups. In fact, the recovery is driven almost exclusively by female students with female advisors. Among female students with male advisors, the gap appears around the 60th percentile and increases rapidly to the 80th percentile, as it does for those with female advisors. But for those with male advisors the gap does not close, but stays high for the upper tail of the distribution.

5 Conclusion

In STEM subjects in South Africa, the gender composition of the student-advisor pair matters for students’ early career productivity. In general, there is a productivity gap between male and female Ph.D. students. Disaggregating on the basis of the gender composition of the student-supervisor pair, we find that there are productivity gaps both for female-female and female-male student-supervisor pairs, with a slightly higher gap for female-male pairs. However, when we disaggregate further, considering also the racial composition of the pair, we find that female students working with female advisors display no productivity gap with male-male couples when student and supervisor are of the same racial group. We also find no gender gaps at all for cross-racial couples. Where we do observe significant gender gaps in the same-race groups is in the StudFemale_AdvMale student-supervisor pairs. Supervision teams that are either cross-race or same-gender seem to have the same productivity (see footnote 26). This result points to the importance of the level of (dis-)aggregation at which the analysis is pursued. Focusing solely on the effects of gender can generate misleading results, as the racial composition of the student-supervisor pair is important in determining gender effects. Gender effects that we seem to observe in the aggregate are not present in identifiable sub-groups.

Further, investigating the gender productivity differences across students’ “ability”, we find that the gap in productivity between male and female students is largest between the 85th and 90th percentile of the productivity distribution. In general, we find that the gap is U-shaped over student productivity, suggesting that the average productivity gap is driven by the moderately-productive students. But again this is sensitive to which sub-population we are observing.

Our results suggest that female students have smaller productivity gaps with male-male couples when coupled with female advisors: female students perform slightly better when coupled with female advisors. It would follow that the scarcity of female advisors (which is not unique to South Africa) might create a negative lock-in effect for female students. This is, potentially, the start of a vicious circle at the system level: because there are few female advisors, female students tend to be less productive. The Matthew effect drives this into the future, implying fewer females are promoted, perpetuating the situation in which female students cannot find female supervisors.

The reasons underlying the smaller gaps for females working with female advisors are numerous, but might be linked to a positive role model effect, or may suggest that female students with male advisors may receive lower quality (or quantity) supervision than do male students. We should remark, however, that our estimation of student productivity is in terms of statistical significance, not necessarily in economic or educational learning terms.

We do not attempt to give the ultimate explanation for the differences we observe, but our results open several important lines of future research.

In general terms, an additional potential avenue for future research would be to conduct similar studies in other countries. Research on the science of science in emerging economies is thin, which is unfortunate since improving the performance of science and higher education in these countries is often considered an important part of the catching-up agenda. It seems likely that the general patterns will be similar, but that they will differ in details. Middle-income countries in general face common challenges: scientific activity is often concentrated in specific topics, budget constraints are often more severe, and funding is more volatile and uneven. But the details beneath these general issues differ from country to country, and are thus likely to impinge on issues of (young researcher) productivity in different ways. It remains the case, though, that productivity gaps in emerging economies are potentially more problematic and extreme compared to those in more mature systems.

In more specific terms, our analysis looked at productivity differences in the short run. It would be interesting to test whether there are any long-run effects. The Matthew effect suggests that in general there are, but the way this is manifest may vary across gender or race. A female scientist might reach her peak of productivity not in mid-career but later, and her productivity distribution might not be unimodal. Observing Ph.D. cohorts over a long period of time, and relating future success to initial success, perhaps moderated by supervisor properties, is an important direction for future research. While we examined the relation between the Ph.D. student and her/his main advisor, we must consider that they are located in a larger environment. In some disciplines (many natural sciences or medicine, for example) research is very much team-based, with different lab members contributing different parts of the project. But even where that is not the case, spillovers among colleagues can affect research success or productivity. In addition, more “social” forces may be at play: for example, political power within a department, local and international networks of collaborators, and the likelihood of experiencing discrimination, to name a few. How this local environment affects a student’s productivity is essentially unexplored in the literature. Thus, an open, but potentially very important and influential question is how the relation between a student and her/his advisor is affected by the larger environment in which they are embedded.

The observed differences in early career productivity could be due to student-supervisor personal relations; access to resources; differences in career paths; or differences in the nature of the research output in terms of content (for example between basic and applied research, which can translate into differential ‘publishability’). The personal relations hypothesis is compatible with the results in Rossello and Cowan (2019), who find a same-gender (same-race) bias in supervision-tie formation. Bias in tie formation relates to group behavior and socialization in the working environment, which may disadvantage female students working with males (Blackburn et al. 1981, Van den Brink and Benschop 2014, Zinovyeva and Bagues 2015). More generally, social relations are embedded in networks which are found both to vary with gender and to enhance or restrict access to resources, information, and collaborations (Jadidi et al. 2018).

Differences in productivity are often explained by differences in career paths induced by motherhood (Pezzoni et al. 2016). Past research has found that female productivity suffers a negative shock during the first 3 years after the birth of a child (Mairesse et al. 2019). In South Africa fertility rates peak at age 25-29, which corresponds to the doctoral years (Lehohla 2015). Such a shock may be accommodated differently depending on whether the female student works with a male or with a female supervisor.

A further explanation for why the female-male couple has the lowest productivity relates to the two-world hypothesis. This hypothesis states that there exists a gender or racial specialization in specific (sub-)disciplines (Moore et al. 2018). Thus, cross-gender couples may recombine different (sub-)fields and knowledge. More generally, the management literature has found that diversity is associated with novelty and innovation because it is more likely to recombine distant knowledge and expertise (Chen et al. 2009, Fleming 2001, Rzhetsky et al. 2015, Shi et al. 2015, Uzzi et al. 2013). In science, novelty is often a risk, particularly for a younger scientist, and may have slower returns (Azoulay et al. 2011, Boudreau et al. 2016, Verhoeven et al. 2016, Wang et al. 2017). Taking risks early in the career may slow down productivity in the short run by affecting the “publishability” of the research. Pairs with different gender compositions may mitigate such risk differently.

All these mechanisms may individually and jointly explain our findings and contribute to an early career gender gap in science. Our work aims at quantifying and identifying the origins of some of the gender and/or race differences in the publication productivity of early career scientists.

In spite of its specificity and limits, our study suggests important policy recommendations for closing the gender productivity gap. It highlights the role of female advisors and of cross-race supervision and their potential role in mitigating initial productivity imbalances between male and female students. Additionally, our results suggest that if intervention is to be aimed at closing the gap, there are specific parts of the productivity distribution that should be targeted. The “strongest” and “weakest” students show no productivity gaps across gender, so interventions targeting the “productive but not super-productive” are likely to be more effective and efficient than targeting the entire population. Hopefully, addressing productivity gender differences at the Ph.D. level would increase the proportion of female professors in the population and create an inclusive and productive working environment for all. Such a goal, in turn, may create positive feedback, counteracting the aforementioned negative lock-in effect generated by early career productivity imbalances, which constrain the careers of African women in STEM.