1 Introduction: accountability, opportunistic behaviours and motivations to game the system

Contemporary education systems are increasingly adopting external evaluation policy instruments for accountability purposes (Verger et al., 2019). Globally, school governance reforms take different forms in their attempts to combine school autonomy, external forms of accountability and administrative control (Ingersoll & Collins, 2017; Verger et al., 2019). Performance-based accountability (PBA) has become one of the main accountability mechanisms. It aims to improve the quality of education by making school actors more responsible for student results in external standardised tests. Teachers and principals are expected to use the data derived from the test to reflexively identify areas for improvement and implement instructional changes to enhance learning and performance. In “higher stakes” systems, although some data use for improvement may be expected, the main accountability rationale is driven by the consequences associated with the test, often attached to a given scheme of incentives and/or sanctions (Maroy, 2015).

Beyond the policy design of accountability systems, and the theory of change expected from different models, accountability mechanisms may generate unexpected effects as they alter the perceptions, expectations, and behaviours of school actors (Maroy & Pons, 2019). Instruments have “a life of their own” and tend to gain autonomy from their initial designs, often evolving towards unexpected outcomes (Lascoumes & Le Galès, 2007; Le Galès, 2016). In the case of PBA, unexpected results have been reported in different dimensions of teaching and instruction, including a wide range of ‘opportunistic behaviours’, understood as responses derived from “perverse incentives” (Ryan, 2003) with negative consequences that push school actors to adopt instrumental actions to improve performance in standardised tests. Such opportunistic and instrumental behaviours are expected to be more prevalent in high-stakes accountability regimes, where “teachers are subjected to higher levels of external pressure to achieve better educational outcomes, especially because of the threat of sanctions these systems involve” (Verger & Parcerisa, 2017, p. 246). The best-known opportunistic behaviours are practices that prioritise student performance over learning, including not only the so-called teaching to the test and narrowing the curriculum (Au, 2007; Berliner, 2011; Ohemeng & McCall-Thomas, 2013) but also forms of direct or indirect cheating (Amrein-Beardsley et al., 2010; Hibel & Penn, 2020; Jacob & Levitt, 2003) (see Footnote 1).

There are many forms of teacher cheating: allowing copying, suggesting correct answers, making ‘adjustments’ while checking answer sheets, etc. Other strategies, not focused on the test itself, alter the composition of the students being tested: students may be strategically classified as having special needs (and so excluded from aggregated scores) or simply advised not to go to class on the day of the test. Cases of teacher cheating have been identified in many countries and published mostly in reports and newspaper articles (Amrein & Berliner, 2002; Nichols & Berliner, 2005), triggering a broad public debate on the reliability of high-stakes testing systems. For example, in 2013, a report sent by the U.S. Government Accountability Office to the Secretary of the U.S. Department of Education described incidents in Illinois, Maryland, Pennsylvania, Texas, Washington D.C. and California, where the results of 23 schools were invalidated for cheating by school administrators and teachers in 2012. In Chicago, it has been estimated that teacher cheating occurs every year in at least 4–5% of elementary school classrooms, and that its frequency responds to relatively minor changes in incentive schemes, which lead to significant distortions in conduct (Jacob & Levitt, 2003). In other countries, such as Hungary, levels of teacher cheating similar to those in American schools have been reported (Horn, 2012).

The rationale and motivations behind turning to these opportunistic behaviours depend on the goals of those who cheat, which are conditioned by the characteristics of the accountability and standardised assessment systems (Stecher, 2002). Opportunistic behaviours are expected to be more likely in high-stakes accountability systems, which offer extrinsic incentives encouraging teachers and schools to raise student scores (Jacob & Levitt, 2003). However, explicit consequences cannot be assumed to be the single or even the main factor explaining instrumental practices. Lowering the stakes does not necessarily prevent opportunistic behaviours, which have been reported even in the absence of explicit incentives and sanctions. Evidence of opportunistic behaviours in systems that do not attach relevant consequences to the test has been found, for instance, in Italy, where significant cheating on tests has been uncovered (Bertoni et al., 2013; Paccagnella & Sestito, 2014; Quintano et al., 2009), and in some German states, where practices of narrowing the curriculum and teaching to the test have been reported (Jäger et al., 2012). According to the sociology of numbers approach, the performative effect of testing and measurement is a sufficient condition for exerting external pressure on school actors, modulating their attitudes, understandings and behaviours (Gorur, 2015; Hardy, 2015). Teachers and principals may also feel pressure from the external test because of the reputational effect of PBA (Camphuijsen, 2021).

To date, few studies have examined the rationale and motivations to cheat in compulsory education (primary and lower secondary education), and even fewer have dealt specifically with the factors behind cheating led by teachers. Studies have mainly focused on the relationship between incentive systems (rewards and sanctions) and teacher behaviours. In high-stakes systems, teachers normally attribute misbehaviour to pressure to get better results, which comes from their social environment (parents and media) but particularly from the educational authorities and from an accountability system based on explicit threats of dismissal if they fail. However, what are the factors explaining cheating practices in contexts where test results are not published in school league tables, or where there are no formal and direct schemes of incentives and sanctions associated with the average results of teachers and schools?

This study contributes to answering this question by developing a comprehensive framework to interpret the phenomenon of teachers cheating in a low-stakes accountability system. Specifically, it focuses on identifying contextual factors and conditions behind teacher behaviours aimed at gaming the system in an accountability framework that, on the face of it, offers no motivation to do so.

2 Social capital as collective civic capital to understand cheating practices

Teacher cheating might be understood as an unexpected and undesired professional response. Over the last few decades, an emerging corpus of literature has focused on the social dimension of teaching — understood as a set of professional practices and routines shaped on shared norms, values and beliefs — and embedded in social contexts of professional and personal relations and interactions. Accordingly, a growing body of research has adopted the analytical perspective of “social capital” to understand teachers’ practices. However, this literature tends to use the term “social capital” ambiguously, since it is often conceived as an umbrella concept that includes both individual and collective conceptions, contributing to blurring the concept and making it more difficult to use it empirically (Coppe et al., 2022). To overcome such limitations, we explicitly approach social capital as a collectively shared civic capital, based on institutional trust and social reciprocity.

Moreover, this conceptualisation appears to be a well-suited analytical perspective for understanding the rationale for cheating in low-stakes standardised tests. This approach understands social capital as collective civic engagement based on reciprocity and trust (Gittell & Vidal, 1998; Putnam, 1995, 2000; Woolcock & Narayan, 2000). Citizens in a highly civic-minded community are highly engaged in civic life, are political equals and are more prone to act on the basis of solidarity, trust and tolerance, while giving a strong boost to the associations of public life (Putnam, 2000). Here, social capital is understood as a collective stock and refers to the moral obligations and norms, social values (such as trust) and networks (such as voluntary associations) that enable people to act collectively (Woolcock & Narayan, 2000) in favour of the collective benefit (Portes, 2000). This focus on a collective stock (social capital as a feature of communities) and on collective benefits distinguishes this conception from others, according to which ties are established to yield benefits to individuals (Coleman, 1988; Portes, 2000). According to Putnam, the collective accumulation of social capital drives political integration and economic development to higher levels (Putnam, 1995). Here, social capital is understood as a civic culture that can be collectively used and that has a strong cultural base (Trigilia, 2011).

This approach distinguishes two forms of social capital. The first is bonding social capital, which is established between members of a community with a homogeneous composition; the second is bridging social capital, which is established among different social groups or socially heterogeneous groups. In both forms of social capital, the networks and the associated norms of reciprocity are valuable in terms of trust, solidarity and mutual support. Both society-based and community-based networks may provide resources to support the most disadvantaged members or groups.

However, to be considered as a stock of capital, any form of capital should have a positive economic payoff, should be measurable and should have defined mechanisms through which social capital can be accumulated and depreciated (Solow, 1995). As it is potentially exclusive, since socially homogeneous groups have self-referenced interests which may diverge from those held by other groups, or from those of society as a whole, the bonding form of social capital may have negative consequences. In these cases, this form of social capital may be detrimental to the bridging form, which is concerned with solidarity, mutual respect and cooperation — values related to the welfare of the society. Indeed, as pointed out by Portes (1998), the strong ties, which bring benefits to members of a group, generally restrict access to outsiders. Therefore, the potential negative externalities of the bonding form of social capital cast doubt on this approach’s suitability.

To meet Solow’s criteria, Guiso et al. (2011) propose another definition of social capital as civic capital: “those persistent and shared beliefs and values that help a group overcome the free rider problem in the pursuit of socially valuable activities” (p. 419). The authors refer to a culture-based civic capital where values and beliefs are shared by a community and persist over time. This civic capital is related to all types of economic interactions and not restricted to political participation (Guiso et al., 2010). The authors point out that relevant direct measures of civic capital may identify values that induce people to be against actions that give private benefits at high social costs. Specifically, they refer to opinions about free riding and other behaviours which deviate from the public good (e.g. labour absenteeism, tax evasion or avoidance and littering).

3 Objectives and research questions

The main objective of this paper is to explore the factors that help explain why teacher cheating in standardised tests is observed in low-stakes accountability systems. Drawing on the above-mentioned approach to social capital, we aim to identify the factors associated with teacher cheating, which are distributed across three aggregated levels of analysis: classroom, school and province (Table 1). We divide our research questions and hypotheses into two sub-groups, according to the explanatory factors we aim to investigate.

Table 1. Teacher cheating-related factors and operationalisation of variables

First, we want to observe to what extent cheating behaviours are more likely in classrooms with higher stocks of bonding social capital (see Table 1, columns a and b). We also want to confirm whether teacher cheating, understood as a form of community-based support, is directed at helping more disadvantaged students (e.g. students with low socioeconomic status (SES) or grade-retained students). Strong ties established in socially homogeneous groups may benefit their members, in this case in the form of teacher help for students who have difficulties with the test, while being detrimental to society at large, since teacher cheating undermines the monitoring and accountability objectives of the testing system.

Second, we want to observe to what extent teachers in contexts with lower levels of civic-minded capital are more likely to cheat. We use indicators of collectively shared behaviours that deviate from the public good (of the whole society), that is, actions that provide private benefits at high social cost. With regard to school-level behaviours and practices, we analyse to what extent cheating behaviours are more likely in schools which undertake practices that do not match legal requirements or recommendations, such as social tracking of students and exclusion of students from tests (Table 1, column c). As regards context-level behaviours, we explore whether teachers in schools located in provinces with a lack of civic capital (higher rates of teacher absenteeism) are more prone to cheat in standardised tests. Finally, even though deterrents such as external controllers during test taking have proved effective in preventing cheating, we also analyse whether such deterrents remain effective in schools where non-civic-minded values are deeply rooted.

4 Data and method

4.1 The Invalsi student performance dataset

Italy is a good setting for exploring the factors associated with opportunistic behaviours in low-stakes systems, due to the richness of the data coming from the standardised testing system. In Italy, teacher cheating has already been associated with the geographical location of schools (Quintano et al., 2009), certain locally shared values (Paccagnella & Sestito, 2014), student peer effects (Lucifora & Tonello, 2012) or deterrent mechanisms (Bertoni et al., 2013), such as the external control during tests.

We used the dataset of standardised tests administered by the Italian National Institute for the Evaluation of the Education System (Istituto nazionale per la valutazione del sistema educativo di istruzione e di formazione, Invalsi), from the 2011/12 academic year. This study dealt with surveys of 5th, 6th and 10th grade student performance. In the Italian education system, these grades correspond, respectively, to the 5th year of primary education, the 1st year of lower secondary education and the 2nd year of upper secondary education. The national survey was obligatory for all schools in 2009 and 2010. For this research, in the 2011/12 wave, the estimation models covered approximately 1,426,000 students and 79,100 classrooms from the 5th, 6th and 10th grades (on average, 10, 11 and 15 years old), spread over five macro-areas, 20 regions and 103 provinces. The tests, which were not high-stakes, covered mathematics and Italian language and were administered by teachers following a protocol set by Invalsi. This protocol suggested that the tests be supervised by teachers not specialised in the subject being tested. External inspectors were sent to a sample of classrooms, in schools randomly selected across the regions, to supervise the administration of the tests, check the answer sheets and return the results to Invalsi.

4.2 Empirical analysis: determinants, incentives and deterrents of cheating

We estimated logistic regression models to explore to what extent the explanatory factors included in the equation make the likelihood of cheating increase or decrease. The baseline equation can be expressed as follows:

$$logit\left({cheat}_{csp}\right)=\alpha +{\gamma}_1{O}_{csp}+{\gamma}_2{H}_{csp}+{\gamma}_3{F}_{csp}+{\gamma}_4{E}_{sp}+{\gamma}_5{T}_{sp}+{\gamma}_6{A}_p+{\gamma}_7{D}_p+{\gamma}_8{C}_{csp}+{\gamma}_9{P}_p+{\gamma}_{10}\left({O}_{csp}\ast {T}_{sp}\right)$$
(1)

where \({cheat}_{csp}\) indicates whether classroom c in school s in province p is suspected of cheating. \({O}_{csp}\) covers two dummies that refer to the opportunity to cheat, that is, whether classroom c was monitored by an external controller and whether it was indirectly controlled. The rest of the variables included in the model, whose descriptive statistics are shown in Table 2, are described below.

Table 2. Descriptive statistics

A cheating indicator based on suspicious answer strings

The binary dependent variable \({cheat}_{csp}\) indicates whether a classroom is suspected of cheating in the mathematics and reading tests. The literature proposes and compares different methods to identify student cheating in multiple-choice tests (Angoff, 1974; Belleza & Belleza, 1989; Frary, 1993; Sotaridona & van der Linden, 2006; Wesolowsky, 2000). In contrast, there are very few methods to identify teacher cheating. The most relevant is that developed by Jacob and Levitt (2003), which we partially replicated. This method for detecting opportunistic behaviours adopted by teachers combines two indicators. The first is Unexpected Test Score Fluctuations, which refers to unexpected score gains that can be explained by cheating. A classroom is suspected of cheating if unexpectedly large gains are followed by lower than usual test scores for the same students the following year. The reasoning is that, when test scores are monitored over time, student gains due to talented teachers or rich educational programs are likely to be permanent, whereas gains due to cheating are not. This first indicator could not be calculated due to the lack of longitudinal data.

Instead, we focused on the second: a composite indicator of Suspicious Answer Strings (see Footnote 2). With this indicator, different ways for a teacher to cheat can be detected, not only the easiest but also more sophisticated actions. The Suspicious Answer Strings indicator is a combination of four measures: an unlikely block of identical answers given to consecutive questions in the classroom, the classroom average variance across all test items, the variance (as opposed to the mean) in the degree of correlation across questions within a classroom and the extent to which a student’s response pattern differs from other students with the same aggregate score that year. The overall measure of cheating is constructed, within a given subject and grade, by ranking classrooms on each of the four measures and taking the sum of squared ranks across the four measures (see the appendix for more information about the construction of the cheating indicator based on suspicious answer strings). In our empirical analysis, we employ the 90th percentile cut-off to pinpoint classrooms suspected of cheating, with the 95th percentile used in robustness checks (see Footnote 3).
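The construction described above (rank on each measure, sum the squared ranks, flag the top of the distribution) can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the four measures are assumed to be already computed and oriented so that larger values are more suspicious, ties are given average ranks, and the percentile uses a nearest-rank rule. All of these are assumptions made for illustration.

```python
def rank_ascending(values):
    """Return 1-based ranks (1 = smallest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value equals the current one (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def composite_score(measures):
    """Sum of squared ranks across measures, one score per classroom.

    `measures` is a list of lists (one list per measure, one value per
    classroom), assumed to be oriented so that larger = more suspicious.
    """
    n = len(measures[0])
    per_measure_ranks = [rank_ascending(m) for m in measures]
    return [sum(r[i] ** 2 for r in per_measure_ranks) for i in range(n)]


def flag_suspected(scores, percentile=90):
    """Flag classrooms whose score lies strictly above the percentile cut-off."""
    ordered = sorted(scores)
    # Nearest-rank percentile, computed with integer ceiling division.
    idx = max(0, -(-percentile * len(ordered) // 100) - 1)
    cutoff = ordered[idx]
    return [s > cutoff for s in scores]


# Toy data: 10 classrooms, the last one ranks highest on all four measures.
toy_measures = [list(range(1, 11)) for _ in range(4)]
toy_scores = composite_score(toy_measures)
toy_flags = flag_suspected(toy_scores, percentile=90)
```

Squaring the ranks gives extra weight to classrooms that rank very high on several measures at once, which is why a classroom that is extreme on all four stands out more than one that is moderately high on each. In practice the ranking would be done separately within each subject and grade, as the text states.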

Social capital-related factors as predictors of cheating

In our study, we explored factors associated with civic social capital. The measure \({H}_{csp}\) represents classroom social homogeneity, calculated based on the standard deviation of the socioeconomic status (SES) index within each classroom. The SES index was calculated and provided by the Italian National Institute for the Evaluation of the Education System, using data about students’ parents’ education, occupation and household possessions. The standard deviation as a measure of social homogeneity offers insights into the socioeconomic diversity of students in a classroom: a lower standard deviation would suggest more homogeneity (similar backgrounds), while a higher standard deviation would indicate more heterogeneity (diverse backgrounds).

\({F}_{csp}\) is a vector capturing factors indicative of the compensation function of teacher cheating, which includes the classroom proportion of students who were retained a grade and a dummy for whether a classroom has a low SES average. Classrooms with a low SES were identified by generating quartiles from the classroom-level SES average. Classrooms in the bottom quartile were assigned a value of 1, while the others were assigned a value of 0.
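As a rough illustration of how the classroom-level dispersion measure and the low-SES dummy could be derived from student-level SES values, consider the sketch below. The function name and the input layout are hypothetical, and two details are assumptions not stated in the text: the population standard deviation is used, and the quartile boundary follows the default method of Python's `statistics.quantiles`.

```python
from statistics import mean, pstdev, quantiles


def classroom_ses_features(ses_by_classroom):
    """Compute SES dispersion and a low-SES dummy for each classroom.

    `ses_by_classroom` maps a classroom id to the list of SES index values
    of its students (hypothetical layout, for illustration only).
    """
    # Within-classroom standard deviation: lower = more homogeneous.
    dispersion = {c: pstdev(v) for c, v in ses_by_classroom.items()}
    # Classroom-level SES average, used to build quartiles.
    averages = {c: mean(v) for c, v in ses_by_classroom.items()}
    first_quartile = quantiles(list(averages.values()), n=4)[0]
    # Dummy equal to 1 for classrooms in the bottom quartile, 0 otherwise.
    low_ses = {c: int(avg <= first_quartile) for c, avg in averages.items()}
    return dispersion, low_ses


toy_ses = {
    "a": [-1.5, -0.5],  # low average, some dispersion
    "b": [0.0, 0.0],    # perfectly homogeneous
    "c": [0.0, 2.0],    # heterogeneous
    "d": [2.0, 2.0],    # high average, homogeneous
}
toy_dispersion, toy_low_ses = classroom_ses_features(toy_ses)
```

In this toy example classroom "b" has zero dispersion (all students share the same SES value), and only classroom "a" falls in the bottom quartile of the classroom averages.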

We also included three factors associated with the social capital of teachers and other school agents: the school-level measure \({E}_{sp}\), which represents the fraction of students who missed the test (as a proxy of the percentage of students excluded from the test taking); the school-level measure \({T}_{sp}\), which denotes the extent of school practices that result in social tracking between classrooms; and the province-level measure \({A}_p\), which is the share of teacher absenteeism. Here, we were also interested in analysing the extent to which the external deterrents of cheating during the test taking are effective in schools with high levels of non-civic-minded culture or a lack of civic capital. For this we included the vector \({O}_{csp}\ast {T}_{sp}\), which represents the interaction between a continuous measure of a lack of civic capital at the school level (social tracking between classrooms within schools) and the binary predictors of both direct and indirect external control during the test.

4.3 Addressing omitted-variable bias

We deal with potential problems of omitted variables, since behaviours denoting law acceptance or conformity (e.g. tax compliance vs. tax evasion) may be driven by factors not necessarily related to civic capital, such as economic payoffs or legal enforcement. This is why outcome-based measures such as behaviours, used as indirect measures of civic capital, are difficult to interpret (Guiso et al., 2010). Although we assume that measures of deviant behaviours may only partially reflect the lack of civic-minded capital, we argue that, to the extent that the following two assumptions are accepted, they may still be important predictors. Firstly, when dealing with school- and province-level aggregated factors, such as social tracking of students or tax evasion, we assume that the legal framework and the measures of law enforcement are held constant. Since we exploit variability within a single country, we partially account for the driving force of legal deterrents: measures of legal enforcement, recommendations, deontological professional standards, etc. Secondly, we assume that much of the motivation and incentive to act opportunistically, associated with the social structure and territorial characteristics, can be captured through the socioeconomic composition controls, as well as by the territorial fixed effects (regions and macro-areas).

When talking about social dissimilarity between classrooms, we should consider that, in Italy, especially in urban centres, many schools occupy several buildings which are often physically separated, possibly by more than 1 km. This may lead to social homogeneity within, and social heterogeneity between, classrooms, not as a result of practices of student tracking, but as a reflection of the different social composition of the surroundings of the school buildings. Bearing in mind that no information is available on the urban or rural location of schools, we account for this phenomenon, albeit partially, by including the measure \({Z}_{sp}\) (school size), which reflects the number of classrooms in each school.

The variables used as control measures in the equations were at classroom, school and province level. At the classroom level, we included the vector \({C}_{csp}\), which covers aggregated student background data: the fraction of female students and the fraction of first-generation and second-generation immigrant students in the classroom. We also controlled for the classroom size \({S}_{csp}\), expressed as the number of students enrolled. At the province level, we accounted for a range of social, economic and geographic characteristics, which were included in the vector \({P}_p\) of province characteristics data: GDP per capita, unemployment rate, whether provinces contained a metropolitan area, population size and density of province and share of adult population participating in education. In this vector, we also included province-aggregated data related to the school system, such as the share of early school leavers, an index of teacher precariousness (teachers with a temporary contract) and the share of teacher turnover. Finally, in order to control for systemic and cultural differences at the territorial level, we separately introduced a set of macro-area and region fixed effects. We do not add an explicit error term as the logit transformation inherently accounts for an error structure. Standard errors were clustered, in separate specifications, at the province and the school levels.

4.4 Results reporting and goodness of fit

The results are interpreted in two different ways, accounting for the non-linear nature of the logit analysis: through odds ratios and through average marginal effects. The odds ratio indicates how the odds of a classroom being suspected of cheating (y = 1) rather than not being suspected (y = 0) change with a one-unit increase in an explanatory factor. Specifically, if p is the probability of y = 1, then the odds are given by the ratio p/(1 − p). The average marginal effects were calculated as the average of the classroom marginal effects (expressed as a percentage), indicating how an increase in x is associated with an increase or decrease in the probability of y being equal to 1 (classroom suspected of cheating). For dummy variables, the marginal effect is expressed in comparison to the base category; for continuous variables, it is expressed for a one-unit change in the explanatory factor. The marginal effects on the probability of being suspected of cheating, calculated as the average of the classroom marginal effects, are given by:

$$\frac{\partial p}{\partial {x}_j}=\frac{\sum {F}^{\prime}\left({x}^{\prime}\beta \right)}{n}{\beta}_j$$
(2)
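To make Eq. (2) concrete, the following minimal sketch computes the average marginal effect of a continuous regressor in a logit model, together with the corresponding odds ratio. It assumes that F is the logistic CDF, so that F′(x′β) = p(1 − p). The function names and toy values are hypothetical, and the sketch does not cover dummy variables, whose effect is expressed as a discrete change relative to the base category.

```python
import math


def logistic(z):
    """Logistic CDF F(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(rows, beta, j):
    """Average marginal effect of continuous regressor j, as in Eq. (2).

    `rows` holds one regressor vector per classroom (with a leading 1 for
    the intercept); `beta` holds the estimated logit coefficients.
    """
    total_density = 0.0
    for row in rows:
        p = logistic(sum(b * x for b, x in zip(beta, row)))
        total_density += p * (1.0 - p)  # F'(x'beta) for the logistic CDF
    return total_density / len(rows) * beta[j]


def odds_ratio(beta_j):
    """Multiplicative change in the odds p/(1 - p) for a one-unit increase."""
    return math.exp(beta_j)


# Toy example: two classrooms with x = 0, so p = 0.5 and F' = 0.25 for both.
toy_rows = [[1.0, 0.0], [1.0, 0.0]]
toy_beta = [0.0, 2.0]
toy_ame = average_marginal_effect(toy_rows, toy_beta, j=1)
```

Because the odds ratio for a one-unit increase is exp(β), the odds ratio for a one-unit decrease is exp(−β), the reciprocal. This is the sense in which a scale can be "inverted" when reading odds ratios.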

The goodness of fit was measured with the percentage of values correctly predicted. First of all, taking the estimated coefficient \(\hat{\beta}\), we calculate the predicted probability \(\hat{p}\) that y would be equal to 1 for each classroom (i.e. that cheating had occurred) in the dataset:

$$\hat{p}= pr\left[y=1|x\right]=F\left({x}^{\prime}\hat{\beta}\right)$$
(3)

The predicted probabilities \(\hat{p}\) in logit and probit models are bounded between 0 and 1 and indicate the likelihood that y = 1. Once we have \(\hat{p}\), we check the fit of the model by calculating the percentage of values correctly predicted, that is, the proportion of correct predictions (\(\hat{y}\) = y) among all predictions. We consider the model a good fit if at least 70% of its predictions are correct. The models derived from Eq. (3) correctly predict between 88% (90th percentile cut-off) and 94% of values (95th percentile cut-off), and therefore we can confirm that they are suitable to use for the empirical analysis.
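The percentage of values correctly predicted can be sketched as follows. The text does not state how predicted probabilities are converted into predicted outcomes, so the 0.5 classification threshold below is an assumption, and the function name is hypothetical.

```python
def percent_correctly_predicted(p_hat, y, threshold=0.5):
    """Share (in %) of observations whose predicted class equals the observed one.

    `p_hat` are predicted probabilities from Eq. (3); `y` are observed 0/1
    outcomes. The 0.5 threshold is an assumed convention, not taken from the text.
    """
    y_hat = [1 if p >= threshold else 0 for p in p_hat]
    hits = sum(1 for predicted, observed in zip(y_hat, y) if predicted == observed)
    return 100.0 * hits / len(y)


# Toy example: three of the four classrooms are classified correctly.
toy_share = percent_correctly_predicted([0.9, 0.2, 0.6, 0.1], [1, 0, 0, 0])
```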

5 Main results

In this section, we test our hypotheses and present the main results according to the social capital analytical approach to understand opportunistic behaviours.

5.1 Cheating and the bonding form of social capital

As mentioned earlier, if a context is socially more homogeneous, stronger ties between teachers and students are more plausible. The first indicator is a measure of social dispersion within classrooms (classroom social heterogeneity), which, inverted, becomes a proxy of strong ties established in a close and socially homogeneous context. Since we refer specifically to teacher cheating, we do not use classroom social homogeneity to identify interactions among students as a determinant of cheating (Lucifora & Tonello, 2012), but rather to identify contexts where ties of mutual support between teacher and students are more likely to be found. As can be seen in Table 3, the social dispersion indicator is a robust predictor of teacher cheating, net of classroom, school and territorial controls. Specifically, looking at the marginal effects, we observed that a one-unit increase in the standard deviation of the students’ socioeconomic index within classrooms decreases the probability of a teacher being suspected of cheating. For a one-unit increase in social heterogeneity, we expect the probability of cheating to be reduced by 1.6% in primary and 3% in lower and upper secondary. In terms of the odds ratio, if we invert the scale, a one-unit decrease in the measure of social dispersion multiplies the odds of being suspected of cheating by 1.2 in primary and by 1.4 in lower and upper secondary. Overall, teachers are more prone to support students, suggest answers or fill in the answer sheets when the classroom composition is socially more homogeneous, where bonding forms of social capital are more likely.

Table 3. Teacher cheating and bonding social capital-related factors (Italian language). Interactions with external control

Context of teacher cheating: a compensation function for disadvantaged students

From the bonding approach to social capital, cheating can be understood as a strategy aimed at supporting socially and academically disadvantaged students. If we look at the fraction of grade-retained students in classrooms, we observe that teacher cheating is significantly associated with these academically disadvantaged students in upper secondary for mathematics and Italian and in lower secondary only for mathematics. Indeed, a one-percentage-point increase in grade-retained students in upper secondary school increases the probability of a teacher cheating by 12% in Italian (Table 3) and 8% in mathematics (see Footnote 4). In the case of lower secondary teachers, the marginal effect increases to 8% for a one-percentage-point increase in retained students. This is, however, an outcome that may be influenced by the differing presence of retained students across grades. As we move up through the grades, the percentage of grade-retained students increases. It is very low in primary, representing only 3.4% of students. In lower secondary it is about 7%, while in upper secondary it rises to almost 20% (Table 2).

In addition, to test whether cheating is more likely in classrooms with socially disadvantaged students, we estimated the probability that being suspected of cheating was dependent on the socioeconomic average status of classrooms. Here, we compared classrooms with a high SES average with low- and mid-SES classrooms. The results are significant and robust in all specifications: teachers in classrooms with low social composition are more prone to be suspected of cheating. Specifically, in primary and upper secondary, having a low-SES composition doubles the probability of cheating in both Italian and mathematics (Table 3). In terms of average marginal effects, teachers in low-SES classrooms are 5.5 to 7% more likely to cheat than teachers in high-SES classrooms. These results are attenuated, though still statistically significant, in the 1st year of lower secondary school and between mid-SES and high-SES classrooms.

5.2 Cheating as lack of civic capital

In this section, we test the hypothesis that teachers in contexts with lower levels of civic capital are more prone to cheat. As mentioned earlier, we specifically explore to what extent cheating behaviours are associated with collectively shared misbehaviours that are detrimental to the public good. Besides deontological and ethical considerations, we consider them non-civic-minded behaviours since they do not comply with legal requirements or run against laws and regulations.

School-level practice: excluding students from the test-taking

In Italy, students can take an adapted test or be exempt from testing if they have a certified disability or a specific learning disability certification, such as dyslexia or dyscalculia. While this information is not available, we have a proxy, albeit an imperfect one, for students who have been opportunistically excluded from the test to avoid worse results or to inflate the average results. This proxy is the difference between the number of students taking the test and the number of students on the class record. While the number of students with certified disabilities may vary significantly from class to class, or from school to school, it should be independent of the teacher’s likelihood of cheating. Thus, any association observed between an increase in students absent from the test and the likelihood of a teacher being suspected of cheating would point to the simultaneous use of opportunistic strategies to modify the average test results.
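As an illustration of how such a proxy could be computed, the following sketch derives the share of enrolled students who did not sit the test in each class. The record layout, the class identifiers and the figures are hypothetical. Normalising by class size and clamping negative differences to zero are assumptions made here for readability, not steps described in the paper.

```python
from dataclasses import dataclass


@dataclass
class ClassRecord:
    class_id: str
    enrolled: int  # students on the class record
    tested: int    # students who actually sat the test


def absence_share(record: ClassRecord) -> float:
    """Share of enrolled students who did not take the test."""
    if record.enrolled <= 0:
        raise ValueError("enrolled must be positive")
    missing = record.enrolled - record.tested
    # Assumption: a negative difference is treated as a recording error.
    return max(missing, 0) / record.enrolled


# Hypothetical classes, for illustration only.
classes = [
    ClassRecord("A", enrolled=25, tested=25),
    ClassRecord("B", enrolled=24, tested=21),
    ClassRecord("C", enrolled=20, tested=18),
]
shares = {c.class_id: absence_share(c) for c in classes}
print(shares)
```

Because the measure cannot separate legitimate exemptions from opportunistic exclusions, it remains an imperfect proxy, as noted above.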

Table 3 shows that the number of students absent from the tests is associated with the probability of cheating by both lower and upper secondary school teachers. With primary education as the only exception, teachers who directly alter the test results to improve test scores are also prone to exclude a higher proportion of students from test taking as an instrumental strategy. This association shows how the different strategic behaviours teachers use to inflate scores can run in parallel, even in low-stakes systems. This result is in line with studies that have reported opportunistic behaviours related to the exclusion of low-SES and special-education-needs students in cases such as the Netherlands (Mons, 2009), Canada (Bélair, 2005, cited by Mons, 2009), the USA (Cullen & Reback, 2006; Figlio & Getzler, 2006; Haladyna et al., 1991; Jacob, 2005; Madaus et al., 1992) and Chile (Hofflinger & von Hippel, 2020).

School-level practice: within-school social segregation

The second factor associated with teacher cheating is the practice of non-random allocation of students across classrooms based on their social background. In short, schools that undertake tracking practices, which result in social polarisation of classrooms, are also more likely to show higher levels of teacher cheating. A one-unit increase in the standard deviation of a classroom’s socioeconomic index within schools increases the probability of the classroom being suspected of cheating by between 3 and 5% in Italian (Table 3), and between 2 and 5% in mathematics. In terms of odds ratios, a one-unit increase in the measure of social tracking multiplies the odds of being suspected of cheating by 1.6, on average across grades. It is worth noting that in the case of upper secondary education, no significant association was found in mathematics. This is probably because upper secondary schools are already highly tracked and therefore have a more homogeneous social composition. On the other hand, the most solid association between cheating and social tracking, as a proxy of non-civic-minded behaviour, is found in the first year of lower secondary school, when students arriving from primary school are allocated to classrooms by the school board.
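Since an odds ratio acts on odds rather than on probabilities, its implication for the probability of being suspected depends on the starting point. The sketch below converts the odds ratio of 1.6 cited above into probabilities for a few baseline values. The baselines and the function name are hypothetical, chosen only for illustration.

```python
def probability_after_odds_ratio(p0: float, odds_ratio: float) -> float:
    """Probability implied by multiplying the odds of a baseline probability p0."""
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    new_odds = p0 / (1.0 - p0) * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline probabilities of being suspected of cheating.
for p0 in (0.02, 0.05, 0.20):
    p1 = probability_after_odds_ratio(p0, 1.6)
    print(f"baseline {p0:.2f} -> {p1:.4f} (ratio of probabilities {p1 / p0:.2f})")
```

The ratio of probabilities stays below 1.6 and falls as the baseline rises, which is why odds ratios and marginal effects are reported separately.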

Civic capital and deterrents to cheat

At this point, we asked to what extent deterrents of cheating are effective when the level of non-civic-minded engagement of schools is taken into account. Firstly, it is worth noting that external control of the classroom during the test is the most important deterrent of cheating (Table 3). Moreover, we also confirm the spill-over effects of the presence of external controllers in other classrooms in the same school. We then introduce interaction terms to test how the effects of those measures of external control on teacher cheating change with the degree of civic capital in schools. Figures 1 and 2 show that the strong association between the lack of civic capital and teacher cheating is also observed in classrooms with direct and indirect external control. Although the predicted probability is always higher in unmonitored classrooms, even in classrooms with external control, the less civic capital in a school, the more cheating behaviours are found. If we compare the association trends between monitored and unmonitored classrooms, we observe that they tend to converge when schools show high levels of non-civic-minded behaviours. This means that, in such schools and especially in indirectly controlled classrooms, the predicted probability of cheating approaches that of unmonitored classrooms.

Fig. 1. Probability of teacher cheating: interaction between lack of civic capital in schools and external control in classrooms (direct control). Source: own elaboration, based on data from the Invalsi dataset 2011-12

Fig. 2. Probability of teacher cheating: interaction between lack of civic capital in schools and external control in schools (indirect control). Source: own elaboration, based on data from the Invalsi dataset 2011-12

In Fig. 1, the difference in probability between classrooms with and without direct external control is not statistically significant when the levels of civic capital are very low. This means that direct control in classrooms during test taking is an effective deterrent, except in those schools with very low levels of civic capital. Similarly, Fig. 2 indicates that indirect control is an effective deterrent when the school has medium or high levels of civic capital. Specifically, the difference in the levels of teacher cheating between unmonitored and indirectly monitored classrooms is no longer significant once the lack of civic capital reaches the 80th percentile (0.37).
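The convergence pattern described for Figs. 1 and 2 can be reproduced with a minimal sketch of a logit model that includes an interaction between external control and the lack of civic capital. All coefficients and names below are hypothetical: they are chosen only to show how a positive interaction term erodes the deterrent effect of monitoring, and are not estimates from the paper.

```python
import math

# Hypothetical logit coefficients, chosen only to illustrate the pattern.
COEFS = {
    "intercept": -3.0,
    "lack_civic": 2.0,   # more cheating where civic capital is lacking
    "monitored": -1.0,   # external control deters cheating
    "interaction": 1.5,  # the deterrent weakens as civic capital falls
}


def predicted_probability(lack_civic: float, monitored: bool, coefs=COEFS) -> float:
    """Predicted probability of suspected cheating under the assumed logit model."""
    m = 1.0 if monitored else 0.0
    index = (
        coefs["intercept"]
        + coefs["lack_civic"] * lack_civic
        + coefs["monitored"] * m
        + coefs["interaction"] * lack_civic * m
    )
    return 1.0 / (1.0 + math.exp(-index))


def monitoring_gap(lack_civic: float, coefs=COEFS) -> float:
    """Unmonitored minus monitored predicted probability at a given level."""
    return predicted_probability(lack_civic, False, coefs) - predicted_probability(
        lack_civic, True, coefs
    )


for level in (0.0, 0.2, 0.4, 0.6):
    print(f"lack of civic capital {level:.1f}: gap = {monitoring_gap(level):.4f}")
```

With these illustrative values the gap closes where the monitoring and interaction terms offset each other, at a lack of civic capital of 2/3. Judging whether a given gap is statistically significant, as in the figures, would additionally require standard errors, which this sketch does not compute.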

Context-related factor: teacher absenteeism in the province

The rate of teacher absenteeism may be approached not only as a school system-related factor but also as a context-related factor. In either case, it is a powerful predictor of a lack of civic capital, as the results indicate a solid association between teacher cheating and the extent to which teachers are absent from school, which is significant and robust across grades and in both tests. Regarding the magnitude of the association, the tables show that, when the rate of teacher absenteeism at the province level increases by one percentage point, the probability of a teacher being suspected of cheating increases by between 1.3 and 1.8%, depending on both the grade and the test (Table 4). Even when controlling for regional dummies (instead of macro-area dummies), the association is significant, with the sole exception of the mathematics test in upper secondary school.

Table 4. Teacher cheating and lack of civic capital: teacher absenteeism (Italian language)

Here, we confirm that the context in which schools are located is strongly correlated with cheating: teachers are more likely to cheat in contexts where more deviant behaviours — behaviours against laws and regulations — are collectively shared. This remains significant even after accounting for a wide range of province characteristics, including the socioeconomic composition, and for cultural and economic differences across macro-areas.

6 Discussion

This paper contributes to the educational policy debate by providing a comprehensive framework for school administrators and policymakers to understand and explore potential explanatory factors behind teacher cheating and other opportunistic behaviours. We have empirically studied the factors behind opportunistic behaviours in education systems with low-stakes accountability schemes. More specifically, we have explored the practices of teacher cheating through the analytical lens of civic social capital. Our results challenge the assumption that the formal incentives and sanctions of high-stakes accountability systems are the single driver of opportunistic behaviours. We show, contrary to traditional thinking within the field of educational accountability and assessment, how in low-stakes educational systems teacher cheating can be associated with factors related to social capital. Our results suggest that cheating and free-rider behaviours can be explained by both the stock and the lack of civic social capital.

First, we have explored whether cheating behaviours are more likely in contexts where forms of bonding social capital prevail. The results of our analysis confirm that cheating is more likely to occur in socially homogeneous environments: where strong ties are more likely, cheating levels appear to be higher and oriented to help the most socially and academically disadvantaged students (students with a low socioeconomic status and the grade-retained) in what could be understood as an altruistic attitude. Our findings are consistent with existing research in the Italian context. Studies have suggested that the higher levels of cheating in Southern Italy are due to both a lower endowment of bridging social capital and a higher degree of bonding social capital (Bertoni et al., 2013). Other studies have demonstrated that cheating is positively associated with measures of particularistic values (Paccagnella & Sestito, 2014), which can also be understood as civic values related to bonding social capital. For instance, Paccagnella and Sestito (2014) found strong associations between cheating and contexts where people use close local networks to a greater extent.

In parallel, we have explored how and to what extent teachers’ cheating behaviours are associated with both school-based and context-based behaviours. We have hypothesised that teachers in schools and communities with lower levels of civic capital are more prone to cheat. Here, teacher cheating is understood as a misbehaviour that would reflect collective non-civic-minded beliefs and values. This has been highlighted by Paccagnella and Sestito (2014), who showed how cheating is negatively associated with proxies of trust towards education authorities and positively associated with non-adherence to the rule of law. In our paper, the findings confirm that teachers in schools with lower levels of civic capital (school-level non-civic-minded practices) or within contexts that lack civic capital (territorially aggregated levels of teacher absenteeism) are significantly more likely to cheat.

Another relevant and original contribution of this paper refers to the significant and robust correlation between teacher cheating and school-based opportunistic practices. Specifically, our results suggest that teachers are more likely to cheat in schools that also undertake practices ignoring institutional recommendations from authorities or legal requirements, such as student tracking or the exclusion of students from tests. These results suggest that teachers are more prone to cheat in institutional contexts where other opportunistic behaviours disregarding formal regulations and prescriptions are also found.

The practice of excluding students from tests has already been identified internationally, especially in countries with high-stakes tests such as Chile, the USA and the Netherlands. In the Netherlands, for example, the inspectorate has reported how, in some cases, students who were more likely to be sent to less prestigious school tracks did not take part in the high-stakes tests (Mons, 2009). In the case of Ontario, Canada, teachers reported that some schools had consistently been reducing the number of students tested to bring up the average (Bélair, 2005, cited by Mons, 2009). In Chile, schools serving disadvantaged students inflated their scores by having low-performing students miss high-stakes tests (Hofflinger & von Hippel, 2020). In the USA, numerous cases have been identified in several states such as California, Florida, Illinois, Texas and Alabama, where low-ability students were excluded to raise average scores (Haladyna et al., 1991; Madaus et al., 1992). For instance, several authors found that teachers responded strategically to incentives by classifying certain students as having special education needs (Cullen & Reback, 2006; Figlio & Getzler, 2006; Jacob, 2005). Our results contribute to this literature by showing that the strategic behaviours teachers use to inflate scores can run in parallel, even in low-stakes systems like that of Italy. More research is needed to elucidate whether these practices are also correlated with other forms of opportunistic behaviours in different settings and how they are related to diverging stocks of civic social capital.

The results of our study are also relevant in terms of identifying potential deterrents of teacher cheating. Our results show that low levels of civic engagement and trust are associated with opportunistic behaviours and cheating practices. While we have found that direct control in classrooms is an effective deterrent of teacher cheating, this was not the case for those schools with very low aggregated levels of social capital. Similarly, schools with lower levels of civic social capital are the exception to the deterrent effect of indirect external control. In other words, mechanisms that are expected to be effective in limiting cheating are largely ineffective in these environments with low civic social capital. Therefore, efforts should focus on developing institutional trust and civic capital rather than on developing external control systems without sufficient support for teachers and schools.

7 Implications and conclusions

Altogether, the results presented suggest several implications for research, social theory and policy. First, regarding the implications for research, the evidence provided shows that the undesired effects of accountability systems are not only associated with the formal consequences attached to the test. In this regard, the existing literature tends to highlight the distinction between high- and low-stakes accountability systems. This dichotomous differentiation might be flawed for different reasons. The consequences defined in the test might be formally clear but appear more blurred in the actual realities of classrooms and schools. Accountability system designs typically consider formal and material consequences, although informal, symbolic and reputational impacts tend to emerge when the test is implemented in real school settings. For this reason, rather than taking a dichotomous approach, we should understand accountability stakes as a continuum, considering both the formally designed stakes and the unexpected impacts and consequences emerging during the process of implementation.

We have shown that opportunistic behaviours and teacher cheating are not exclusively an effect of high-stakes accountability regimes. These results reinforce the idea of the performative effect of testing, numbers and metrics as powerful tools to change and modulate the perceptions and behaviours of school actors, who either elude their control or adapt and respond to diverse accountability demands. Such considerations invite us to further investigate under what circumstances, contexts and policy designs different accountability instruments generate particular effects in classrooms and schools. Future research should address the relationships between policy design, context and effects, analysing how similar accountability designs generate variegated effects in different contexts, as well as the other way around: how different designs might generate similar results depending on the context where they are implemented. Comparative, qualitative and quasi-experimental designs may be promising research strategies to better understand the complex nature of such phenomena.

Our paper also suggests important contributions for social theory. The findings reinforce the interactionist approach according to which deviance may be understood as a socially constructed process linked to rules of behaviour created in particular social sub-groups, where illicit behaviours are internalised and justified to respond to exogenously determined situations. Civic capital is also linked to moral standards as defined by social cognitive theory. People avoid behaving in a detrimental way when they self-regulate and activate internal controls in accordance with such standards. Moral standards, which are socially modelled, are translated into actions and behaviours through which moral agency is exercised (Bandura et al., 1996). However, a self-regulatory system is not invariant, so conduct may differ significantly even if moral standards remain constant. The agreement between moral standards and actual behaviours may support the use of misbehaviours as proxies for non-civic-minded values with consequences for the public good.

A particular paradox between universalistic and particularistic understandings of social justice and civic behaviours is also identified in our research. Such tension is at the very core of the problem analysed and is relevant for both policy and theory. According to Bandura et al. (1996), detrimental conduct may be considered personally and socially acceptable depending on the extent to which it is portrayed as serving valued social or moral purposes. This is a key factor in the process of moral justification of such behaviour and could be reinforced by an adverse socioeconomic context, exogenously determined and perceived as unfair. In this sense, cheating may simultaneously generate not only particular benefits but also general damage: some opportunistic actions may be undertaken for the benefit of the most disadvantaged, generating a particular benefit for those in more vulnerable conditions while also harming the implementation of accountability policies, possibly even including their eventual equity purposes from a universal understanding of social justice.

To limit and balance such tension, accountability systems might need to be revisited towards more intelligent designs (Crooks, 2003; Ehren et al., 2020; Lingard, 2009; O’Neill, 2013). In this sense, more flexibility is needed to ensure fairer designs and protect the general purpose of the accountability system (i.e. to inform improvement and to enhance quality and equity in education systems) and, at the same time, to ensure a proper systematisation that protects its internal and external validity. To advance towards such a horizon, accountability schemes should be oriented towards processes of quality improvement rather than external tools of control, involving different actors, but especially ensuring the role of experts and professionals, to enhance a system based on relations of trust among different actors. Where mistrust and perceptions of unfairness are in place, policymakers should assume that opportunistic behaviours and cheating will prevail, limiting the capacity of accountability systems to work properly for quality improvement and the enhancement of educational justice.