Introduction

Typically, teachers have to make difficult and momentous decisions quickly, under time pressure, and in complex social situations. They have to implement these decisions directly and independently, acting routinely while at the same time adapting to changing situations (Berliner, 2001). The increased opening of schools toward inclusion is placing greater demands on the didactic and methodological knowledge and competences of teachers. Following UNESCO (2017) and Ainscow (2020), we see inclusion as a process of continuous school development that positively values the diversity of learners and seeks to identify and remove barriers to learning so that participation and learning success of all learners are effectively promoted and learners at risk of exclusion and underachievement can receive appropriate support. Inclusion is widely accepted as an idea or principle, but practical implementation is much more difficult (Kefallinou et al., 2020), and many teachers subjectively experience inclusion as a burden (Avramidis & Norwich, 2002). Teacher professional development is a key factor in the development of inclusive practices at the school and classroom level (Kefallinou et al., 2020).

Many general teachers as well as special education teachers report having been insufficiently prepared by pre-service or in-service training (Scherer et al., 2019a). With increasing heterogeneity in the classroom, teachers should consider the individual needs of the students, and ensure that students learn together and from each other in the classroom (Moser Opitz et al., 2018): Teachers need general pedagogical competence. But beyond that, central competencies must be trained domain specifically, e.g., the differential diagnosis of progress and problems in learning, the selection of individually suitable tasks, and the development of goal-oriented instructional support (Lindmeier, 2011). Within our project 'Designing Collaborative Learning Environments' (DCLE) we offer an in-service training that encourages reflection on inclusive mathematics teaching. We aim to help teachers to acquire and expand professional competencies in designing inclusive mathematics learning environments, to critically reflect on their own practice, and to develop it further through exploratory trials of these learning environments in everyday teaching. At the same time, we hope to shape the teachers’ attitudes toward inclusive teaching and their self-efficacy expectations. Two differently designed in-service training courses were offered: one course was realized in a blended learning format, while the other took place in an unsupported online setting.

This article will discuss how attitudes and self-efficacy expectations of general and of special education teachers developed through in-service training designed as a blended learning course. The effects will be compared to the effects of the unsupported online setting. The following section will explain the theoretical framework and our focus on attitudes toward inclusive teaching and self-efficacy expectations. Subsequently, we will describe our unsupported online and our blended learning in-service training before we present and discuss the design and the results of the empirical study.

Literature review and theoretical framework

Teachers' professional development and inclusive mathematics instruction

The development of teachers' professional competence is shaped by knowledge, experience of action and routines of action in practice as well as by professional ethos. Following Shulman’s (1987) model of professional teaching competencies we distinguish three cognitive dispositions as personal resources for inclusive mathematics teaching:

  • Mathematics content knowledge: an in-depth understanding of mathematics as a foundation of good teaching, especially of the mathematical concepts and procedures to be taught.

  • Mathematics pedagogical content knowledge: knowledge about the mathematical terms and procedures to be taught at the respective level of schooling and about their inner connections and meaningful arrangement, as well as knowledge about effective methods of classroom teaching and individual support, especially about the design of effective mathematical learning environments through substantial task formats and cooperative communicative activities of the learners. Following Wittmann (2001) learning environments should develop around substantial mathematical contents. “These allow, on the one hand, for the exploration of the epistemological structure in depth, and on the other hand, the reflection of didactical principles while testing substantial learning environments in practice, which adds to a deeper understanding of both the mathematics involved and students’ learning processes” (Nührenbörger et al., 2019, p. 63).

  • Pedagogical knowledge: interdisciplinary knowledge about students’ learning processes, about effective classroom management and lesson planning, about successful strategies of communication and interaction in classroom discourse, and about the design of effective learning environments.

This framework has been refined by further researchers, for example by Hill et al. (2008). They subdivide content knowledge (called subject matter knowledge) into three domains: common content knowledge, knowledge at the mathematical horizon, and specialized content knowledge. Accordingly, competent teachers understand the basic structures of their mathematical subject also in relation to the didactic components of teaching.

Going beyond these knowledge components for teachers, competencies can be conceptualized as latent cognitive and affective-motivational traits that are domain-specific, relate to real-world tasks, and underlie successful behavior in real-life situations (Blömeke et al., 2015; Jenßen et al., 2015). These relations are taken up in Blömeke and colleagues’ model of the transformation of competence into performance, in which dispositions are mediated by situation-specific skills of perception, interpretation, and decision-making in specific practical situations (Fig. 1). These skills can be learned; they become effective in a task-specific manner, and they acquire stability across individual situations with professional routine (Barzel & Selter, 2015).

Fig. 1 Relations between dispositions, situation-specific skills and teaching performance (Blömeke et al., 2015, p. 7)

The term inclusive education is often used but has different meanings. Ainscow and César (2006) describe five different approaches to inclusion, which span a field of tension between “inclusion as concerned with disability and ‘special educational needs’” (p. 233), “inclusion as promoting a school for all” (p. 234), and “inclusion as Education for All” (p. 235). Following the last two ways of thinking about inclusion, the main aim of inclusive mathematics education is to break down barriers and to enable all children to develop basic mathematical competences in interaction with each other (Scherer et al., 2016; Clapton, 2009). Instruction has to be adapted to the wide variety of learning potentials, which requires adaptive mathematical teaching skills. Therefore, we interpreted the model of Blömeke et al. (2015) in the DCLE project for the needs of inclusive mathematics education (Fig. 2).

Fig. 2 Relations between dispositions, situation-specific skills and teaching performance for inclusive mathematics instruction (dispositions to be evaluated in this study in italics)

The professional knowledge of teachers is a necessary, but not a sufficient condition for the preparation of, implementation of, and reflection on effective inclusive mathematics instruction. Knowledge is the basis for readiness to act when it meets supporting attitudes and convictions. The DCLE project focuses on two affective-motivational dispositions (in italics) that have been shown to be of particular importance (Scherer et al., 2016):

  • Attitudes concerning inclusive mathematics instruction, in particular the design and effectiveness of inclusive instruction, and attitudes toward the influence of learning difficulties.

  • Expectations of self-efficacy with regard to inclusive mathematics instruction in terms of differential mathematics instruction in general and instruction for children with special learning difficulties.

Attitudes constitute relatively enduring evaluations of objects, persons, groups, or issues, ranging from negative to positive. They are acquired through the individual biography and are subjectively considered to be correct. Furthermore, they influence the perception and experience of events, and they affect a person's willingness to act (Ajzen, 2001). Accordingly, positive attitudes toward inclusive education are relevant. Numerous empirical studies on the influence of attitudes have been summarized and analyzed internationally (De Boer et al., 2011; Scherer et al., 2016) and for German speaking countries (Garrote et al., 2020). The results can be summarized in three central findings:

  • There is a strong correlation between positive attitudes toward inclusion and the willingness to actively design inclusive learning environments. Those who consider inclusive teaching to be meaningful and evaluate it positively are more likely to be willing to engage in innovative teaching methods and to make an active effort to implement inclusive instruction (Hellmich et al., 2019; Wilson et al., 2019). In schools with positive teacher attitudes, inclusive teaching is more likely to be realized and implemented more sustainably (Tschannen-Moran & Woolfolk Hoy, 2001). Still, teachers' attitudes toward inclusion seem to play no significant role for changing the general social acceptance in inclusive classrooms (Garrote et al., 2020).

  • An examination of moderating variables shows that teachers generally report positive attitudes toward inclusive education, but are far more reluctant to assess their own competencies for inclusive instruction. Learning and language difficulties as well as physical and motor impairments are accepted by many teachers as being capable of inclusion, while sensory impairments in hearing and in vision are critically assessed with reference to special means of communication. Serious intellectual impairments or psychological and behavioral disorders are met with reservations (Avramidis et al., 2000; Hastings & Oakford, 2003).

  • Teachers who already are in professional practice are more confident than student teachers and teachers in early stages of professional development, especially if they have already had first-hand experiences with inclusion and if they have experienced inclusion successfully (De Boer et al., 2011).

However, overall agreement on inclusion depends on the type and extent of the learning impairments to be dealt with. Full inclusion is consequently rejected by some teachers. They do not see themselves in a position to teach children with severe intellectual impairments, with multiple disabilities, and especially children and adolescents with intensive and persistent behavioral disorders. Special education teachers often declare themselves responsible for these learners and they emphasize the need for intensive individual support in the case of disabilities. Prospective and experienced general and special education teachers consistently point out that inclusive instruction can only succeed if the burden on the whole learning group is not too high and if supportive materials are provided and supportive conditions are given at school (Scherer et al., 2016).

Recent research indicates that self-efficacy expectations may be decisive for the development of favorable attitudes toward inclusion and inclusive instruction (Savolainen et al., 2020; Schwab, 2019). According to Bandura (1997, p. 3), perceived self-efficacy refers to "beliefs in one's capabilities to organize and execute the courses of action required to produce given attainments". Applied to teaching, self-efficacy has been defined as a teacher's judgement "of his or her capabilities to bring about desired outcomes of student engagement and learning, even among students who may be difficult or unmotivated" (Tschannen-Moran & Woolfolk Hoy, 2001, p. 783). Although the direct associations between teachers' self-efficacy expectations and their students’ achievement have been found to be modest but significant (Klassen & Tze, 2014), teachers' self-efficacy beliefs have been shown to have great influence on what happens in the classroom. According to a meta-analysis conducted by Zee and Koomen (2016) on 40 years of research, teachers with a strong sense of efficacy implement more effective strategies of classroom management and more effective teaching strategies. These teachers include process-oriented instruction and differentiation. Also, they vary instructional strategies based on students' needs in inclusive education and show a comparatively high level of professional commitment. They are more interested in professional development and less susceptible to burnout.

Bandura's (1997) concept of teachers' self-efficacy is not a general but a specific construct related to domains of subject matter teaching, to instructional tasks and problems and to instructional contexts. A teacher may have different levels of self-efficacy beliefs for different domains and different contexts of instruction. In the context of inclusive teaching, self-efficacy is defined as a teacher's belief that he or she will be able to meet the requirements of particularly heterogeneous learning groups and that he or she will also be able to achieve good results with intellectually impaired, behaviorally demanding, or less motivated learners. The question arises as to how teachers can acquire positive self-efficacy expectations and whether and how they can be supported.

Self-efficacy expectations can be changed, but they cannot simply be passed on in a purely theoretical way through verbal mediation. They have to be acquired on the basis of mastery experiences (Bandura, 1997) involving the achievement of instructional goals through direct and personal action in the classroom. Mastery experience through successful teaching is the first and strongest source of positive and stable self-efficacy expectations (Morris et al., 2017). It is important that teachers perceive the difficulty of the instructional task as challenging and that they attribute success to their abilities, skills, and effort. Thus, mastery experiences do not affect self-efficacy beliefs directly. They are the results of reflective processes moderated by how the teachers interpret their experiences (Morris et al., 2017).

Teachers can be prepared for new or complex situations by vicarious experiences, a second source of self-efficacy expectations (Bandura, 1997). Vicarious experiences are derived from observing a model teacher or from observing oneself performing an instructional task. Model teachers may demonstrate efficient teaching strategies and their positive consequences, or they may demonstrate how to cope with very demanding situations. Videos of oneself teaching can be compared to the teaching behavior of model teachers, and these activities can be supported by evaluative feedback in the form of social persuasion, a third source of self-efficacy. Social feedback and encouragement are especially important in new, complex and demanding situations. Feedback can be very helpful and supportive if it is specific and sincere and if the person providing feedback is knowledgeable and empathetic (Hattie & Timperley, 2007). Social support by a competent teacher can also help to reduce stress levels and negative emotional arousal, Bandura's (1997) fourth and weakest source of information for self-efficacy.

The DCLE concept of in-service teacher training

Efficacy expectations are the results of reflective processes moderated by how teachers interpret experiences, observations and the feedback received (Morris et al., 2017). Empirical studies such as those by Morris and Usher (2011) or Tschannen-Moran and McMaster (2009) speak for the effectiveness of training measures that combine mastery experiences with verbal persuasion and feedback in real and simulated teaching situations. Various (meta-)studies found empirically supported criteria for designing effective programs of in-service professional development (Barzel & Selter, 2015; Darling-Hammond et al., 2017). In terms of content characteristics, it has been repeatedly demonstrated that in-service trainings that focus on subject-related learning and teaching are more effective in changing teachers’ classroom behavior than those that deal exclusively with general topics, e. g. pedagogical or psychological issues (Garet et al., 2001; Selter et al., 2015; Timperley et al., 2007). With regard to organizational characteristics, the long-term nature of continuing education is unanimously emphasized in many studies as the most important characteristic of successful continuing education (Borko, 2004; Timperley et al., 2007). The design of the DCLE concept of in-service teacher training was led by six characteristics that have been considered to be important and that have been shown to be empirically supported (see Barzel & Selter, 2015):

Competence orientation The orientation toward the content-related and methodological competencies to be acquired by the participants is a decisive prerequisite for their didactic and organizational lesson planning that meets the requirement of effectiveness (Garet et al., 2008; Neumann & Cunningham, 2009; Timperley et al., 2007).

Participant orientation Research has shown that in-service courses must address the participants’ individual needs, individual convictions and their heterogeneous individual prerequisites in a goal-oriented manner to develop them further according to demand with regard to their concrete teaching tasks (Clarke, 1994; Fishman et al., 2013).

Teaching–learning variety As Carpenter et al. (1989) or Lipowsky and Rzejak (2012) have shown, participants should be given sufficient time to acquire or deepen new competencies at different levels and in different settings.

Case-based learning In order for participants to be able to change their teaching routines and practices, they need suggestions and possibilities to incorporate the topics covered in the training into their concrete practice. Specific case vignettes and the orientation to the participants' practical experiences can be important points of reference for designing the in-service training (Borko, 2004; Kiemer et al., 2015).

Suggestion for cooperation The fifth characteristic of successful continuing education programs is the potential to encourage participants to cooperate (Boyle et al., 2005; Gräsel et al., 2007; Lund, 2019) because changing routines of action requires discursive discussion within a community (Cochran-Smith & Lytle, 1999).

Encouraging reflection Finally, research shows that successful continuing education programs consist of a mixture of phases that first encourage action in the classroom or in continuing education practice and then reflect on that action (Bräuning & Nührenbörger, 2010; De Coninck et al., 2019; Llineares & Krainer, 2006).

During the in-service training, the website 'Math inclusive with PIKAS' was used as an online platform for self-study (pikas-mi.dzlm.de). The website presents basic principles of mathematics teaching and special education teaching such as 'active-discovery learning', 'natural differentiation' and the 'spiral principle' (Krauthausen & Scherer, 2013; Wittmann, 2001) or 'collaborative learning' (Scherer et al., 2016) with regard to designing inclusive mathematics lessons. At the same time, important guidelines for inclusive mathematics instruction (especially 'adapting tasks', 'encouraging a change in presentation', 'diagnosis-guided support') are highlighted and substantiated with mathematical content that is important for students’ learning development in primary and early secondary school (e.g., building up number concepts, building up operational concepts, developing an understanding of place values) (Korten et al., 2019).

E-learning environments are convenient and increase access to instructional materials, but they limit the interactions with fellow learners and instructors that face-to-face learning provides. Moreover, they require self-regulated and highly independent self-study (Graham, 2019). In meta-analyses, e-learning approaches were particularly effective when they provided teaching–learning arrangements that included well-designed online elements as well as face-to-face learning (Bernard et al., 2014; Means et al., 2013).

Within the DCLE-project, we developed an in-service teacher training consisting of a series of six modules in a blended learning setting. The six face-to-face workshops included two full-day events at the beginning and end of the in-service training series as well as four half-day sessions. The training concept aims at the planning of collaborative learning environments ("CLEs") which should enable all heterogeneous pupils in a class to participate in a potential-optimizing and difference-sensitive way. For this purpose, the following six basic didactic topics are discussed during the in-service training series: (1) good tasks and differentiation concepts for learners with special needs in learning mathematics, (2) difference-sensitive lesson planning and adaptation of tasks, (3) diagnosis-guided support, (4) dealing with visual aids and change of presentation, (5) communication and cooperation of learners in class and (6) cooperative teaching.

These didactic topics are combined with five selected curricular contents, the mastery of which is central to building a basic mathematical understanding (Moser Opitz et al., 2017): (1) development of viable number concepts, (2) development of viable concepts about mathematical operations, (3) understanding of place values, (4) competence of flexible calculation and (5) development of ideas about measures and dealing with measures.

Participants had the opportunity to set their own content priorities using the website and to choose from a variety of multimedia approaches to the topics covered in the training. In this way, the blended learning format combined the specific advantages of online materials for self-study (low-threshold, free and permanent availability, nonlinear processing of linked resources at individual pace) with the benefits of face-to-face workshops (personal contact, interaction and exchange with colleagues, discussions, feedback in the group, individual and cooperative processing of tasks) (Bernard et al., 2014; Graham & Allen, 2009; Ma & Lee, 2021).

At the same time, this design was suitable to alleviate two central difficulties of unsupported individual study: the lack of motivation as well as the inadequate learning and working methods of some participants. Lack of motivation, which often arises during long-term individual study, could be compensated through direct interpersonal contact with other participants. Inadequate learning and working methods could be compensated through modeling effects in cooperative group work (Borba et al., 2016). In addition, we aimed to implement the characteristics of effective teacher training by making use of the diversity of teachers in a participant-oriented way, thinking through and reflecting on authentic case studies, encouraging and supporting cooperative work, and actively developing competencies (Table 1).

Table 1 Quality dimensions of teacher training and selected realization possibilities in individual online self-study and in the face-to-face study (see Barzel & Selter, 2015)

In addition to the core element of blended learning, the content and practical examples of the online material were discussed in face-to-face workshop sessions and linked to the experiences of the participants' everyday teaching. In addition, the participants got to know exemplary learning environments, which they adapted for their own teaching according to context and individuality. Tasks that encouraged observation and experimentation in the classroom promoted the transfer of the learned in-service training content into everyday teaching practice. The described targeted tasks and the patterns of action experienced in the participants’ own teaching were collectively reflected on by the participants in the following workshop session. This reflection on the practice experienced by themselves and their colleagues seems to be of central importance for teachers’ competence development (Herzog, 1995; Jaworski & Huang, 2014; Wyss, 2008).

The in-service training concept also focuses on the participants’ cooperation in planning, implementation and reflection of the practical tasks by forming multi-professional teams. General education teachers have a stronger subject and subject didactic background to teach mathematics in class, while special needs teachers have specific expertise in the area of learning impairments, diagnosis and support (Scherer et al., 2019a). The multi-professional teams of general and special education teachers had regular opportunities to exchange ideas across professional boundaries during the in-service training. In addition, they worked together during the practical phases in order to reflect not only on their personal planning and teaching in the inclusive classroom, but also on their professional collaboration with their colleagues.

Methodology

Design of the evaluation study

The effects of the unsupported online and blended learning conditions were compared in a quasi-experimental pre-post-test design with follow-up measurements and two intervention groups (A and B). Intervention group A started with the blended learning program after the pre-test. Intervention group B initially had access to the unaccompanied online offer without further support. Only after training in group A was completed was blended learning conducted with group B. This allowed us to evaluate whether any effects could be observed in the unsupported online phase in group B and whether any effects observed in group A could be replicated after blended learning in group B (Fig. 3).

Fig. 3 Design of the study over time

Intervention group A completed the pre-tests in late summer 2018 (t1) and then began the blended learning program, which ended with the post-tests in February 2019 (t2). The follow-up tests were conducted in November 2019 (t3). In the first project phase, intervention group B, which served as the control group, was thus initially able to use the online offer. In September 2018 (t0), this group completed the first pre-tests, which were repeated in February 2019 to evaluate the unaccompanied online offer (t1). The results also served as a pre-test for the blended learning offer, which started immediately afterward and was completed with the post-test in July 2019 (t2) and the follow-up test in February 2020 (t3).

Participants and procedures

For both intervention groups, regional samples were taken on a voluntary basis. The announcement of in-service training opportunities in the blended learning program was disseminated in two regions around Dortmund by local education agencies. Systematic sampling was not possible due to professional agreements. All interested teachers were invited to an information event in May (Group A) and September 2018 (Group B). First, they were given a detailed introduction to the online offer, which had been freely available to all teachers since the beginning of 2018. Afterward, the scope, contents, tasks and methods of the blended learning in-service training were presented. A total of 197 teachers attended the information events, and 101 chose to participate in the program. The non-participating teachers voluntarily completed a short scale on inclusive attitudes and self-efficacy of teachers (Bosse & Spörer, 2014), but no additional data could be collected. In informal interviews, many teachers cited their workload as a reason for not participating.

The participating teachers met in two regional intervention groups, A (n = 39) and B (n = 62). The majority of teachers in both groups were female (A: 87%, B: 81%). In both groups, there were more elementary teachers (A: 59%, B: 40%) than secondary teachers (A: 3%, B: 16%), and about 40% were special education teachers in each group. In group A, the participants were older and had more work experience. In mathematics teaching, fewer teachers in group A reported being in the first three years of their professional career (A: 5%, B: 23%), most teachers in both groups had 4 to 15 years of professional practice (A: 49%, B: 47%), and more teachers in group A had more than 15 years of work experience (A: 44%, B: 30%). In inclusive instruction, fewer teachers in group A reported being in the first three years of their professional career (A: 40%, B: 55%), many teachers in both groups had 4 to 9 years of professional practice (A: 44%, B: 39%), and more teachers in group A had 10 or more years of work experience (A: 11%, B: 5%).

Our project partners did not influence sampling procedures, but due to practical constraints we had to accept a voluntary sample; deliberate or random sampling could not be implemented. Our samples roughly reflected the age structure and gender distribution of the teaching staff in the region. The relatively short professional experience in inclusive teaching was due to the fact that inclusive teaching has only been actively introduced in Germany since 2009. However, a voluntary sample may be somewhat biased, because volunteers may differ in certain aspects from people not volunteering, e.g., they may be more motivated to take on additional work or they may be more interested in their professional development. Accepting a sample on a voluntary basis was the only way to gain access to members of the target population.

Measures

The KIESEL scales (Kurzskalen zur inklusiven Einstellung und Selbstwirksamkeit von Lehrpersonen [short scales for the measurement of inclusive attitudes and self-efficacy of teachers]; Bosse & Spörer, 2014) were used to operationalize the variables attitudes toward inclusive mathematics instruction, attitudes toward effects of inclusive mathematics instruction, and attitudes toward the influence of student behavior on inclusive mathematics instruction. These scales capture different facets of attitudes and self-efficacy expectations reliably and accurately in written self-assessments, which are collected in a bipolar four-level Likert response format. Items and answer formats were retained as far as possible, but the items asked about inclusive mathematics instruction rather than inclusive instruction in general. Instead of a four-level scale, a six-level response format was used in order to adapt the items to the second scale used. Because we did not add a middle response option and the verbal labeling of the rating scale points kept exactly the KIESEL format (example item: Children with special needs have higher learning gains if they are taught in inclusive math classes; from 0 = do not agree at all up to 5 = agree completely), we did not expect validity issues, particularly since rating scales with 6 to 7 points are popular with respondents and are expected to lead to gains in reliability and validity (Preston & Colman, 2000; Krosnick & Presser, 2010). From the pre-test data collected during the information events, 196 complete data sets could be used. Reliability coefficients (Cronbach’s α) did not reach the frequently recommended benchmark of 0.70 (Cortina, 1993), which had been achieved in a large sample of university students (Bosse & Spörer, 2014); however, taking into account the low number of items per scale (Cortina, 1993), they were still considered acceptable (0.69, 0.64 and 0.67).

In order to operationalize the variables self-efficacy with regard to the design of internally differentiated mathematics instruction in general and self-efficacy with regard to the design of adaptive mathematics instruction for specific learning conditions, we used appropriate items from a questionnaire on 'self-efficacy of student teachers with regard to inclusive instruction'. This scale will be published soon and has been used successfully in a validation study on attitude scales (Schulze et al., 2019). These items are intended to measure the expectations of self-efficacy in a differentiated, reliable and time-economical way. In the Likert-format described above, the items look at methods of adapting teaching to generally promote participation and learning opportunities in heterogeneous school classes (example item: I can develop ideas to adapt materials and methods in mathematics instruction to the individual learning needs of children) as well as in relation to specific special educational needs (example item: I can organize mathematics instruction so that children with learning difficulties can learn at their own pace).

Seven and five items, respectively, were selected for the two scales. Items relating to special needs in hearing, vision, and physical/motor development were omitted. The item format was retained and the questions were adapted for inclusive mathematics instruction. The coefficients of internal consistency in our sample of 197 teachers attending the first information meeting were 0.86 and 0.67.

Research questions and hypotheses

The described focus of the study on the development of attitudes and self-efficacy expectations toward inclusive mathematics instruction through online vs. blended learning extends previous studies by evaluating the significance of the subject matter of teaching and the design of the in-service training. The following research questions are addressed:

  1. How does the blended learning program affect teachers' attitudes toward inclusive mathematics instruction and self-efficacy expectations?

  2. How do online resources without guidance and support affect teachers' attitudes toward inclusive mathematics instruction and the associated self-efficacy expectations?

  3. Can moderator effects be observed that can be traced back to subject variables (intervention group, gender, type and level of teaching certificate, certificate in mathematics instruction, professional experience in general and in inclusive instruction)?

  4. Will the effects of the blended learning program be maintained after the intervention?

The research questions 1, 2 and 4 were differentiated into five specific questions that relate to the dependent variables assessed, as will be shown in the next section. We expected positive effects for questions 1 and 2, moderator effects for professional experience and experience in mathematics instruction, and maintenance of effects for question 4.

Data analysis

Strategically, we had to decide whether the t test for dependent samples or the nonparametric Wilcoxon signed-rank test should be used for the analysis of data collected at different points in time. The t test is considered a robust procedure (Rasch & Guiard, 2004), but if extreme values distort the symmetry and the deviations from the normal distribution are significant, false positives are likely to occur (Bühner & Ziegler, 2017). In the present case, the empirical distributions were inspected visually and tested for normality with the Kolmogorov–Smirnov test. With few exceptions, distributions were left-skewed and showed extreme values. Deviations from a normal distribution were statistically very significant. Therefore, the dependent-samples Wilcoxon signed-rank test was used. In those few cases where the difference values were approximately normally distributed, the t test for dependent samples was additionally calculated to validate the nonparametric results. Power was calculated post hoc with G*Power (version 3.1.9.4; Faul et al., 2007), and effect sizes were calculated by dividing the z-score of the test statistic by the square root of the number of observations. Moderator effects were tested by two-factorial analysis of variance with repeated measures (mixed design), a parametric but relatively robust method (Bühner & Ziegler, 2017). In order to test conservatively, only those variables for which significant main effects had been found were analyzed.
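The procedure described above can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: it assumes the large-sample normal approximation with tie correction and no continuity correction, drops zero differences, and leaves the choice of the "number of observations" (pairs or single measurements) to the caller. The function names and the example scores are hypothetical.

```python
import math


def wilcoxon_signed_rank(pre, post):
    """Return (W+, z, n) for paired scores using the normal approximation."""
    # Differences post - pre; pairs without change are dropped
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at position i
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    # Sum of ranks belonging to positive differences
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean_w) / math.sqrt(var_w)
    return w_plus, z, n


def effect_size_r(z, n_observations):
    """Effect size r: z-score divided by the square root of the number of observations."""
    return abs(z) / math.sqrt(n_observations)


# Hypothetical pre- and post-test scale scores of six teachers
pre_scores = [10, 12, 9, 14, 11, 13]
post_scores = [12, 15, 10, 18, 16, 12]
w_plus, z, n_pairs = wilcoxon_signed_rank(pre_scores, post_scores)
print(w_plus, round(z, 2), round(effect_size_r(z, n_pairs), 2))
```

For samples as small as this toy example, exact p-values would be preferable to the normal approximation; the sketch is only meant to make the computation of z and r transparent.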

Results

Effects of blended learning

Research question 1 was divided into five sub-questions: How does the blended learning program affect attitudes (1a) toward inclusive mathematics instruction, (1b) toward the effects of inclusive mathematics instruction, and (1c) toward the influence of student behavior on inclusive mathematics instruction? How does it affect self-efficacy expectations with regard to the design of (1d) generally internally differentiated mathematics instruction and (1e) adaptive mathematics instruction for specific learning conditions? To answer these questions, the difference values between pre- and post-test scores (t1, t2) were calculated and evaluated for the entire sample. With one exception (see below), the data were distributed left-skewed, showed extreme values and deviated significantly from a normal distribution (Kolmogorov–Smirnov test with Lilliefors correction, df = 67, p < 0.05).
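The normality check mentioned above rests on the Kolmogorov–Smirnov distance between the empirical distribution and a normal distribution whose mean and standard deviation are estimated from the sample. The sketch below computes only this distance D; the function name `lilliefors_d` and the example data are hypothetical. Obtaining a p-value under the Lilliefors correction requires dedicated tables or software (for example, the `lilliefors` function in `statsmodels.stats.diagnostic`), which is not reproduced here.

```python
import math
from statistics import mean, stdev


def lilliefors_d(sample):
    """KS distance between the sample and a normal with estimated mean and SD."""
    xs = sorted(sample)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Normal CDF evaluated at x with the estimated parameters
        cdf = 0.5 * (1 + math.erf((x - m) / (s * math.sqrt(2))))
        # Largest gap just after and just before the step of the empirical CDF
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical scores: evenly spread versus left-skewed with one extreme value
even_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
skewed_scores = [1, 8, 9, 9, 10, 10, 10, 10, 10, 10]
print(round(lilliefors_d(even_scores), 3), round(lilliefors_d(skewed_scores), 3))
```

A left-skewed sample with an extreme low value, as described for most variables in this study, yields a clearly larger distance than a symmetric sample.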

In Table 2, the median and range values for the pre- and post-tests before and after blended learning (t1, t2) and the results of the Related-samples Wilcoxon signed-rank test are summarized. For all five variables, changes occurred as expected, because the median values increased after blended learning and ranges were reduced. These changes are particularly evident in the attitudes toward the design of inclusive mathematics instruction and in self-efficacy expectations with respect to the design of internally differentiated mathematics instruction in general as well as for specific learning conditions. The changes are less pronounced in attitudes toward the effects of inclusive math instruction and in attitudes toward the influence of student behavior on inclusive math instruction (cf. Table 2).

Table 2 Results of the Related-samples Wilcoxon signed-rank test for attitudes and self-efficacy expectations before and after blended learning

Attitudes toward the design of inclusive math instruction show a quite pronounced and statistically highly significant effect with medium effect size, while the less pronounced increases in attitudes toward the effects of inclusive math instruction and toward the influence of student behavior on inclusive math instruction are not significant. For self-efficacy in relation to the organization of generally internally differentiated mathematics instruction and self-efficacy in relation to the organization of adaptive mathematics instruction with specific learning conditions, positive effects are statistically highly significant with medium to high effect sizes. The three significant effects were confirmed with high statistical power (1 − β = 0.99). Since the difference values of the latter variable were approximately normally distributed, the t test for dependent samples was additionally calculated to confirm the nonparametrically obtained result. This test yielded a highly significant result with high effect size for one-sided testing, confirmed with high statistical power (t = −7.36; df = 89; p < 0.001; d = 0.78; 1 − β = 0.99).
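The effect size reported for the dependent-samples t test is consistent with the common conversion d = |t| / √n for paired data, where n = df + 1 is the number of pairs. The following sketch assumes that this conversion was used, which the text does not state explicitly; the function name `cohens_d_paired` is hypothetical.

```python
import math


def cohens_d_paired(t_value, df):
    """Cohen's d for a dependent-samples t test: |t| divided by sqrt(number of pairs)."""
    n_pairs = df + 1
    return abs(t_value) / math.sqrt(n_pairs)


# Values reported above: t = -7.36 with df = 89
print(round(cohens_d_paired(-7.36, 89), 2))
```

Under this assumption, the reported t and df reproduce the reported d of 0.78.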

Looking back at the results, the hypothesis was confirmed: The blended learning program developed in the project seems to have had a positive influence on teachers' attitudes and self-efficacy expectations with regard to inclusive mathematics instruction.

Effects of online resources

To answer the second research question, five dependent variables were collected in analogy to the first research question, so that five specific questions could be pursued: How do online resources without guidance and support affect attitudes (2a) toward the design of inclusive mathematics instruction, (2b) toward the effects of inclusive mathematics instruction, and (2c) toward the influence of student behavior on inclusive mathematics instruction? How do they affect self-efficacy expectations concerning the design of (2d) generally internally differentiated mathematics instruction and (2e) adaptive mathematics instruction with specific learning conditions? To answer these questions, differences between the first pre-test in the introductory session (t0) and the second pre-test after the waiting period of about 20 weeks immediately before the start of the blended learning program (t1) were calculated for the waiting control group (B). Apart from one exception (see below), the distributions were skewed, showed extreme values, and deviated significantly from a normal distribution (Kolmogorov–Smirnov test with Lilliefors correction, df = 67, p < 0.05).

In Table 3, the median and range values for the first and second pre-test before and after access to online resources without guidance and support (t0, t1) and the results of the Related-samples Wilcoxon signed-rank test are summarized. For attitudes toward inclusive mathematics instruction, median values were stable and the range decreased, whereas the mean values for attitudes toward the effects of inclusive mathematics instruction and attitudes toward the influence of student behavior on inclusive math instruction increased slightly, with the range unchanged and reduced, respectively. For self-efficacy with regard to the organization of internally differentiated mathematics instruction in general and self-efficacy with regard to the design of adaptive mathematics instruction for specific learning conditions, slightly increased median values were combined with reduced and increased ranges, respectively.

Table 3 Results of the Related-samples Wilcoxon signed-rank test for attitudes and self-efficacy expectations before and after access to online resources

The null hypothesis could not be refuted. The Wilcoxon test showed very low effect sizes (r) close to zero for all five variables, which could not be statistically verified. Due to the sample size (n = 65), the statistical power for detecting small differences was very low (1 − β from 0.10 to 0.34). Since the difference values of the variable self-efficacy with respect to the design of internally differentiated mathematics instruction were approximately normally distributed, the t test for dependent samples was additionally calculated to verify the nonparametrically achieved result. This test provided a non-significant effect close to zero (t = − 0.44; df = 64; n.s.; d = 0.05; 1 − β = 0.05, one-sided test).

All in all, the hypothesis could not be confirmed. Having access to a website on inclusive mathematics instruction without guidance and support does not seem to have notable effects on the development of teachers' attitudes and self-efficacy expectations toward inclusive mathematics instruction.

Moderator effects

As the participants in the study could only be recruited on a voluntary basis, it was not possible to balance the intervention groups by random or deliberate sampling. However, variables had been identified that might moderate the observed effects: intervention group, gender of participants, type and level of teaching certificate, certificate in mathematics instruction, and professional experience in general and with inclusive teaching. Moderating effects were tested by means of two-factor analysis of variance with repeated measures (mixed design), a relatively robust procedure (Bühner & Ziegler, 2017). The Levene test was calculated, and in only two cases was the assumption of homogeneity of variances not met. In order to test conservatively, only those three variables that showed statistically significant main effects were analyzed: attitudes toward the design of inclusive mathematics instruction, self-efficacy expectations with regard to general differentiation, and self-efficacy expectations with regard to specific differentiations in learning difficulties.

The results of the Wilcoxon tests were corroborated. The main effects for the repeated measures factor (pre-post-test, t1/t2) were highest and showed highly significant effects (p < 0.001), verified with sufficient statistical power (1 − β from 0.68 to 0.99). This applied to all three dependent variables and to all moderator variables with one exception: The repeated measures factor was found not to be significant when the variable attitudes toward designing inclusive mathematics instruction was moderated by the subject factor professional experience with inclusive instruction.

The interaction effects between the repeated measures factor and the subject variables were of low effect size and statistically significant in only three cases. The factor intervention group moderated the self-efficacy expectations with regard to general differentiation (F(1,93) = 6.6, p < 0.05, partial η² = 0.07, 1 − β = 0.72). The factor gender moderated self-efficacy expectations with regard to differential mathematics instruction (F(1,93) = 7.3, p < 0.01, partial η² = 0.07, 1 − β = 0.76) and adaptive mathematics instruction for specific learning conditions (F(1,88) = 5.93, p < 0.05, partial η² = 0.06, 1 − β = 0.67). In all the other analyses, there were no statistically significant differences to be found between general or special needs teachers, primary or secondary school teachers, teachers with little or no experience in inclusive instruction or in mathematics instruction. Mathematics teachers tended to show higher effects, but due to sample size the differences did not reach statistical significance.
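The partial eta squared values for the interaction effects can be recovered from the reported F statistics and degrees of freedom with the standard identity F · df1 / (F · df1 + df2). The sketch below applies this identity to the three significant interactions; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported above for the three significant interaction effects
reported = [(6.6, 1, 93), (7.3, 1, 93), (5.93, 1, 88)]
for f_value, df_effect, df_error in reported:
    print(round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

The recovered values agree with the effect sizes given in the text and confirm that each interaction explains only a small share of variance.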

Long term effects

Will the effects of the blended learning program be maintained after the intervention? In order to answer this question, the difference values between post-test (t2) and follow-up-test scores (t3) were calculated and evaluated for the entire sample. The values of the first two attitudinal variables were approximately normally distributed but slightly left-skewed; the values of the other three variables deviated significantly from a normal distribution (Kolmogorov–Smirnov test with Lilliefors correction, p < 0.05).

In Table 4, the median and range values for the post-test after the blended learning program (t2) and the follow-up test six months later (t3) as well as the results of the Related-samples Wilcoxon signed-rank test are summarized. For all five variables, median values stay stable or decrease slightly; whereas, ranges decrease for attitudes toward inclusive mathematics instruction and increase for the remaining four variables. Overall, it was found that the effects of the blended learning program remained stable or decreased only very little in the following months.

Table 4 Results of the Related-samples Wilcoxon signed-rank test for attitudes and self-efficacy expectations after blended learning (post-test and follow-up)

The Wilcoxon test showed extremely low effect sizes near zero for all five variables, which could not be statistically confirmed, in part attributable to small sample size (n = 39) and very low statistical power (1 − β from 0.05 to 0.27, two-sided testing). Since the distributions of the first two attitudinal variables approximated a normal distribution, additional t tests for dependent samples were calculated to confirm the nonparametrically obtained results. The t-values were positive and negative, statistically not significant with very low effect sizes close to zero (attitudes toward the design of inclusive mathematics instruction: t = −0.857, df = 38, n.s., d = 0.14; attitudes toward the effects of inclusive mathematics instruction: t = 0.322, df = 38, n.s., d = 0.05). However, small effects could not be statistically verified due to small sample size and lack of statistical power (1 − β < 0.15).

Discussion

Effects of blended learning

Intensive use of online materials can be achieved by combining them with face-to-face workshop activities. In this study, 91 teachers participated in six workshops, in which six quality dimensions of successful in-service teacher training (competence orientation, participant orientation, teaching–learning variety, case-based learning, stimulation of cooperation, and promotion of reflection) were consistently realized. To promote theory–practice transfer, practical tasks were to be worked and reflected on during the workshops. For this purpose, tandems were formed, consisting of a general teacher and a teacher for special needs education, if possible.

Positive changes from pre-test to post-test were observed in all attitudinal and self-efficacy measures. Attitudes toward the effects of inclusive mathematics instruction and the influence of student behavior on inclusive mathematics instruction showed little change and were not statistically significant. Participants still consider the learning prerequisites and behavior that learners bring to the classroom to be crucial and they evaluate the possibilities of inclusive mathematics instruction in a similar way as before the in-service training. In contrast, they evaluate their ability to design inclusive math lessons better after the training than before. This finding is also found in the self-efficacy expectations because teachers are much more confident about their options for differentiated and individualized mathematics instruction in general and for children with specific learning difficulties in particular. They do not change their rating of inclusive mathematics instruction in general but they do change the rating of their options for action in a positive and statistically significant manner with sufficient statistical power.

In the DCLE project, we were therefore able to create a blended learning in-service teacher training in which we succeeded in supporting professional readiness to act by improving attitudes to inclusive mathematics instruction and associated self-efficacy expectations. The blended learning format seems to be effective because all measured values change in the direction assumed. In addition, it is consistent that the highest effects appear precisely for the attitudes toward the organization of inclusive mathematics instruction and for the general and specific self-efficacy expectations, which are the variables that are highly relevant for action from our point of view. These effects are statistically well established (1 − β reaches an almost perfect value).

The success of blended learning is also documented in the meta-analyses of Means et al. (2013) and of Bernard et al. (2014), both of which report mean effect sizes of about 0.3 standard deviations. The results of our experiment cannot be traced back to individual features of the program but can only be established summatively. The effectiveness of the program is probably due to the fact that it succeeded in increasing the intensity of the participants' interaction with the materials offered online during the workshops and during teaching in their own classrooms as well as in promoting and intensifying the interaction between teachers. Bernard et al. (2014) consider these two variables the key variables in blended learning. Future research should explore whether this is true for teacher education and in-service training in general and which of the six quality dimensions of our program are particularly important.

Effects of the online offer

In the first information session, the teachers in the waiting control group were made aware of the online resources available to them in the following five months. They were able to access multimedia materials on the didactics and methodology of inclusive mathematics instruction at primary level as well as concrete examples for inclusive lesson planning. The results from the second data collection showed statistically insignificant effects close to zero. Thus, in this sample, a pure online format without supporting activities proved to be less effective in changing attitudes and self-efficacy expectations concerning inclusive mathematics instruction. These findings are important because the transfer of professional knowledge into one's teaching often fails due to negative attitudes and low self-efficacy expectations.

The relatively low effectiveness of unsupported online offers has been empirically proven in international research. In the meta-analyses of Means et al. (2013) and of Bernard et al. (2014), individual studies with high effects are cited, but average effect sizes close to zero are reported. Our findings, therefore, correspond with previous findings, but still caution is advised when interpreting them. The use of the online materials in this study was voluntary. It was not possible to individually monitor who actively used which materials, and how often and how intensively they did that. Consequently, only statements relating to the overall group can be made. Although it can be determined that the unsupported online offer had little effect among the 68 teachers in the waiting control group, it is not possible to analyze in detail whether intensive use has any desirable effects. Future studies will have to clarify this question.

Moderator effects

Compared to the general effectiveness of the blended learning training, the effects of person-related variables were very small and statistically significant in only three cases. Concerning the variable self-efficacy with regard to general differentiation in inclusive mathematics instruction, there was a disordinal interaction with the intervention group. Group A achieved higher initial values in the pre-test than group B (M = 24.1 vs. M = 21.7), but its increase after the training (to M = 26.8) was smaller than that of group B (to M = 26.6). There was an ordinal interaction between the gender factor and both self-efficacy expectations: Female teachers were more reserved in their assessment of self-efficacy before and after the training than male teachers, but showed a higher increase in the post-test values (general differentiation: female teachers from M = 25.8 to M = 28.4; male teachers from M = 21.8 to M = 26.3/Differentiation for specific learning difficulties: male teachers from M = 16.8 to M = 18.8; female teachers from M = 14.2 to M = 17.1). These findings cannot be interpreted at this stage. Future projects will need to determine whether they describe random characteristics of the sample or whether they can be replicated.

The results show that the changes brought about by the training were much greater than any influence of other variables. The fact that this does not apply to attitudes toward the design of inclusive mathematics instruction, taking into account experiences with inclusion, could be due to the fact that everyone assessed this variable similarly before and after the training, regardless of their experience with inclusion.

Long term effects

For teachers in the field of inclusive education, online materials are useful because they can be prepared in a multimedia format and adaptively designed for practical use. In addition, they can be used free of charge at any time. This study shows that a change in attitudes and self-efficacy expectations could not be achieved by simply providing online materials but that the combination with face-to-face workshops incorporating the online materials was very successful. The follow-up tests show that these effects were still stable after six months. Consequently, the blended learning program has proven to be successful and may be recommended for practical use. Future research should clarify which features and activities of our program are particularly effective and whether the program is equally effective for all teachers under varying conditions or whether interaction effects have to be considered.

Conclusions, open questions and limitations

Inclusive teaching is demanding and many teachers do not feel well prepared. Teacher professional development has been found to be a key factor in the implementation of inclusive practices at the school and at the classroom level (Kefallinou et al., 2020). Providing theoretical information alone is of little help, since teachers have to develop teaching strategies adapted to the content to be learned and to the skills and competencies of the students. Starting with a model of teachers’ competencies, in which cognitive and affective dispositions are mediated by situation-specific skills of perception, interpretation, and decision-making and transformed into observable teacher performance (Blömeke et al., 2015), we designed an in-service teacher training guided by six quality dimensions that have empirically been shown to be conducive to effective teacher training (Barzel & Selter, 2015), focusing on competence development, participant orientation, and teaching–learning variety, and promoting case-based learning, cooperation, and reflection. Because the demands of inclusive teaching vary with the content of instruction, we combined six didactic principles of mathematics teaching with five curricular contents central to the development of mathematical understanding in the primary grades (Moser Opitz et al., 2017). In an effort to combine the effects of online learning with face-to-face interaction and with cooperative learning and reflection, we designed six workshops within a blended learning format.

We evaluated the effects of the blended learning program with two intervention groups of voluntarily participating teachers in a pre- and post-test design, tested whether effects could be replicated, and compared them to the effects of a pure online offer without support and assistance. We assessed two variables that have been shown to be relevant to the development of inclusive teaching: the teachers’ attitudes toward inclusive mathematics instruction and their self-efficacy expectations with regard to inclusive mathematics instruction. In the first group (n = 39), the online-only condition did not show any effects, whereas after blended learning positive changes from pre-test to post-test were observed in all attitudinal and self-efficacy measures. The effects were small for some subscales and could not be statistically verified due to small sample size, but they were higher and reached statistical significance for the attitudes toward the organization of inclusive mathematics instruction and for self-efficacy expectations in inclusive mathematics teaching. These results could be replicated in the second intervention group (n = 62). In both groups, the effects of blended learning were relatively high and more pronounced than any moderator effects of person-related variables.

Providing information on inclusive teaching online or in print may be a first step, but there is more to developing inclusive teaching skills and competencies. Our research shows that blended learning may be effective, combining the advantages of independent self-study and online resources with face-to-face interaction and cooperative learning. Our online resources and our in-service training program are specifically adapted to inclusive mathematics instruction. They have been well documented and are available, ready for implementation.

Field-based research has its limitations. It may be strong in terms of external validity, but it does not allow complete control of confounding factors in terms of internal validity. Both intervention groups had to be selected from two regions and on a voluntary basis. Because intervention studies require many human and material resources, it was not possible to conduct research with a larger sample. Due to small sample size, the effects of the moderator variables gender of participants, type and level of teaching certificate, certificate in mathematics teaching, and professional experience in general and with inclusive teaching did not reach statistical significance. They need further statistical validation.