Introduction

The initial and in-service education of teachers is crucial to improving students’ learning (European Commission, 2010; Borko, 2004), and it is one of the most prominent topics in educational research. A specific and promising subdomain is that devoted to teacher learning in collaborative enquiry-oriented professional communities (Robutti et al., 2016; Zaslavsky et al., 2003; Stoll et al., 2006; Vescio et al., 2008).

Amongst possible approaches, lesson study (LS) is considered a powerful strategy for teacher learning (Doig & Groves, 2011; Cheung & Wong, 2014; Seleznyov, 2018). As a result, a growing interest in LS has been observed in the past 20 years. It can be traced, for instance, through a basic search on Scopus or the Web of Science databases.

Our work is particularly motivated by the difficulties that prospective teachers show, year after year, in activating their theory-based knowledge when they face professional tasks such as designing and/or implementing lessons based on said knowledge. This phenomenon cannot simply be explained using naïve arguments like prospective teachers’ lack of understanding of the theoretical knowledge, regardless of the fact that said understanding should always be considered fragile and in progress. It is undoubtedly related to the so-called theory-practice gap (Cochran-Smith & Lytle, 1999; Darling-Hammond, 2000; Korthagen, 2010).

According to existing research, LS could contribute to reducing this gap (see, for instance, Hourigan and Leavy, 2019; Pérez Gómez et al., 2015; Soto Gómez et al., 2016; Mayorga Fernández et al., 2021), leading to teacher preparation programmes based on strong theory-practice synergies and collaborative partnerships between universities and schools, which are considered to be highly effective (Helgevold & Wilkins, 2020).

At present, there is a solid research agenda that explores LS in initial teacher education (ITE). In a literature review, Ponte (2017) mentions several issues he found in research that deserve more attention, such as defining the aims and expected outcomes of the LS process, how relationships between participants are established, the problem of scaling-up or the adaptation and simplification of the LS process. He suggests that, since a single LS cannot achieve the whole spectrum of aims of prospective teachers’ preparation, LS in ITE must have a formative aim that should be explicitly stated. This is consistent with the 2011 OECD global summit report, which noted that ITE programmes should be built upon clear and concise profiles of what teachers are expected to know and be able to do (Helgevold & Wilkins, 2020).

Furthermore, in Larssen et al.’s (2018) literature review of LS in ITE, a lack of clarity was observed in the definition of teacher learning and in the use of learning theories. However, a strong orientation towards social and collaborative perspectives on learning, focusing on enquiry and reflection processes, was also observed. They suggest that research should be more explicit about how teacher learning is defined and observed in LS processes.

According to Ponte (2017) and Helgevold and Wilkins (2020), researchers should be clear about the formative aim of the teacher preparation programmes they design and test, and about what teachers are expected to know and do with such knowledge. In our study, following the tradition in our institution, we explicitly build upon the Theory of Didactical Situations (TDS) (Brousseau, 2002). In other words, we aim to explore how LS might help prospective teachers activate their knowledge about the TDS for the purpose of designing and implementing lessons based on this theory. The choice of a theoretical framework regarding the teaching and learning of mathematics is not irrelevant. As stated in García et al. (2019), the theory assumed critically affects the work of teachers in an LS process and their professional learning. This idea will be further developed in Sect. 2.1, where we will discuss the use of the notion of didactic paradigms recently introduced into the framework of the Anthropological Theory of the Didactic (Bosch et al., 2020). We thus primarily seek to make a contribution with respect to the role of theories in LS.

Larssen et al. (2018) suggest that researchers should be explicit about the teacher learning perspective they adopt. Since the prospective teachers involved in this study had already completed a course about the TDS (see Sect. 3.1), our main aim was not to find out how prospective teachers build such knowledge, but whether LS could increase their ability to use it in real contexts. For this purpose, we consider that the notion of self-efficacy, referred to in Bandura’s social learning theory, is especially appropriate, as we will discuss in Sect. 2.4.

The research question explored in this paper could, therefore, be formulated as follows: How can a TDS-oriented LS process contribute to increasing prospective teachers’ self-efficacy to plan and teach TDS-based lessons, reducing the theory-practice gap?

In what follows, we discuss the role the explicit use of theories could play when planning and conducting LS, building on the idea of didactic paradigms. We focus on a specific theory, the TDS, showing how it could be re-interpreted in terms of a possible didactic paradigm, and how it might affect an LS process. We also very briefly comment on how the notion of self-efficacy is considered and used in our research. We then move on to the specific features of our research: (1) the design and implementation of an LS experience within the paradigm of the TDS; (2) the development of a quantitative instrument to measure teachers’ self-efficacy to plan and teach TDS-based lessons and (3) the presentation and discussion of the results of a pre–post study with prospective teachers that engaged in the LS process and with prospective teachers that followed a different educational route. Finally, we draw some conclusions about the benefits of LS and suggest lines for future research.

Theoretical framework

In this section, we introduce different theoretical tools that will be used to plan and conduct our research, to formulate our results and to draw conclusions from them.

Lesson study, theories and didactic paradigms

LS is a collaborative and research-oriented practice aimed primarily at developing teachers’ knowledge and skills. There are several descriptions of LS as a cyclical process (Murata, 2011; Shimizu, 2014; Doig et al., 2011; Takahashi & McDougal, 2016). However, all of them share some common features: (1) The teachers’ work focuses on a theme or research question that concerns student learning; (2) there is an elaborate design of a teaching proposal based on a thorough enquiry process of curricular materials, existing resources, research results, etc.; (3) the teaching proposal is taught in a real classroom by a member of the LS group and observed by the rest of the teachers and (4) a group discussion, focused on student learning rather than on teacher actions, and that may include input from an external expert (such as a university professor or an experienced teacher), takes place at the end of the process.

As Elliott (2012) suggested, LS in Japan seems to be under-theorised. Stigler and Hiebert (2016) elaborated further on Elliott’s position, noting that “much of the theory behind LS is implicit, and also bound up with wider beliefs about teaching and learning” (p. 583). Wood (2020), for his part, considers that more attention should be paid to the theories of learning through LS, and to how its implementation leads to teacher learning.

In an attempt to make theories explicit, the aim of this section is to specify what Stigler and Hiebert (2016) called implicit beliefs about teaching and learning behind LS. To do so, we propose the notion of didactic paradigm recently introduced by Gascón and Nicolás (2019; 2021) within the Anthropological Theory of the Didactic (Bosch et al., 2020), which further develops the initial notion of didactic paradigms introduced by Chevallard (2015). According to these authors, the notion arises from the assumption that the epistemological model implicitly or explicitly assumed within a didactic institution strongly conditions the didactical practices that take place in it. In fact, each institution not only considers an epistemological model about mathematics, or about a specific mathematical domain, but also adopts and promotes a set of teaching ends (for instance, student understanding of mathematical objects or developing problem-solving skills in students). It also creates or selects a set of means considered useful to achieve these ends. Both the teaching ends and the means are connected with and critically conditioned by the epistemological and didactic model of mathematics that exists in this institution. Besides, very often, these ends and means, as well as the epistemological and didactic model behind them, emerge as a reaction to some undesirable didactic facts (such as rote learning). These four elements (epistemological and didactic model, teaching ends, means and phenomena) characterise the didactic paradigm assumed by a didactic institution. Didactic paradigms are cultural constructions. They can reflect how the teaching of mathematics has become organised in an institution. Besides, from a research perspective, researchers can build idealised didactic paradigms based, for instance, on the assumptions of a specific theory. This will be the case in this study, in which we will consider a didactic paradigm that derives from the TDS, as we explain in the next section.

The notion of didactic paradigm, and the role it could play in LS, expands upon the idea developed in García et al. (2019). In that paper, the authors analysed how the epistemological and didactic models assumed by an LS community shape crucial aspects of the LS process. Some of these aspects could be the kind of research questions that are admissible, how those research questions can be formulated, the nature of the tasks considered to build the lesson plan or what is worth discussing in the post-lesson discussion. We believe that the notion of didactic paradigm could be relevant to researchers to uncover the existing beliefs about teaching and learning within any LS process (Stigler & Hiebert, 2016) and to question the role they play.

For instance, it could be useful to re-interpret, in terms of conflicts between existing and/or intended didactic paradigms, some of the issues different authors have found in their research on LS. Thus, several authors (Groves et al., 2016; Clivaz and Miyakawa, 2020; Asami-Johansson, 2021; Takahashi and McDougal, 2016) report issues when implementing LS in countries other than Japan, which could be interpreted as signs of tensions and contradictions between the didactic paradigm adopted by the LS communities in these studies and the “Japanese paradigm”, the so-called structured problem-solving approach (Hino, 2007, 2015).

Moreover, it is also common to encounter research that reports LS processes taking place under the conditions and principles of didactic paradigms other than the Japanese one (e.g. Wake et al., 2016; Schoenfeld et al., 2019; Huang et al., 2016; Jessen et al., 2023), which adds distinctive features to the process.

Finally, in the case of the Japanese LS, in which the structured problem-solving paradigm is explicitly adopted, concern is expressed about explaining the role this paradigm plays in how LS is conceived. For instance, Fujii (2016, 2018) discusses the interplay between LS and the Japanese paradigm, considered as two sides of the same coin. Inoue (2011) and Takahashi (2008) focus on the neriage phase (Footnote 1), and Tan (2021) on bansho (Footnote 2), as significant and distinctive features of the “Japanese paradigm”.

As a conclusion, we hypothesise that behind every LS process, there is a didactic paradigm, implicitly or explicitly assumed by the LS community, which shapes the process and determines teacher learning. The notion of didactic paradigm refines the notion of “theories” and/or “culture” in LS, offering researchers a different perspective to question them within the context of LS.

In what follows, we will explicitly adopt a didactic paradigm, and, in accordance with our hypothesis, we will re-interpret LS based on said paradigm and link it with the learning of future teachers.

The Theory of Didactical Situations as a didactic paradigm

The TDS originated in France in the 1970s as a research theory (Brousseau & Warfield, 2014). However, its historical evolution has given rise to a set of principles and tools that could be re-interpreted as a potential didactic paradigm, which is not exactly the theory itself. It is this interpretation of the TDS that we will develop in this section.

Following Radford (2008), we can summarise the key features of the underlying epistemological model of the TDS: (1) knowledge results as the “optimal” solution to a certain situation or problem; (2) according to Piaget’s genetic epistemology, learning is interpreted as a form of cognitive adaptation; (3) for every piece of mathematical knowledge, there is a family of situations that give it an appropriate meaning and (4) student autonomy is a necessary condition for the genuine learning of mathematics.

The TDS teaching ends could briefly be summarised as the meaningful learning of mathematical concepts, assuming that a concept has a meaning for students if they consider it to be the best solution within a problematic situation. As Clivaz (2015) states:

“The fundamental hypothesis of the TDS is that ‘every mathematical concept is the solution of at least one specific system of mathematical conditions, which itself can be interpreted by at least one situation’ (Brousseau & Warfield, 2014, p. 165). Therefore, the knowledge can be modelled in a fundamental situation, or a family of situations, that will preserve and even give back the sense of this knowledge” (Clivaz, 2015, p. 247).

To uncover the means the TDS paradigm proposes to reach these ends, it is crucial to consider the notions of didactical (Footnote 3) situations and adidactical situations, represented in Fig. 1. A didactical situation is “a project organised so as to cause one or some students to appropriate some piece of mathematical reference knowledge” (Brousseau & Warfield, 2014, p. 163). Adidactical situations are a special kind of didactical situation, in which the “teacher’s specific intentions are successfully hidden from the students and the student can function without the teacher’s intervention” (Warfield, 2014, p. 12). In adidactical situations, students engage in the most independent and fruitful possible interaction with the proposed learning environment (Brousseau, 2002). When students face such situations, they have to produce statements and discuss their validity; make decisions; formulate hypotheses, predicting and judging their consequences; and communicate, produce and organise models, arguments and proofs. They also have to evaluate and correct themselves by looking at the feedback provided by the learning environment, without the teacher’s intervention (Brousseau & Warfield, 2014).

TDS-based lessons are structured around a set of adidactical situations that offer a proper representation of the mathematical knowledge at stake. A careful design of the learning environment (called the didactical milieu) is necessary so that: (a) it offers a challenge students could face by using some mathematics they already know; (b) changes introduced by the teacher in this milieu could make the students’ approaches fail (or at least become less effective or harder); (c) students experience the failure (or limitations) of their approaches by themselves, thanks to the feedback they get from the situation; (d) as a reaction, students look for new strategies to deal with the situation, testing them against the milieu (which validates their appropriateness) and (e) the optimal solution under the given conditions is based on the mathematical knowledge the teacher wants them to learn (thus, learning emerges as the best possible adaptation to a challenging environment).

Three main didactical tasks characterise the teacher’s actions in TDS-based teaching: (1) the devolution of the situation, ensuring that students accept engaging in the situation (Brousseau & Warfield, 2014); (2) the management of the situation through strategic interventions in key aspects of the milieu (controlling the so-called didactical variables) and (3) the institutionalisation, detaching the optimal strategy from the situation and presenting the canonical form of the mathematical knowledge involved, and drawing conclusions for the organisation of further sequences (Brousseau & Warfield, 2014).

To plan and conduct their teaching within the TDS paradigm, future teachers need to develop their knowledge of the meaning of the mathematical objects they want their students to learn. They also need to develop their skills to design appropriate situations, by making decisions regarding key features of the milieu associated with them. Besides, teachers need to develop their ability to manage these situations in actual classrooms, being effective in conducting the devolution of the situations and modifying the didactical variables in situ to provoke the desired adaptations of students’ solutions. All this knowledge and all these skills are related to the design and management of the learning milieu. For the purpose of our study, we will refer to them as belonging to the teacher–milieu dimension (see ① in Fig. 1).

Fig. 1 Adidactical situations, didactical situations and teachers’ dimensions (based on Clivaz, 2015, p. 249). ① Teacher–milieu dimension; ② Teacher–student dimension

Teachers also need to anticipate the kind of strategies students could use when facing problems within each situation, especially to identify those students could initially use (base strategies), and how these might evolve under the given conditions towards the optimal strategy. We will refer to all of these as belonging to the teacher–student dimension (see ② in Fig. 1).

Teacher–milieu and teacher–student dimensions are obviously connected: the teacher needs to establish links between the progression of students’ strategies and the management of situations. For analytical purposes, however, we differentiate between them in the context of this study.

Lesson study based on the Theory of Didactical Situations

We will use TDS-LS to refer to an LS process explicitly designed and conducted under the principles of the TDS paradigm, aimed at developing teachers’ knowledge and skills to plan and teach in accordance with them.

In TDS-LS, teachers potentially construct specific professional knowledge that, for analytical purposes, we structured into two different phases: the design phase and the implementation phase. In the design phase, which goes from formulating the research question to writing a lesson plan, including the research on existing materials, teachers could potentially create and/or expand their knowledge about:

  • The epistemology of the mathematical objects at stake (the meaning of mathematical objects).

  • The conditions that situations should have to provoke the learning of a specific piece of mathematical knowledge (fundamental situations).

  • How the learning milieu of a situation should be structured so that it offers meaningful feedback to student actions without the teacher’s intervention (the adidactical nature of the situation).

  • The features of the situation (milieu) teachers can modify to provoke certain adaptations of student behaviour (didactical variables), making previous strategies fail and provoking the emergence of new ones.

  • Students’ potential strategies when facing a situation.

  • How to combine strategies and didactical variables so that teachers can make wise decisions that allow or block certain strategies.

  • How the situation could be structured at the beginning and the instructions they should give so that students accept engaging in it (devolution of the situation).

How this knowledge relates to specific constructs from the TDS is mentioned in brackets.

In the second phase, related to the implementation of the research lesson, the observation of the classroom, the post-lesson discussion and the final reflection, teachers find opportunities to improve on previous knowledge while building new knowledge connected with:

  • The actual devolution of the situation, especially actions to ensure that students understand the situation and want to engage in it (for instance, how to communicate the problem to students without giving information about the mathematics the teacher wants them to learn).

  • The identification of student strategies in vivo, within an adidactical situation, connecting them with specific features of the situation and the learning milieu.

  • How to regulate student interactions within the situation, by making decisions in situ regarding didactical variables, especially if the students’ adaptation to the given situation is not happening as planned, or if the teacher wants to provoke a different adaptation to the situation.

  • How to summarise from students’ strategies. Once the teacher considers the students have established the relation envisaged to the mathematical knowledge, his/her intervention is crucial to help the students isolate it from circumstantial aspects of the situation. Thus, the teacher makes the knowledge the students have already constructed explicit, and incorporates it into the mathematical repertoire of the group (institutionalisation phase).

The notion of self-efficacy and connections with a specific didactic paradigm

In our research, the notion of teacher self-efficacy is considered an instrumental choice in order to clearly set out the teacher learning perspective we adopt, following the recommendation of Larssen et al. (2018). Although it is a well-known and widely accepted construct, in this section, we will briefly describe how we interpret it and how we relate it to the notion of didactic paradigm for the sake of our study.

Bandura (1977) defines self-efficacy as people’s beliefs in their “capabilities to organise and execute the courses of action required to produce given attainments” (p. 3). According to Tschannen-Moran et al. (1998), “efficacy expectation is the individual’s conviction that he or she can orchestrate the necessary actions to perform a given task” (p. 210). In the context of education, teacher self-efficacy belief can be interpreted as “a judgment of his or her capabilities to bring about desired outcomes of student engagement and learning” (Tschannen-Moran & Hoy, 2001, p. 783).

Teachers’ sense of efficacy is not necessarily uniform across the many different types of tasks teachers have to face, nor across different subject matters (Tschannen-Moran & Hoy, 2001). In fact, self-efficacy affects the kind of tasks a person tends to choose, the effort and persistence he/she puts in completing them, as well as the emotional relations he/she establishes with them (Gonzalez-DeHass & Willems, 2012).

Teachers face many kinds of professional tasks. Here we concentrate on the ones related to the planning of lessons and their teaching, which are strongly connected with LS experiences. From a domain-specific perspective of teachers’ self-efficacy (Klassen et al., 2011), we consider that planning and teaching tasks are framed within a didactic paradigm which, in turn, introduces some important distinctive features. Since teacher self-efficacy is task-dependent (Tschannen-Moran & Hoy, 2001), it is first of all necessary to identify the most important and specific tasks related to planning and teaching within a certain didactic paradigm. The second step is to find out how confident teachers feel about tackling and effectively accomplishing those tasks.

Since our study focuses on the TDS paradigm, an instrument to measure teachers’ self-efficacy to perform specific types of tasks connected with planning and teaching TDS-based lessons needs to be developed. This instrument, used before and after the TDS-LS experience, will help us clarify the impact of LS on the domain-specific self-efficacy beliefs of the prospective teachers participating in our study (see Sect. 3.3).

Methodology

In this section, we will explain the context in which the LS experience took place, the design of the LS cycles, the development of a questionnaire to measure prospective teachers’ self-efficacy within the TDS paradigm and the analysis of the data we collected.

Background and groups involved in the study

Our study focuses on prospective teachers enrolled in a 4-year Bachelor’s degree in early childhood education at the Universidad de Jaén (Spain). The prospective teachers take a Mathematics Education Course (MEdC) on the teaching of mathematics in early childhood education in their 3rd year (Footnote 4). In this course, they are introduced to the TDS paradigm, mainly from a theoretical perspective (learning about the TDS; mathematical knowledge to be taught in early childhood education; studying the adidactical situations to teach this knowledge, as well as didactical variables within these situations; knowing about students’ strategies and their evolution when they face these situations, etc.). The future teachers also explore particular examples of TDS-based teaching situations and engage in didactic analysis practices (that might include, for instance, the analysis of short videos of students facing a situation and examples of students’ work). Yet, they do not have the opportunity to experience actual teaching practices during this course.

Once the MEdC finishes, the prospective teachers engage in practicum at early childhood schools. Their first practicum takes place in their 3rd year (as soon as the MEdC finishes) and lasts 10 weeks. The second practicum is organised in the 4th year of their Bachelor’s degree and lasts 15 weeks (Fig. 2). During the practicums, the future teachers have the opportunity to experience TDS teaching (which ultimately depends on their mentors at the early childhood schools), but there is no direct supervision from the university educators that taught the MEdC or from any mathematics education experts.

Fig. 2 Structure of the Bachelor’s degree in early childhood education at the Universidad de Jaén (Spain)

The sample of our study (LS group, LSG) consists of 47 prospective teachers in their 4th year who enrolled in the optional course where LS was implemented. They had already completed the MEdC as well as the two practicums. They engaged in the TDS-LS cycles from February to May 2018 (12 out of 15 weeks).

Our study includes a second group, which also consisted of 47 prospective teachers, chosen in an unbiased manner, who were in their 3rd year and had just finished the MEdC. While the LS experience was being implemented, the prospective teachers in this second group engaged in their first practicum (10 weeks, see Fig. 2). We will refer to this group as the practicum group (PG).

Design and implementation of the TDS-LS cycles

The 47 prospective teachers of the LSG worked in eight groups (seven groups of six members and one group of five members) in the context of an optional course in the 4th year of their degree. From the beginning, we made it clear that the LS would take place within the framework of the TDS paradigm. We asked the pre-service students to bring the learning materials of the MEdC (see Fig. 2) in which they had already learnt about the teaching and learning of mathematics in early childhood education from the perspective of the TDS.

The authors of this paper played a dual role, which was sometimes difficult to differentiate: as university lecturers (mentor role) facilitating the students’ work and offering guidance, and as experts in teacher education (knowledgeable other role). We acted as experts in the TDS paradigm, giving strategic feedback in the design phase, occasionally correcting the future teachers’ proposals if important mistakes were detected and introducing key questions and/or ideas during the post-lesson discussion to try to keep it focused on important issues and connected to key ideas of the assumed didactic paradigm.

Different templates we designed beforehand were used during every phase of the LS process. This kind of support is found in other LS experiences with prospective teachers (e.g. Sims and Walsh, 2009; Leavy and Hourigan, 2018). The templates include relevant features of the TDS paradigm the future teachers should consider when writing their lesson plans, in the observation of the research lesson, in the post-lesson discussion and in their final reflection. We provide extra information about these templates below, as we describe the different phases.

The first challenge we met was the formulation of the research question. According to Sims and Walsh (2009), prospective teachers’ lack of experience in teaching and student learning could hinder their capacity to identify and formulate meaningful questions and to follow the whole LS process. Because of that, during the first 2 weeks (Fig. 3), we introduced some general ideas about LS and the different phases of an LS process (two sessions of an hour each). We also reviewed some notions of the TDS paradigm and revisited the main features of some fundamental situations. To keep the work of the prospective teachers focused, we suggested that they should restrict their work to the following mathematical knowledge in early childhood education: enumeration (Footnote 5), cardinal numbers and numbering, ordinal numbers and the introduction to addition. Thus, each group started working independently under our supervision (two sessions of an hour each), selecting a mathematical topic from the ones mentioned above, the level (Footnote 6), and planning some questions and diagnostic activities to enquire about students’ knowledge about this topic. Afterwards, the prospective teachers visited a school and met their future students. They carried out their activities in an informal manner (a 1-h session). The idea was to offer the prospective teachers the opportunity to get some first-hand insight into what the students knew and were able to do, hence facilitating the formulation of meaningful research questions. Finally, they returned to the university to share their first insights with their lecturers and peers (a 1-h session).

Fig. 3 Structure of the LS experience with pre-service early childhood education teachers

After those 2 weeks, the prospective teachers worked at the university for 5 weeks (three sessions of 1 h each per week, approximately 15 h in total, see Fig. 3). They worked mainly in groups, although, from time to time, we generated some collective moments to share key ideas or to introduce some knowledge if the groups were struggling with something. During those weeks, they formulated and polished their research question, examined existing resources (documents they had from the MEdC about the TDS and others facilitated by us) and wrote the lesson plans of their research lessons (research and design phases). They used a template we designed, which included key features of the TDS paradigm they had to consider and develop: a general description of the adidactical situation and of the milieu, justification of the adidactical nature of the milieu, didactical variables managed to generate a sequence of situations, the kind of adidactical situations proposed (action, formulation and validation), the relationship between the decisions made regarding the didactical variables and the approaches the students might use, students’ mistakes that might appear during the implementation of the lesson and possible reactions of the teacher to them.

Once they completed their designs, we accompanied the prospective teachers to early childhood schools where they taught their research lessons. A post-lesson discussion took place immediately after the lesson (each group spent approximately 3 h on both activities, Fig. 3). Only the members of the same group and the authors of this paper, as their mentors, attended the research lesson and took part in the post-lesson discussion. We gave the group members a template to take notes as they observed the lesson and to keep their observation focused. It included entries like: how the devolution occurred; the students’ approaches they observed; the students’ mistakes, challenges and mental blocks; how the teacher managed the lesson; and any other comments they considered worth taking down.

We video-recorded the research lesson, while the post-lesson discussion was audio-taped. A few days later, we held a second post-lesson discussion, using an edited version of the video recording of the lesson we prepared for them. With the aim of reducing the length of the lesson (from over an hour to 20–30 min on average), we removed the fragments in which nothing interesting occurred (for instance, when the pre-service teacher welcomed the students, or when she/he prepared the physical milieu before introducing a new situation). Moreover, we selected key episodes of the research lesson like the moments when the devolution of the situation took place, moments when the students activated different approaches to solve the situation or moments in which, in our opinion, there were interesting student–student or student–teacher interactions. This second post-lesson discussion was more informal than the first one and took place at the university. It mainly focused on meaningful aspects of the first lesson worth considering in the second LS cycle.

The first cycle concluded with the writing of a final report, also by making use of a template. First, the teachers had to compare their design with what actually happened in the research lesson in terms of the following: how the devolution took place, how the didactical variables were managed, which approaches the students used, what mistakes the students made and the teacher’s interventions during the class. Second, we asked the prospective teachers to explain to what extent they had answered their research question, based on the evidence gathered during the research lesson and the post-lesson discussion. Finally, they had to decide if they wanted to keep their research question (if they considered that they did not get a satisfactory answer), or if they wanted to formulate a new research question derived from the previous one, based on what they had observed during the first cycle. They were also invited to write down their first thoughts about possible changes in their designs with a view to a new LS cycle.

The second cycle lasted 4 weeks (Fig. 3). The prospective teachers revisited the previous lesson plan for 2 weeks, either revising and polishing their former research question or, if they opted for a new research question, using the former one as the starting point for its design. They engaged in a new enquiry process and wrote a new lesson plan, employing a template similar to the one used in the first cycle (approximately 6 h, Fig. 3). In all cases, the teachers developed new lessons based on new milieus, although these could resemble the previous lessons.

The prospective teachers then taught the new research lesson to the same students they had worked with before, and participated in a new post-lesson discussion (3 h per group, Fig. 3). By teaching the new lesson to the same students, the future teachers had the opportunity to link their designs to what they had learnt in the first cycle. Moreover, they could experience student learning as a continuous process based on what students know, and focused on developing their knowledge further, which is fundamental to the TDS paradigm.

The LS experience ended with the viewing of an edited version of the video of the second research lesson and the writing of a final report (approximately 3 h, Fig. 3). The templates used here were the same as in the first cycle. Even though a third cycle did not take place, we asked the prospective teachers to keep on thinking about how the lesson had worked, to what extent they had found answers to their research questions, and in which direction they might continue. We thus wanted to emphasise that LS is a long-term strategy that prospective teachers could use over longer periods to learn about student learning and to develop their professional knowledge.

Instrument and data collecting

Teachers’ self-efficacy is related to specific tasks (Tschannen-Moran et al., 1998). It is interpreted as a judgement teachers make about their ability to provoke some kind of student engagement and learning (Tschannen-Moran & Hoy, 2001). Since this engagement and learning, as well as the professional tasks a teacher faces, are directly connected to the didactic paradigm the teacher uses implicitly or explicitly, a specific instrument that captures the particularities of the paradigm employed needs to be developed.

Following Bandura (2006), we phrased items as “I can do…” sentences, followed by a kind of professional task typical of designing or teaching within the TDS paradigm (see Table 3 for some examples). The participants had to express their level of confidence to carry out the task described in each item, using the scale represented in Fig. 4.

Fig. 4 Response scale (Bandura, 2006, p. 312)

The initial version of the questionnaire included 29 items and was filled in by 139 prospective teachers who had already completed the MEdC, but were not subjects of our study. After a factor analysis of the data (using SPSS Statistics version 24), we reformulated some items and removed others to improve statistical consistency. The final version contains 26 items.

We used factor analysis to explore and define the underlying structure of the data matrix. This led to the formulation of some “factor dimensions” that made it possible to explain the data using fewer concepts, the so-called factors. First, we investigated the appropriateness of the factor analysis by analysing the correlation matrix, which revealed that all correlations were significant. Second, we checked the significance of the correlation matrix using Bartlett’s test of sphericity (which gave a result of χ²(325) = 4336.688, p < 0.001, confirming the significance), as well as the Kaiser–Meyer–Olkin (KMO) test (which gave a value of 0.952, higher than 0.5, and hence showed the appropriateness of the factor analysis).
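As an illustration of how Bartlett’s statistic is obtained, the following minimal sketch computes it from a correlation matrix using the standard formula χ² = −(n − 1 − (2p + 5)/6) ln|R|, with p(p − 1)/2 degrees of freedom. For p = 26 items this gives the 325 degrees of freedom reported above. The function name and the toy correlation matrix are hypothetical; the analysis in this study was run in SPSS, not with this code.

```python
import math


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity.

    corr  : square correlation matrix given as a list of lists
    n_obs : number of respondents
    """
    p = len(corr)
    a = [row[:] for row in corr]  # work on a copy
    log_det = 0.0
    # Log-determinant via Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        log_det += math.log(abs(a[col][col]))
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p):
                a[r][c] -= factor * a[col][c]
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return chi_square, dof


# Hypothetical 3-item correlation matrix, for illustration only.
toy_corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
chi_square, dof = bartlett_sphericity(toy_corr, n_obs=139)
print(round(chi_square, 2), dof)
```

A large statistic relative to its degrees of freedom indicates that the correlation matrix differs from the identity matrix, which is the precondition for a meaningful factor analysis.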

Once these preliminary steps confirmed the viability of the factor analysis, we found two factors that explained 68.149% of the total variance, according to the latent root criterion. The first factor explained 36.782% of the total variance and displayed a highly positive correlation with 15 items related to the planning activity. We named it the planning domain factor. The second factor explained 31.367% of the total variance and had a high positive correlation with 11 items related to the teaching activity. We labelled it the teaching domain factor. The factors were extracted by means of the principal components method, followed by a VARIMAX rotation to simplify their interpretation. Moreover, all variables showed communalities greater than 0.5, which justified their inclusion in the factor analysis. Lastly, we calculated the reliability of the additive scales (generated by both factors) using Cronbach’s alpha to determine their internal consistency, obtaining 0.96 for the planning factor and 0.95 for the teaching factor.
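For readers less familiar with Cronbach’s alpha, the sketch below shows the usual computation, α = k/(k − 1) · (1 − Σ item variances / variance of the total score), where k is the number of items in the scale. The function name and the scores are hypothetical and serve only as an illustration; they are not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a scale.

    responses: list of respondents, each a list with one score per item.
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical confidence scores of four respondents on three items.
toy_scores = [
    [70, 80, 75],
    [40, 50, 45],
    [90, 85, 95],
    [60, 55, 65],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))
```

Values close to 1 indicate that the items of a scale vary together, which is what the coefficients of 0.96 and 0.95 express for the planning and teaching scales.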

In conformity with the TDS paradigm, the teacher–milieu (T–M) and teacher–student (T–S) domains are fundamental to our analysis. As the factor analysis did not group items together in accordance with these dimensions, we decided to consider them as additional sub-scales, allocating items to either one or the other dimension from a theoretical viewpoint. Table 1 summarises the number of items assigned to each sub-scale as well as their Cronbach’s alpha (in brackets) indicating that the four sub-scales are reliable.

Table 1 Number of items within each sub-scale and Cronbach’s alpha (in brackets)

In summary, the items of the questionnaire can be grouped into two different domains, planning and teaching, which emerge from the factor analysis. They can also be grouped into two dimensions, T–M and T–S, introduced from a theoretical perspective. Table 2 summarises the topics addressed within each domain and dimension, while Table 3 exemplifies some of the items. The structure of the questionnaire will be used to interpret the data and draw conclusions.

Table 2 Structure of the questionnaire: topics addressed within each domain of activity and dimension
Table 3 Examples of items within the questionnaire following the structure “I can…”

We administered the questionnaire to the LSG and the PG at two different moments: during the pre-test, immediately before starting the LS and the practicum, respectively; during the post-test, when the LS and the practicum had both finished. Since both groups had already completed the MEdC when they filled in the questionnaire (pre-test stage), their knowledge about the TDS paradigm could be considered equal. This statement is supported by the data we present below. However, the starting point of both groups was not exactly the same, because the prospective teachers in the LSG were in their 4th year and had already finished two practicums, while the prospective teachers of the PG were about to start their first practicum (see Fig. 2). Therefore, the research here presented should be understood as a quasi-experimental study because, from an orthodox perspective of a randomised control trial, the PG does not entirely fulfil the conditions of a control group. However, despite the limitations of quasi-experimental studies, we consider that the contrast between the two groups may offer some interesting insight into LS compared to other practice-based experiences in ITE.

To assess the impact of these experiences on the prospective teachers, we checked whether there were significant differences between the means of the LSG and the PG using a two-factor ANOVA with repeated measures. Besides studying the statistical significance, we also analysed the effect size of the differences using Cohen’s d. Beforehand, we conducted a descriptive statistical study to check whether the groups were homogeneous with respect to the initial values of the scale. Table 4 presents the descriptive analysis of scores within each scale, depending on the group. Furthermore, the Student’s t-test for independent samples shows no statistically significant difference between the groups in any of the scales, which means that there is no significant bias in the composition of the groups and that they are homogeneous concerning these variables.

Table 4 Descriptive and comparative scores before the intervention in each group
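As a reminder of what Cohen’s d measures, the following minimal sketch computes it as the difference between two means divided by a pooled standard deviation. This is one common variant and is an assumption here, since other variants exist for repeated measures. The function name and the scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```

Because d expresses the difference in units of standard deviation, it allows changes to be compared across scales and groups regardless of sample size, which a p value alone does not.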

Results

First of all, we analysed whether there were significant differences between the two stages in the LSG. Table 5 shows the differences of means between pre-test and post-test in the LSG for each variable (planning, planning (T–M), planning (T–S), teaching, teaching (T–M) and teaching (T–S)), as well as the p values for each difference (see also Fig. 5). We also carried out the same calculations with the PG (also displayed in Table 5; Fig. 5).

Table 5 Pre–post differences within each group

Comparing the p values (p < 0.001) with the established level of significance (5%), we conclude that the differences between the two stages in the LSG are significant. This is also observed in the PG.

Second, we studied the possible differences between the PG and the LSG at each stage (pre and post), considering the PG as a kind of “control group”. Table 6 gathers the differences of means between both groups in the pre-test and the post-test for each variable (planning, planning (T–M), planning (T–S), teaching, teaching (T–M) and teaching (T–S)), as well as the p value for each difference (see also Fig. 5).

Table 6 LSG–PG differences of means at the same stage

According to the p values in Table 6, the LSG–PG differences are not statistically significant in the pre-test, whereas they are statistically significant in the post-test (compared with the established 5% level of significance).

Fig. 5 Evolution of PG and LSG over time and within each scale

Besides the study of significant differences of means, we also calculated the effect size as a necessary complement to the hypothesis test, because it offers a more meaningful comparison of the existing differences between the pre-test and post-test in both groups. In other words, we computed the effect size of LS in the LSG compared with the evolution of the PG. To this end, we used Cohen’s d coefficient. The results are displayed in Fig. 6.

Fig. 6 Pre–post effect size within each group in each domain and sub-dimensions

The effect size results show that the self-efficacy increase in the LSG is greater than in the PG in all dimensions. Nevertheless, we also calculated the global effect size (Fig. 7) to find out the global impact of LS in the LSG compared to the evolution of the PG. This measure is interesting because it puts together both the LSG and the PG at both the pre-test and the post-test stages. The results show that, even though both groups experienced an increase in their self-efficacy, LS had a higher impact on prospective teachers than the practicum.

Fig. 7 Global effect size of LS in the LSG compared to the PG

Discussion

Following the results, discussion will be structured in two strands. The first one, longitudinal, will focus on the evolution of each group along their experiences. The second one, cross-sectional, will focus on the differences between both groups at the beginning and at the end of their experiences.

Evolution of each group

A first global look at the results (Table 5; Fig. 5) reveals that the LSG and the PG reported an increase in the prospective teachers’ self-efficacy perception, both in the planning and teaching domains and in the T–M and T–S dimensions within them.

When we focus on the evolution of the LSG, the first important outcome is that there is a significant increase in the prospective teachers’ self-efficacy to plan and teach within the TDS paradigm in all the variables. Therefore, we can state that the TDS-LS experience had an important impact on prospective teachers, in line with other studies like Mostofo’s (2013), focused on perceived teacher self-efficacy, Hourigan & Leavy’s (2019), focused on prospective teacher perceived learning, or Jessen et al.’s (2023), focused on TDS-based lessons in the context of a LS experience.

According to Chong & Kong’s (2012) qualitative study, the LS approach offers specific structures and processes that support teacher efficacy. In another study, using a repeated measures approach like ours, Schipper et al. (2018) also reported a self-efficacy increase. Puchner and Taylor (2006) found, in two case studies, that the planning of the research lesson and the observation of student performance in LS might support the development of teachers’ efficacy. The results of our study are in line with these findings. The novelty is that, on the one hand, we offer quantitative evidence (compared to Chong & Kong’s and to Puchner & Taylor’s studies), and, on the other hand, we analyse self-efficacy related to a specific didactic paradigm in mathematics education, something that we could barely find in any other research.

Considering the different domains and dimensions of our study, the results show that the increments in the teaching and planning domains are similar. This suggests that the LS experience contributed to a balanced development of the prospective teachers’ TDS-related self-efficacy beliefs. It also points to a potential reduction of the theory-to-practice gap, since the higher the levels of self-efficacy are, the more likely it is that prospective teachers engage in planning and teaching TDS-based lessons.

Another outcome is that there is also a significant increase in the perceived self-efficacy of the PG (see Table 5), which is somewhat unexpected. It could be explained in terms of the experiences they had during the practicum period. Although the practicum was not the focus of our research, we considered it as a potential contextual variable, and included some extra questions about it in the post-test of the PG. According to their answers, 59.6% of the prospective teachers in the PG reported having observed TDS-based teaching (vicarious experience) during their practicum, 57.4% reported having engaged in TDS-based teaching (mastery experience), and 46.8% reported having both observed and taught TDS-based lessons. This might explain, to some extent, why their self-efficacy increased, although this cannot be conclusive because a supposedly similar practicum experience of the LSG before the LS experience seemed to have had no impact. This raises a very interesting issue that should be considered in future research.

Differences between groups

If we compare the results of the LSG with those of the PG in the pre-test, we see that the scores are similar, there being no significant differences between them. Consequently, despite the dissimilarities described as regards the background of both groups, the data support that they were quite similar in terms of their relation to the TDS paradigm and the associated self-efficacy beliefs when their practice-based experience started, giving soundness to a possible comparison between them.

However, in the post-test, the differences between both groups are statistically significant (see Table 6; Fig. 5). This means that, although both LS and practicum increased the prospective teachers’ self-efficacy, the impact of the former is higher. The effect size measures confirm this statement: while we measured a medium effect size of the practice-oriented experience the PG had (Cohen’s d coefficient between 0.20 and 0.50), there was a large effect size (according to Cohen, 1988) and even a very large effect size (according to Sawilowsky, 2009) of the LS experience in the LSG. Effect size measures are conclusive about the higher impact of the TDS-LS experience versus practicum, clearly supporting the benefits of LS in ITE, despite the unexpected growth of the PG we have already discussed.

A possible reason, amongst others, for the greater increase in the LSG could be the fact that LS was explicitly designed and implemented around a specific didactic paradigm, compared with the practicum experience in our study. Besides, going back again to the sources of self-efficacy, Watson and Marschall (2019) found, consistently with the previous studies, that teacher self-efficacy is acquired through vicarious experience, verbal persuasion and mastery experience. LS can be seen as an infrastructure that supports such experiences. LS can make prospective teachers experience successful teaching episodes (mastery experiences), while also fostering the observation of such episodes (vicarious experiences). Besides, moments of verbal persuasion arise throughout the whole process, from the initial formulation of the research question to the planning phase, but mainly in the post-lesson discussion. However, and from our experience, in order to maximise the potential of LS, particularly in the case of ITE, the role of university lecturers, in their dual position as mentors and knowledgeable others, seems to be crucial (which has also been discussed elsewhere, like in Baldry and Foster, 2019, Ni Shuilleabhain and Bjuland, 2019, or in Jessen et al., 2023). In the case of the future teachers in our study, who had already had access to the theoretical knowledge they would need to plan and teach TDS-based lessons, we could experience firsthand the difficulties and doubts they had at each stage of the LS. Our support and guidance, by means of the templates we prepared, as well as directly during the sessions, seem to have been crucial to make the LS experience relevant to them. We hypothesise that this explains, to a certain extent, the higher impact measured in the LSG.

Conclusions

Our study is embedded in the current research agenda about teacher learning through collaboration (Robutti et al., 2016) and, in particular, about LS and the connections between theoretical and practical knowledge.

The first contribution of our research is related to the problem of theorising LS. We have argued that LS is always connected, implicitly or explicitly, with a didactic paradigm, that is, with an underlying epistemological and didactic model, a set of teaching ends it promotes, the means proposed to achieve these ends and, occasionally, certain didactic facts the paradigm reacts to. The didactic paradigm assumed has a direct impact on how LS is organised and conducted, and on the kind of professional knowledge teachers build from it. Therefore, from a research perspective, we advocate the explicit consideration and questioning of the didactic paradigm adopted by the LS community (the TDS paradigm, in our research).

Second, after two LS cycles conducted with early childhood education prospective teachers, the TDS-LS experience had a positive impact on the prospective teachers’ self-efficacy beliefs to plan and teach within the TDS paradigm. Chong and Kong (2012) state that high self-efficacy values are predictors of behaviours. Hoy and Spero (2005) claim that teachers’ efficacy beliefs appear to affect the effort teachers invest in teaching, their level of aspiration and the goals they set. Therefore, we can conclude that, after their participation in the LS experience, it is more likely that prospective teachers engage in planning and teaching TDS-based lessons, which could contribute to reducing the theory-to-practice gap.

Third, LS seems to have a higher impact on prospective teachers’ self-efficacy than other practice-based teacher education actions. In our study, the comparison between the LSG and the PG clearly shows the benefits of LS.

We can also identify limitations, which might give rise to some future lines of enquiry. An important issue that needs further consideration is that both groups (LSG and PG) started with similar self-efficacy measures, regardless of the fact that they were in different years of their degree programme, and, in both cases (LS and practicum), there was a statistically significant increase. The LS experience obtained better results, but it remains unclear what would have happened if the PG practicum experience had been less of a vicarious and more of a mastery experience (Bandura, 1977), with direct and explicit supervision with regard to planning and teaching within the TDS paradigm (besides the MEdC, which is common to both groups).

Another limitation is connected with the fact that the practicum experience did not seem to have the same impact in both groups. There seems to be a contradiction: one practicum period in the PG had a higher impact than two periods in the LSG. We need to consider other variables, like the interval of time between the practicum and the moment the prospective teachers completed the questionnaires, its nearness in time to the MEdC, where the prospective teachers learnt about the TDS paradigm, or issues such as the short-term and long-term effects of any teacher education intervention.

One future line of research could be the replication of the study considering other didactic paradigms. This might contribute to providing insight into the dependency between teacher learning in LS and the paradigm adopted by the LS group. This replication might require important but interesting adaptations in some of the tools, which, in turn, could help reveal the connections that might exist between the LS process and the paradigm adopted, in contexts other than the TDS paradigm.

Another line of research could be related to finding connections between self-efficacy beliefs and teachers’ knowledge growth. Although self-efficacy seems to be an “elusive construct” (Tschannen-Moran & Hoy, 2001), Thomson et al. (2017) have found significant correlations between prospective teachers’ self-efficacy beliefs and their pedagogical content knowledge. We are also interested in finding potential connections between paradigm-specific teachers’ self-efficacy beliefs and the development of teachers’ knowledge and skills. In this regard, the qualitative analysis of the LS cycles we are carrying out could contribute to determining whether the LS experience enhanced the development of prospective teachers’ knowledge, as was the case with their self-efficacy beliefs. This might link our research with previous studies (Ni Shuilleabhain, 2016; Leavy & Hourigan, 2018; Clivaz & Ni Shuilleabhain, 2019; Hourigan & Leavy, 2019) that have already documented teachers’ growth in different dimensions of their content knowledge for teaching (Ball et al., 2008). However, the novelty would be to explicitly consider the didactic paradigm involved in the LS experience.