Teachers’ procedural fluency and strategic competence

In the mathematics classroom, supporting students in solving problems with multiple solutions and comparing multiple solutions is an important and very demanding part of a teacher’s role (Richland et al., 2017; Smith & Stein, 2018). Teachers have to anticipate likely student responses and to prepare appropriate reactions or explanations. If the lesson plan includes independent seat work followed by a whole-class discussion, teachers have to be capable of singling out student approaches that are particularly suitable for presentation and joint comparison. In this instructional setting, they, moreover, need to be in the position to help the class “make mathematical connections between different students’ responses and between students’ responses and the key ideas” (Stein et al., 2008, p. 321). In order to identify and encourage such connections and comparisons, teachers themselves need to be able to solve the problems to be dealt with in flexible, meaningful, and creative ways (Vale & Pimentel, 2011). In other words, they have to possess sufficient procedural fluency and strategic competence so as to master the content they teach in a specific grade in a comprehensive way, building on a thorough understanding of the underlying mathematical structures and concepts (Ball & McDiarmid, 1996; Gitomer & Zisk, 2015).

It is for this reason that procedural fluency and strategic competence are important topics in preparation programs for mathematics teachers (Gravemeijer, 2002; National Council of Teachers of Mathematics, 2014; Niss, 2003). These subject-specific skills are also reflected, for example, in core standards for mathematics education in the German-speaking countries Switzerland (EDK, 2015), Austria (BGBl, 2021), and Germany (KMK, 2004), all of which have been influenced by the Principles and Standards for School Mathematics as defined by the National Council of Teachers of Mathematics (2005) (Schulz, 2010). However, teachers’ proficiency has to comprise more than merely knowing what their students are expected to learn (Shulman, 1986). In addition to pedagogical content knowledge (PCK), they need to develop more advanced kinds of content knowledge (CK) on which they can draw when they teach a specific subject at school (Ball et al., 2008). This is even more important as elementary teachers’ mathematical content knowledge is assumed to contribute to the quality of instruction (Charalambous et al., 2012; Stein et al., 1990) and to the students’ achievement in mathematics (Charalambous et al., 2020; Hill et al., 2005; Kelcey et al., 2019).

Concepts of number and operations with numbers, for example, lie at the heart of primary school mathematics. They lay important foundations for later stages of education in fields that apply or even explicitly build on mathematics (Kilpatrick, 2001). For this reason and following the content domain "numbers" from the TIMSS 2019 framework (Mullis & Martin, 2017), the further considerations focus on tasks with natural and rational numbers that teachers of grades 4–6 should be able to solve. Such tasks may require abilities to decontextualize or contextualize quantities (NCTM, 2000) in order to solve simple word problems (Schulz et al., 2020; Verschaffel et al., 2020) or to reason about relations between numbers (Carpenter et al., 2003; Schulz, 2018). Such specific requirements in elementary mathematics are hereafter referred to as operating and mathematizing with natural and rational numbers.

Procedural fluency and strategic competence, which are targeted in the present study, represent two core components of mathematical proficiency (Kilpatrick, 2001; National Council of Teachers of Mathematics, 2014) of learners and of teachers. Therefore, they should also be captured in assessments of teachers’ content knowledge (CK), for example, to track learning levels of prospective teachers for the purpose of planning subsequent support programs. In the following sections, procedural fluency and strategic competence are discussed, and their consideration in the findings on teachers' CK in operating and mathematizing with natural and rational numbers is reviewed. From this, the developmental goals and research questions of the present study are derived; they concern the development and validation of a new assessment instrument and the recording of prospective teachers' states of learning.

Procedural fluency

Procedural fluency (Hiebert, 1986; Rittle-Johnson et al., 2001; Schneider et al., 2011), i.e., meaningful, flexible, accurate and efficient use of procedures to solve problems, is interrelated with conceptual understanding, i.e., comprehension and connection of concepts, operations, and relations. For example, a good conceptual understanding of place value and relations between numbers and operations is related to fluency and flexibility in computing with multi-digit numbers (Fuson et al., 1997; Schulz, 2018) and decimal fractions (Steinle & Stacey, 1998), which contributes to estimating the results of a procedure (Lemonidis, 2016; Star et al., 2009). Procedural fluency is primarily characterized not by a quick and secure processing of calculations, but rather by the ability to reason flexibly and meaningfully about mathematical symbolic notations like numbers, operations, procedures, terms, equations, and formulas, in order to use them efficiently. Such kinds of reasoning may encompass a contextualization of mathematical notations by translating them into descriptions, actions, or images, which help to illustrate their basic characteristics. This may also be accomplished mentally through the activation or construction of mental models of mathematical notations. Bidirectional translation processes between mathematical notations and contexts, including mathematization and contextualization, bridge the gap between situated knowledge and formal mathematics and contribute to knowing when and how to use which procedure (Fischbein, 1994; Gravemeijer, 1999; Prediger, 2008; Schulz et al., 2020; Verschaffel, Greer, & De Corte, 2007).

Strategic competence

Strategic competence refers to the ability to formulate, represent, and solve mathematical problems. It depends on understanding the quantities involved in problems and their relationships as well as on fluency in solving non-routine problems (Kilpatrick, 2001). To represent a problem accurately, it is necessary to understand the situation, including its key features, by building a mental model or mathematical representation of its essential components and their structural relations (Gravemeijer, 1999; Schoenfeld, 1992; Verschaffel et al., 2020). This should avoid “number grabbing” (Kilpatrick, 2001, p. 124) and the superficial application of rules and formulas without understanding (Buys, 2008; Verschaffel, Greer, & De Corte, 2000). In elementary mathematics, strategic competence integrates heuristic strategies like making a drawing, using manipulatives, writing an equation, writing down examples and intermediate calculations systematically, or making use of other tangible representations like tables or diagrams (Diezmann, 2002; Lesh et al., 1987; O'Connell, 2000). Heuristic strategies, which themselves may be regarded as procedures, must be chosen and adapted flexibly. Hence, both procedural fluency and strategic competence are based on a flexible and meaningful use of procedural and conceptual knowledge components and are interrelated in a versatile manner (Hiebert, 1986; Star, 2005).

Existing findings from assessments for prospective teachers

Reviews of recent studies indicate that teachers as well as prospective teachers differ considerably in the level of their mathematical content knowledge (Blömeke, Hsieh, Kaiser, & Schmidt, 2014; Ma, 2010). It has repeatedly been reported that subpopulations of prospective teachers lack a solid understanding even of elementary mathematics (Auslander et al., 2019; Conference Board of the Mathematical Sciences [CBMS], 2012; Gruber, 2018; Niss, 2003). Depaepe and colleagues (2015) found that prospective elementary teachers had only limited procedural knowledge about rational numbers although it forms part of the curriculum for upper elementary school. Newton (2008) reported that prospective elementary teachers’ understanding of fractions, including flexibility and transfer, remained limited even after a course that had been designed to deepen their knowledge in this regard. Wasserman (2013) described novice teachers’ difficulty in categorizing and solving combinatorial problems. The results of other studies indicate that (prospective) teachers found it difficult to solve proportional and nonproportional problems (Rizvi & Lawson, 2007; Thompson & Bush, 2003; van Dooren et al., 2002). In their meta-analysis, Thanheiser et al. (2014) concluded that many prospective teachers tend to focus on procedures rather than concepts across several content areas.

It is generally emphasized that there is a need for targeted assessments of student teachers' skills and learning progress (Brabeck et al., 2016), and standards for the preparation of mathematics teachers also refer to procedural fluency and strategic competence (Rasch et al., 2020). However, there are only a few publications on corresponding studies and on assessment instruments for (prospective) teachers: Aguilar and Telese (2018) evaluated elementary pre-service teachers’ non-routine problem-solving activities by using a rubric. The five comprehensive tasks addressed logic, geometry, fractions, patterns, and algebraic generalizations. The authors reported a moderate level of student teachers’ procedural fluency, and that their problem-solving skills could be improved in that they should be able to use more efficient or sophisticated approaches to solve problems. Ubah and Ogbonnaya (2021) evaluated elementary school pre-service teachers’ solutions to pattern problem-solving tasks in an interview study by using content analysis. The authors reported that half of the pre-service teachers did not have the creative potential to prepare learners even after they had been exposed to advanced mathematics content as part of their teacher education. Altarawneh and Marei (2021) videotaped and analyzed preservice classroom teachers’ instructional performance. They reported that the level of performance in procedural fluency and strategic competence was weak, and that there was a positive moderate correlation between academic achievement and performance level. In summary, no assessment instrument could be found that specifically measures (prospective) elementary teachers’ procedural fluency and strategic competence, in particular with a focus on operating and mathematizing with natural and rational numbers. 
It is this concern that has constituted the starting point for the development of a specific instrument and the design of a connected empirical study, both of which are introduced in the following sections.

Objectives and research questions

In line with the development and research needs outlined in the previous section, the study presented in the following pursued two aims. The first aim consisted in developing an assessment instrument for capturing procedural fluency and strategic competence of elementary school student teachers with respect to tasks about natural and rational numbers in a variety of content domains that are intended to be dealt with at upper elementary school level (grades 4–6, including the transition to grade 7). The theoretical task analysis in the methods section serves the purpose of theoretical validation. The second aim was to apply the instrument in a sample of first-semester student teachers, which includes the empirical validation of the newly developed assessment instrument. The empirical study was intended to clarify

  • whether the tasks are sufficiently challenging for student teachers so as to stimulate the use of strategies and flexible ways of solving the problems,

  • which subdomains or tasks pertaining to procedural fluency and strategic competence with respect to tasks about natural and rational numbers prove particularly challenging for student teachers,

  • and whether the mathematical challenges in solving the tasks are based on linking mental models, strategies, and procedures flexibly rather than on a lack of computational confidence.

In this way, the study was expected to validate the assessment instrument, to complement the still fragmentary findings on student teachers’ procedural fluency and strategic competence with a focus on operating and mathematizing with natural and rational numbers, and to point to components of content knowledge that need to be acquired in preparation programs for elementary school teachers but are probably not mastered well enough (Auslander et al., 2019; Hart et al., 2016; Thanheiser et al., 2014). In concrete terms, the study intended to answer the following four research questions:

  (1) Are the tasks challenging enough for student teachers to encourage the use of procedural fluency and strategic competence?

  (2) Which subdomains or tasks included in the assessment of procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers prove to be particularly challenging for student teachers?

  (3) Do the analyses of solution paths confirm that the tasks promote flexibility, i.e., that the tasks allow for multiple solutions?

  (4) Do the analyses of solution paths confirm that errors and incomplete solutions are predominantly due to a lack of linking mental models, strategies and procedures flexibly rather than a lack of computational confidence?

Methods

Participants

In total, 280 first-semester student teachers from 16 regular teaching courses who were enrolled in a preparation program for elementary school teaching participated in the study. The set of tasks to be dealt with was identical for all student teachers within the same course group. The assessment took place during the obligatory attendance time, but participation was voluntary and anonymous. Teaching course content did not overlap with assessment content until the mid-semester testing time.

The educational background of all students in the cohort was as follows: 40.2% gained direct access to UAS (university of applied sciences) studies with high school diplomas and 14.2% with specialized baccalaureates (“Fachmatura”). All other students had to pass an entrance examination which included typical tasks of a high-school graduation exam (algebra, geometry, probability): 7.6% came from a specialized secondary school (3 years in addition to compulsory secondary school: “Fachmittelschule”), 3% had at least 3 years of professional experience after vocational training, and another 11.8% had acquired a professional baccalaureate after vocational training. Corresponding information was missing for 23.3% of the students. In summary, this means that at least 6 years have passed since elementary school (1st–6th grade) for all first-semester student teachers.

Assessment instrument

The newly developed assessment instrument consists of 15 different, specifically designed task formats about operating and mathematizing with natural and rational numbers. Their difficulty corresponds to the standard skills that are expected at upper primary school level (grades 4 to 6) or partially at the transition to secondary school (grade 7). With respect to the purpose of assessing procedural fluency and strategic competence, there were three criteria that had guided the development of the tasks: All tasks had to involve the need to make flexible adjustments to appropriate (a) mental models, (b) strategies, and (c) ways of thinking about a procedure or formula that can facilitate flexible, efficient, and non-algorithmic solutions of the tasks. Depending on the mathematical background of first-semester student teachers, the tasks can either represent novel problems that require the search for a suitable approach, or, as already familiar problems, they can be solved elegantly and efficiently with reduced computational effort and cognitive load.

Understanding multiplication and division and the ability to apply multiplicative thinking to a variety of situations which are founded on multiplication lays significant foundations for further learning of fractions, ratios, proportions, and algebraic relationships (Hurst & Hurrell, 2014; Lobato, Ellis, Charles, & Zbiek, 2010). Moreover, multiplicative thinking and mathematical reasoning are closely related at the transition from primary to secondary schooling (Callingham & Siemon, 2021). This argues for a special focus on multiplication and division in an assessment of procedural flexibility and strategic competence in operating and mathematizing with natural and rational numbers.

The following two subsections provide a detailed analysis of the tasks (Table 1) with respect to the three criteria (a–c), which serves the purpose of their theoretical validation. The use of strategies (b) is distinguished from thinking about procedures or formulas (c) as follows:

Table 1 Problem analysis Booklet AA—possible mental models, strategies, and reasoning about procedures

Strategies (b) denote a change of representation or a general heuristic. Representational changes can involve contextualization (e.g., translating a given calculation into a word problem, translating a multiplicative term into an area model) or de-contextualization (translating a textual task into an equation or calculation). If representational changes are made only in the mind, they would be referred to here as mental models (a). Heuristics can equally involve representational changes (translating a textual task into a sketch, representing relationships between quantities in tables), but also finding examples for combinatorial problems or (systematically) varying probe calculations.

Thinking about procedures or formulas (c) is not connected with a (recognizable) representational change or a generally usable heuristic. For example, in multiplicative term comparisons, only the last digits of the products are determined to decide whether an equation can be true. Or computational laws are used as orientation, e.g., that in products both factors may be changed multiplicatively inversely without changing the result. When converting terms, computational benefits are used to take advantage of particularly clever and computationally less complex ways of calculating.

Tasks with natural numbers

Flexibility in using numbers can be increased by mental arithmetic and estimation (Kilpatrick, 2001, p. 216; Star et al., 2009): “Estimation requires a flexibility of calculation that emphasizes adaptive reasoning and strategic competence, guided by children’s conceptual understanding of both the problem situation and the mathematics underlying the calculation.” Depending on the kind of estimation problem, a solution may include reasoning about the effect of an operation (Task A2) or about the principles of an operation (Task A3):

  • Example A2 (Table 1): 21,012 ÷ 222 = ? In which interval is the result? 0–10/10–100/100–1000, etc.

    A solution may (a) make use of the quotative mental model of division (“Does 222 fit a hundred times into 21,012?”), (b) rest on a strategy like contextualization (“222 euro are a little bit more than a hundredth of 21,012 euro”), or (c) include mental reasoning about intermediate results of the division algorithm.

  • Example A3 (Table 1): 880 · 670 = 900 · 650. Is the equation correct?

    A solution may (a) draw on an understanding of the equal sign as expressing an equivalence relation instead of calculating and comparing the two products, (b) make use of contextualization by drawing, sketching, or imagining (similar) areas on graph paper and identifying the difference between 88 · 67 and 90 · 65 (“88 times 65 overlap; compare two times 65 versus two times 88”), and (c) be based on reasoning about the last digits of the two (equivalent) products (0 versus nonzero) or about the inappropriateness of an inverse additive manipulation of two factors. Similar considerations would apply to an equation that requires division (Task A4 in Table 1).
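
The reasoning described for Examples A2 and A3 can be retraced numerically. The following Python sketch is for illustration only; the function name `magnitude_interval` is hypothetical and not part of the assessment instrument.

```python
def magnitude_interval(dividend: int, divisor: int) -> tuple:
    """Return the interval (low, high) of powers of ten that contains
    dividend / divisor, using only multiplication (quotative view:
    how often does the divisor fit into the dividend?)."""
    low, high = 0, 10
    # Move to the next interval while the divisor fits at least `high` times.
    while divisor * high <= dividend:
        low, high = high, high * 10
    return low, high


# Example A2: 222 fits more than 10 times but fewer than 100 times into 21,012.
print(magnitude_interval(21012, 222))

# Example A3 via the area model: 88 * 67 and 90 * 65 share the rectangle 88 * 65.
shared = 88 * 65
left = shared + 2 * 88   # 88 * 67 adds two strips of length 88
right = shared + 2 * 65  # 90 * 65 adds two strips of length 65
print(left == right)     # the equation cannot be correct
```

Neither check requires carrying out the full division or both multiplications, which mirrors the non-algorithmic solutions the tasks are meant to elicit.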

Adapting formulas for the solution of problems is another type of task whose solution may demand flexibility: A solution to Problem A1 (Table 1), for instance, can be found by solving the equation for an arithmetic mean for the missing number. This would represent a case of strategy use (b, c: finding and transforming an equation) but neglects that there are more elegant ways of solving the problem that require less computational effort: It might be recognized that 19, 20, and 21 together exceed the given average of 18 by 1 + 2 + 3 = 6. Therefore, the missing number must be 18 − 6 = 12. This alternative would (a) make use of an appropriate mental model of the mean (e.g., a bar chart with an average line) and (c) include flexible and meaningful reasoning about the formula of the arithmetic mean.
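
The two solution routes for Problem A1 can be contrasted in a short Python sketch. It assumes, as the numbers in the text suggest, that three of four values (19, 20, 21) and the mean 18 are given; both function names are hypothetical.

```python
def missing_value_by_balancing(known, mean):
    """Balance model: deviations above the mean must be offset below it."""
    surplus = sum(x - mean for x in known)  # (19-18) + (20-18) + (21-18)
    return mean - surplus


def missing_value_by_formula(known, mean):
    """Formal route: solve mean = (sum(known) + x) / n for x."""
    n = len(known) + 1  # assumption: exactly one value is missing
    return mean * n - sum(known)


print(missing_value_by_balancing([19, 20, 21], 18))
print(missing_value_by_formula([19, 20, 21], 18))
```

Both routes give the same result, but the balancing route only involves the small deviations 1, 2, and 3 rather than the sums 60 and 72.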

Word problems like A5 (Table 1) require a situational model, which may be constructed (a) mentally or (b) by using a strategy like drawing a sketch or a table. A table would depict the relations between the quantities. Likely starting numbers can be changed until the sum 50 is reached. Changing the starting numbers systematically would include (b, c) flexible and meaningful reasoning about the calculation. Finding an equation with variables would also be (b) an appropriate strategy, but (c) working out equations with variables is not expected at primary school level.

Mathematization of sequences (A6) and inverse-proportion problems (A7 in Table 1) require (a) an understanding of the situation and (b) the mathematization of the situation, which can involve suitable drawings, stepwise calculations, or equations that must be (c) manipulated flexibly and meaningfully. Similarly, combinatorics problems (A8 and A9 in Table 1) necessitate (a) an understanding of the situation. Appropriate (b) strategies for combinatorial problems in primary school can consist, for instance, in finding examples, looking for more examples, and counting them by means of systematic listing (English, 2005; Lockwood & Gibson, 2016).

Number system conversion problems (A10 in Table 1) are usually not dealt with in primary school classrooms. Nevertheless, they are usually discussed in teacher education programs for mathematics teaching in order to deepen (prospective) teachers’ understanding of the decimal number system (Krauthausen, 2018; Reiss & Schmieder, 2007). (a) Mental models include thinking in groups of the amount of the base number while (b) strategies and the flexible and meaningful adaptation of (c) formulas could consist in continued division and interpretation of the rest or multiplication of digits with place values.
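
Both conversion directions mentioned above can be sketched in Python. The number 23 and the base 5 are illustrative assumptions and are not taken from Task A10; the function names are hypothetical.

```python
def to_base(n: int, base: int) -> list:
    """Convert a non-negative integer into its digits in `base` by
    continued division, reading the remainders in reverse order."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]


def from_base(digits, base: int) -> int:
    """Multiply each digit by its place value and add up the results."""
    return sum(d * base ** i for i, d in enumerate(reversed(digits)))


digits = to_base(23, 5)      # 23 = 4 groups of five + 3 ones
print(digits)
print(from_base(digits, 5))  # back to the decimal representation
```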

Tasks with rational numbers

As regards procedural fluency and strategic competence, operations with fractions, decimal numbers, and percentages include types of problems that are known to be challenging for learners in school and even adults. Some of the division problems with decimals (A11 in Table 1) or fractions (A12) can be worked out with reference to (a) the quotative model for division (“How many times does 11 fit into 5 \(\frac{1}{2}\)? … a half time”) or the partitive model for division (“11 halves are to be shared among 11 persons. How much does every person get?”). These models are already indicative of (b) the availability of solution strategies, provided that suitable mental models for the interpretation of fractions (5 \(\frac{1}{2}\) = eleven halves, quasi-cardinal meaning of a fraction) or decimals (0.24 means 24 hundredths) are at hand as well. In the case of decimals, (c) rules for division that pertain to the decimal point or canceling of the quotient may be applied for calculation purposes (e.g., 0.24 ÷ 0.02 = 24 ÷ 2), which transforms the given problem into an easier problem.
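
The quasi-cardinal reading of decimals and fractions can be checked with exact arithmetic. This is a minimal sketch using Python's standard `fractions` module and the numbers quoted in the paragraph above.

```python
from fractions import Fraction

# Decimals as counts of hundredths: 0.24 = 24 hundredths, 0.02 = 2 hundredths.
# Quotative view: how many times do 2 hundredths fit into 24 hundredths?
decimal_quotient = Fraction("0.24") / Fraction("0.02")
print(decimal_quotient == Fraction(24, 2))

# Mixed number as a count of halves: 5 1/2 = 11 halves.
five_and_a_half = Fraction(11, 2)
# Partitive view: 11 halves shared among 11 persons gives one half each.
print(five_and_a_half / 11 == Fraction(1, 2))
```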

Percentage-of-a-fraction tasks (A13 in Table 1) are probably novel tasks for most first-semester student teachers because such kinds of problems are rarely or never included in textbooks for school mathematics. A solution may (a) interpret the given percentage as an operator and (b, c) transfer it into an instance of division of fractions, which can be regarded as quasi-cardinal units (25% of eight elevenths = eight elevenths divided by four). With mental models and strategies like these, the given problem turns into a comparatively simple problem, whereas a purely formal way of solving the problem (e.g., 25/100 · \(\frac{8}{11}\)) without recognition of helpful relations between numbers leads to a higher computational effort and cognitive load.
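
The difference in effort between the two routes can be made visible in a few lines of Python. The variable names are hypothetical; the numbers are those of the example in the text.

```python
from fractions import Fraction

# Operator view: 25% means "a quarter of", so eight elevenths divided by four
# is simply two elevenths (the unit "elevenths" stays unchanged).
by_units = Fraction(8 // 4, 11)

# Formal route: multiply by 25/100, which yields 200/1100 before reducing.
unreduced = (25 * 8, 100 * 11)
by_formula = Fraction(*unreduced)

print(by_units, unreduced, by_formula)
```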

An efficient ordering of fractions (A14 in Table 1) may include (a) mental models and (b) a flexible and meaningful adaptation of strategies such as interpreting fractions as part of a whole and comparing their size to suitable auxiliary numbers like 0.5 or 1 (Reinhold & Reiss, 2020): \(\frac{6}{13}\) is less than \(\frac{1}{2}\), whereas \(\frac{4}{7}\) is greater than \(\frac{1}{2}\). This way of reaching a solution is less complex than comparing \(\frac{6}{13}\) and \(\frac{4}{7}\) directly, that is, by finding the common denominator.
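
The benchmark strategy can be sketched as follows. The function name `compare_to_half` is hypothetical; the three fractions are those reported for Task 14 in the scoring description. The sketch only orders fractions that fall on different sides of the benchmark, which is the case here.

```python
from fractions import Fraction


def compare_to_half(numerator: int, denominator: int) -> int:
    """Return -1, 0, or 1 if the fraction is below, equal to, or above 1/2.
    Only doubling the numerator is needed; no common denominator."""
    doubled = 2 * numerator
    return (doubled > denominator) - (doubled < denominator)


task_fractions = [(4, 7), (6, 13), (13, 26)]
ordered = sorted(task_fractions, key=lambda f: compare_to_half(*f))
print(ordered)

# Cross-check against an exact comparison of the fraction values.
print(ordered == sorted(task_fractions, key=lambda f: Fraction(*f)))
```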

Percent-of-percent tasks (A15 in Table 1) require (a) an understanding of the situation and (b) mathematization of the situation, which might include an interpretation of the situation as a stepwise subtraction of percentages (e.g., “First, subtract 25% from 160; then subtract 25% from the new price”) or (c) taking advantage of the associative property of multiplication since (0.75 × 0.75) × 160 is more difficult to calculate than 0.75 × (0.75 × 160).
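
Both groupings of the calculation can be compared in a short Python sketch, using the numbers from the example above (a price of 160 reduced twice by 25%).

```python
price = 160

# Stepwise route: three quarters of the price, then three quarters of the new price.
after_first = price * 3 // 4         # 160 -> 120
after_second = after_first * 3 // 4  # 120 -> 90

# Regrouped route: combine both factors first, then apply them to the price.
combined_factor = 0.75 * 0.75        # 0.5625, harder to handle mentally
regrouped = combined_factor * price

print(after_first, after_second, regrouped)
```

The associative property guarantees the same result, but the stepwise route stays within whole numbers that are easy to handle mentally.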

These analyses of the tasks used illustrate that there is a wide variety of subject matter for operating and mathematizing at the end of elementary school (grades 4 to 6) or at the transition to secondary school (grade 7: multiplication and division with decimal numbers and percentages). Associated with this are a variety of suitable task formats that can be solved particularly skillfully and easily on the basis of procedural fluency and strategic competence. The tasks presented here represent essential aspects of operating and mathematizing with natural and rational numbers.

Assessment design

The assessment tasks were presented in the open-source learning management system “ILIAS,” with which all student teachers who attend the preparation program at the Zurich University of Teacher Education are familiar. Depending on the type of task, the booklets made use of one of six different response formats (see Table 1 and Table 5): numerical answers including natural numbers and decimals (Tasks 1, 6, 7, 8, 9, 10, 11, 15); multiple numerical answers (Task 5); text answers for (cancelled) fractions or decimals, for example in the form of “1/2” or “0.5,” or “0,5” (Tasks 12, 13); yes/no-answers (Tasks 3, 4); single-choice answers (Task 2); multiple single-choice answers for ordering fractions (Task 14).

Each task was scored 1 point for correct responses. In Task 14, in which three fractions had to be sorted by size, a total of 0.33 plus 0.33 plus 0.34 points were awarded: Thus, the correct order of the three fractions \(\frac{6}{13}\) < \(\frac{13}{26}\) < \(\frac{4}{7}\) resulted in one point. If none of the three fractions was given in the correct order, e.g., \(\frac{4}{7}\) < \(\frac{6}{13}\) < \(\frac{13}{26}\), this resulted in 0 points. If only the largest or the smallest fraction was given in the correct order, e.g., \(\frac{13}{26}\) < \(\frac{6}{13}\) < \(\frac{4}{7}\), this resulted in 0.33 and 0.34 points, respectively. Also, for the correct answer to Task 5, in which three numbers had to be given, 0.33 plus 0.33 plus 0.34 = 1 point total was awarded. All missing answers were coded as incorrect and thus given 0 points.
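
The partial-credit rule for Task 14 can be written as a small scoring function. This is a sketch under assumptions: the function name `score_ordering` is hypothetical, and the assignment of the weights to positions (0.34 for the smallest fraction, 0.33 for each of the other two) is one reading of the description above, not a documented detail of the ILIAS configuration.

```python
def score_ordering(response, solution, weights=(0.34, 0.33, 0.33)):
    """Award partial credit for each fraction placed at its correct position.
    Assumption: weights are given per position (smallest, middle, largest)."""
    if response is None:  # missing answers are scored as incorrect
        return 0.0
    points = sum(w for r, s, w in zip(response, solution, weights) if r == s)
    return round(points, 2)


solution = ["6/13", "13/26", "4/7"]
print(score_ordering(["6/13", "13/26", "4/7"], solution))  # fully correct
print(score_ordering(["4/7", "6/13", "13/26"], solution))  # no position correct
print(score_ordering(["13/26", "6/13", "4/7"], solution))  # only largest correct
print(score_ordering(None, solution))                      # missing answer
```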

The online assessment comprised four booklets with each of them consisting of 15 tasks. Each booklet included seven or eight tasks that were also part of two other booklets, which allowed concurrent calibration of all tasks in a single data set in the subsequent Rasch analysis (Morrison & Fitzpatrick, 1992). Booklet AA contained all A-Tasks in Table 1 and Booklet BB all B-Tasks in Table 5 (see Appendix) while Booklet AB consisted of Tasks A1–A7, A10 and B8, B9, B11–B15, and Booklet BA included Tasks B1–B7, B10 and A8, A9, A11–A15.
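
The linking structure of the four booklets can be verified with a few set operations. The following Python sketch rebuilds the booklets from the task lists given above; the helper name `tasks` is hypothetical.

```python
def tasks(prefix, numbers):
    """Build a set of task labels such as {'A1', 'A2', ...}."""
    return {f"{prefix}{n}" for n in numbers}


first_part = [1, 2, 3, 4, 5, 6, 7, 10]
second_part = [8, 9, 11, 12, 13, 14, 15]

booklets = {
    "AA": tasks("A", range(1, 16)),
    "BB": tasks("B", range(1, 16)),
    "AB": tasks("A", first_part) | tasks("B", second_part),
    "BA": tasks("B", first_part) | tasks("A", second_part),
}

# For each booklet, count the tasks it shares with every other booklet.
for name, items in booklets.items():
    overlaps = {other: len(items & other_items)
                for other, other_items in booklets.items() if other != name}
    print(name, len(items), overlaps)
```

Every booklet shares eight tasks with one booklet and seven with another, so all four are connected through common items, which is what makes a concurrent calibration on a single scale possible.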

The maximum test time was 45 min. Six of the 280 participants completed the test in less than 10 min while 15 student teachers needed more than 35 min. The median was 21.4 min.

Before they took the online test, all participants were encouraged to use paper and pencils to write down intermediate results, try out examples, or make drawings that might help them solve the tasks. These notes formed no integral part of the solution; only the results had to be entered into the online booklet. However, these notes were collected and used for the empirical analyses of solution pathways (see qualitative coding). These notes did not include student names to avoid otherwise necessary burdensome formal approval procedures for conducting the study. Therefore, the results of the qualitative coding of solution paths cannot be linked to the results of the quantitative analyses at the individual student level.

Not every student took written notes on every task. Some tasks could be solved by mental computation if one understood the problem sufficiently and found an appropriate way to solve it. It was assumed that the analysis of the notes provides particular evidence of considerations for solving tasks that are individually found to be challenging. Given the missing notes, the 10 of the 15 tasks for which the most notes were found in an initial subsample were selected for coding. These were the tasks (both A and B versions, see Tables 1 and 5): 1, 5, 6, 7, 8, 9, 11, 12, 13, 15. On average, a written note could be found for 7.00 of these 10 tasks (70%) across all 231 note sheets.

Analyses

Quantitative task analyses

The Rasch parameters of the items were calculated with the help of the ConQuest software (Wu, Ray, & Haldane, 2005). Weighted mean squares (MNSQ) for single items between 0.8 and 1.2 indicated a good fit while values below 1.3 were interpreted as acceptable fit to the Rasch model (Wright, Linacre, Gustafson, & Martin-Lof, 1994). The purpose of the calibration was to represent all items and persons on a common scale (logits), which allows the item difficulties of all four booklets to be compared. For the student teachers, maximum likelihood ability parameters (MLE) were estimated. The sum of all person parameters was set at 0. This allowed interpreting positive item difficulty parameters as being difficult (success probability < 0.5) for an average person and negative item difficulty parameters as being easy (success probability > 0.5) for an average person.
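
The link between logits and success probabilities follows from the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The Python sketch below illustrates this; the function name and the difficulty values are illustrative assumptions and are not taken from the ConQuest output.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model:
    P = exp(ability - difficulty) / (1 + exp(ability - difficulty))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


average_person = 0.0  # person parameters are centered at 0
for difficulty in (-1.0, 0.0, 1.0):  # illustrative item difficulties in logits
    p = rasch_probability(average_person, difficulty)
    print(f"difficulty {difficulty:+.1f}: success probability {p:.2f}")
```

For a person at the scale origin, an item with difficulty 0 is solved with probability 0.5; positive difficulties push the probability below 0.5 and negative difficulties above it.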

Qualitative coding of solution pathways

Coding of noted solution paths was performed with a deductively and inductively developed coding manual by two raters using a consensus procedure. In the first inductive step, distinguishable strategies (correct solution paths), errors (incorrect solution paths), and approaches (incomplete solution paths without indication of errors or the correct result) were identified and coded, and the codes were systematically sorted. A few solutions with correct results that were not comprehensible were coded separately (32 out of 951 correct responses in total). In the second deductive step, strategies were further distinguished as to whether they were appropriate for elementary school or whether procedures from secondary school (e.g., equations with variables) were used. For the errors, a distinction was made whether they were based on a computational error or on an inappropriate mathematization of the specific task. In the case of the approaches, a distinction was made as to whether these already contained a (first) mathematization appropriate to the task, or whether merely symbolic notations were noted in a superficial manner without any appropriate mathematization of the specific task.

Using this procedure, approximately 20% of the material was initially coded jointly by both raters. Examples for the codes can be found in the accompanying explanatory text to Tables 3 and 4. Most of the codes were found to be unambiguous and did not require further confirmation. The remaining 80% of the material was then coded by one rater, who marked all codes that might not be completely unambiguous. These marked codes were discussed in regular meetings of both raters and decided by consensus, partly adjusting the coding manual.

Results

Task difficulties

Research question 1 relates to whether the tasks are challenging enough. The data analyses (Table 2) showed that the solution frequencies of all tasks lay in a range between well above zero and approximately 0.8. The solution frequencies were distributed over the entire range, with the lowest solution frequency being 0.19 (Table 2: B10). Thus, all tasks proved to be challenging enough for student teachers to stimulate the use of strategies and flexible ways of solution.

Table 2 Task difficulty estimates (ordered from most difficult to easiest)

All tasks fitted the Rasch model, with a range of task difficulty estimates of 1.69 + 1.54 = 3.23 (logits). This ensures that students' abilities can be captured across a wide range of different proficiency levels. Solution frequencies below 0.5 or logits > 0 mark tasks where at least half of all students had little to no chance of answering the tasks correctly. The overall view of Table 2 shows that at least 20% of the students already had difficulties with solving word problems and percentage tasks. For at least 30% to slightly more than half of the students, multiplication and division tasks were challenging or unsolvable. At least 70% of the students failed most of the primary school-specific combinatorics tasks.

Subdomains with specific difficulties

Research question 2 analyzes which subdomains prove to be particularly challenging. An ordering of the tasks according to their difficulty as shown in Table 2 resulted in four groups of tasks with similar difficulty and similar content. Only two tasks could not be assigned to one of these four groups (A6, B7). The four groups indicate that there are subdomains or tasks in the assessment of procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers that can be characterized by specific overall difficulties and thus challenged first-semester elementary school student teachers to a greater or lesser extent:

  1. The two number system conversion tasks proved to be most difficult for first-semester student teachers with means of 0.19 and 0.20.

  2. The second group included all combinatorial tasks with means between 0.20 and 0.37.

  3. The third group comprised tasks with means between 0.43 and 0.69, all of which included multiplication and division with natural numbers, fractions, and decimals.

  4. Word problems including percentages, unknown reference sets, one of the two inverse-proportion problems, and the calculation of means proved to be comparatively easy with means between 0.69 and 0.79. Nevertheless, even the easiest task of the assessment, which asked for one missing number when the mean and all other numbers are given (Tables 2 and 5: B1), was not answered correctly by 21% of the participants.

No hypotheses comparing the difficulties of the task groups were formulated prior to the study. The clear ordering of most tasks into four almost non-overlapping difficulty groups was therefore surprising and is interpreted in the discussion section.

Multiple solutions

Research question 3 examines whether the 2 × 10 = 20 tasks (10 each in versions A and B) whose noted solution paths were coded qualitatively permit different ways of solving the problems. The analyses identified a total of 73 different correct solution paths (strategies). This corresponds to an average of 3.65 different solution paths per task and confirms that the newly developed assessment tasks allow multiple solutions.

Of the total of 919 comprehensible and coded correct solution pathways, 153 (16.7%) were classified as suitable only for the secondary level (e.g., equations with variables, see strategy S1m for tasks A1 and B1 in Table 3), and 766 (83.3%) as also suitable for the primary level. Complementing the previous theoretical task analyses, these findings suggest that the test tasks are suitable to assess student teachers’ flexible solutions regarding procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers at an upper elementary school level.

Table 3 Strategies, types of errors, and approaches for the means tasks (A1, B1)

The summarized results for research questions 3 and 4 are illustrated in the next two sections using an easy task format (means; Table 2: A1, B1) and a moderately difficult task format (percent of a fraction; A13, B13). The qualitative coding of solution pathways, as described above, distinguished strategies (S: correct solution paths), errors (E: incorrect solution paths), and approaches (A: incomplete solution paths without indication of errors or the correct result) for every task. In the next two sections, rather complex solution paths are listed first. Some of these can be regarded as standard solution paths without flexible adaptation of the calculations to the given tasks (S1m in Table 3, S1p in Table 4). The subsequent strategies increasingly exploit computational advantages offered by the numbers given in the tasks, which allow less complex solution paths (S2m-S4m in Table 3, S2p-S4p in Table 4). The listed error types begin with those characterized by a superficial and inappropriate mathematization of the given tasks (E1m in Table 3, E1p and E2p in Table 4), followed by calculation errors within a basically suitable mathematization of the tasks (E2m in Table 3, E3p in Table 4). Finally, the approaches begin with attempts that are rather formal or unnecessarily complex (A1m in Table 3, A1p in Table 4), followed by notes on calculation paths that tend to be more goal-oriented but are incomplete (A2m in Table 3, A2p in Table 4).

Table 4 Strategies, types of errors, and approaches for the percent of a fraction tasks (A13, B13)

Strategies, errors, and approaches for the means tasks

Table 3 shows the frequencies of strategies, errors, and approaches coded for the Means task format (A1, B1): S1m denotes solution paths in which an equation with x-variable for the requested number was set up and solved. For S2m, the difference between the required total sum of all four numbers and the partial sum of the three given numbers was calculated, yielding the solution number. For S3m, the respective deviations of the three given numbers from the mean were calculated and balanced with the fourth, searched number. S4m marks a systematic probing and S5m a correct result with an unclarified solution path. In the case of error types, E1m characterizes a superficial search for a solution path in which no goal-oriented calculation is recognizable. E2m denotes a calculation error with an overall goal-oriented solution path. In the approaches, A1m denotes a probing without finding a solution and A2m denotes an incomplete attempt involving the total sum of the four numbers.
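The strategies S2m, S3m, and S4m can be made concrete with a short sketch. The numbers below (three given values 12, 15, 20 and a required mean of 16) are hypothetical and are not the numbers used in tasks A1 or B1; the function names are likewise chosen for illustration only.

```python
def missing_by_total(given, mean):
    """S2m: required total of all numbers minus the partial sum of the given numbers."""
    required_total = mean * (len(given) + 1)
    return required_total - sum(given)


def missing_by_deviations(given, mean):
    """S3m: balance the deviations of the given numbers from the mean."""
    deviation_sum = sum(value - mean for value in given)
    return mean - deviation_sum


def missing_by_probing(given, mean, candidates):
    """S4m: systematically try candidates until the required total is reached."""
    required_total = mean * (len(given) + 1)
    for candidate in candidates:
        if sum(given) + candidate == required_total:
            return candidate
    return None  # no candidate fits (compare approach A1m)


# Hypothetical task: three of four numbers are given, and the mean of all four is 16.
given_numbers = [12, 15, 20]
target_mean = 16

print(missing_by_total(given_numbers, target_mean))
print(missing_by_deviations(given_numbers, target_mean))
print(missing_by_probing(given_numbers, target_mean, range(0, 31)))
```

All three functions arrive at the same missing number, which mirrors the point that one task admits several distinguishable correct solution paths.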

Strategies, errors, and approaches for the percent of a fraction tasks

Table 4 shows the frequencies of strategies, errors, and approaches coded for the Percent of a fraction task format (A13, B13): S1p denotes finding a solution by multiplying the initial fraction by a decimal, a percentage, or a fraction with denominator 100, e.g., \(\frac{8}{11}\cdot \frac{25}{100}\). In S2p, the fraction is multiplied by the percentage value converted to a reduced fraction, e.g., \(\frac{8}{11}\cdot \frac{1}{4}\). In the solution path S3p, the task was converted into a division task with a natural number, e.g., \(\frac{8}{11}\div 4\). S4p summarizes other ways of solving, including repeated halving to determine 25% of \(\frac{8}{11}\). S5p includes a correct result with an unclarified solution path. In the case of error types, E1p characterizes a superficial search for a solution path in which no goal-oriented calculation is recognizable. Error type E2p contains errors in the operationalization of the "percentage value of", e.g., if the reciprocal of the correct fraction was used, such as \(\frac{8}{11}\cdot \frac{4}{1}\). E3p denotes calculation errors when multiplying by decimal numbers or hundredths, e.g., when trying to calculate \(\frac{1}{100}\) of \(\frac{8}{11}\) first or to convert the fraction \(\frac{8}{11}\) into a decimal number. Notations of incomplete applications of the rule of three with the variable x or of terms without further transformation were summarized as approaches A1p. A2p denotes an incomplete attempt involving, e.g., a first notation of a reasonable multiplication by a decimal or percentage number.
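The equivalence of the strategies S1p to S4p can be checked with exact fraction arithmetic. The sketch below uses the example values from the text (25% of \(\frac{8}{11}\)); the variable names are chosen for illustration, and the code is not part of the assessment instrument.

```python
from fractions import Fraction

base = Fraction(8, 11)

# S1p: multiply by the percentage written as a fraction with denominator 100.
s1p = base * Fraction(25, 100)

# S2p: multiply by the percentage converted to a reduced fraction.
s2p = base * Fraction(1, 4)

# S3p: convert the task into a division by a natural number.
s3p = base / 4

# S4p (one variant): repeated halving, since 25% is half of a half.
s4p = base / 2 / 2

# E2p: the reciprocal of the correct fraction is used, which gives a wrong result.
e2p = base * Fraction(4, 1)

print(s1p, s2p, s3p, s4p)  # all four strategies give the same value
print(e2p)                 # differs from the correct result
```

The strategies differ only in computational effort: S3p and S4p exploit the fact that the numerator 8 is divisible by 4, whereas S1p requires reducing a larger fraction.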

Error analysis

Research question 4 examines whether the errors and approaches are predominantly attributable to a lack of linking mental models, strategies, and procedures flexibly rather than a lack of computational confidence. Of the total 522 identified errors, 372 (71.3%) were classified as an inappropriate mathematization of the specific task, e.g., E1m in Table 3 for tasks A1 and B1 (Means), E1p and E2p in Table 4 for tasks A13 and B13 (Percent of a fraction). In contrast with this, only 150 errors (28.7%) were coded as computational errors, e.g., E2m in Table 3, E3p in Table 4. This clearly shows that the most common sources of error are due to conceptual requirements in mathematizing rather than procedural requirements in formal computation.

Of the total 143 approaches, 116 (81.1%) were classified as merely symbolic notations in a superficial manner without any adequate mathematization of the specific task, e.g., A1m in Table 3, A1p in Table 4. In contrast with this, only 27 approaches (18.9%) were coded as having a (first) mathematization appropriate to the task, e.g., A2m in Table 3, A2p in Table 4.

Both results confirm that the errors and approaches found are primarily due to a lack of linking mental models, strategies, and procedures flexibly, which corresponds to an inadequate understanding of the problem situations and the solution paths. The challenge for the students when working on the tasks was obviously predominantly to find the appropriate operations (mathematizing) when searching for the solution paths and to apply them flexibly. In other words, the task solutions of the student teachers are based more on conceptual understanding as required in procedural fluency and strategic competence than on an application of automated computational skills.
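The proportions reported in this section follow directly from the stated counts. As a minimal sketch, the helper `share` below (a hypothetical name) recomputes them from the counts given in the text.

```python
def share(part: int, total: int) -> float:
    """Percentage share of part in total, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Counts as reported in the error analysis.
errors_total = 522
errors_mathematization = 372
errors_computational = 150

approaches_total = 143
approaches_superficial = 116
approaches_appropriate = 27

print(share(errors_mathematization, errors_total))      # inappropriate mathematization
print(share(errors_computational, errors_total))        # computational errors
print(share(approaches_superficial, approaches_total))  # superficial symbolic notations
print(share(approaches_appropriate, approaches_total))  # appropriate first mathematization
```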

Discussion

Theoretical and empirical validation of the assessment instrument for procedural fluency and strategic competence

The detailed task analysis in Table 1 and in the methods section served to theoretically validate the newly developed tasks for assessing procedural fluency and strategic competence. It could be shown that the tasks allow for several solution pathways, which were characterized by a flexible combination of mental models, strategies, and reasoning about procedures or formulas. The empirical quantitative task analysis served to validate the assessment instrument by showing that the tasks were sufficiently challenging for student teachers to stimulate the use of strategies and reasoning about procedures or formulas. Finally, the empirical qualitative coding of solution paths, errors, and approaches confirmed the results of the previous theoretical task analysis: multiple solution paths could be identified that represent distinguishable and flexible combinations of mental models, strategies, and reasoning about procedures or formulas. The vast majority of errors were classified as incorrect mathematizations rather than computational errors. This underscores that the challenges of understanding the problem situation and translating the relationships among relevant quantities into adequate symbolic notations explain the item difficulties of the quantitative analyses. Such types of mathematical challenges relate to procedural fluency and strategic competence. In addition, most approaches were classified as purely symbolic notations in a superficial way without adequate mathematization of the specific task, which also indicates a lack of understanding of the problem situations and of adequate solution pathways. In summary, the theoretical and empirical analyses confirm that the newly developed assessment instrument captures student teachers’ procedural fluency and strategic competence, not just computational skills in elementary mathematics.

Challenging areas of operating and mathematizing with natural and rational numbers

The observed solution rates of the tasks ranged from 0.19 to 0.79, indicating that a large proportion of the 280 participating first-semester student teachers had severe difficulty in solving tasks that assess procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers at the level of upper elementary school mathematics. Some tasks on multiplication and division with natural and rational numbers could only be answered correctly by less than half of the students. All tasks on combinatorics and number conversion were solved by only about 20–40% of all students (Table 2). Assuming that most participants had not actively concerned themselves with mathematics since they had completed their general education, it seems probable that the solution frequencies (means) are, by and large, representative of the level of competence that they had reached by the end of secondary school or high school and with which they had entered the teacher education program. This assumption can serve as a starting point for the generation of further hypotheses that may help explain the item difficulties:

  1. Number conversion tasks are usually not part of the school curriculum. Nevertheless, the discussion of this type of task in teacher education programs is frequently used as an opportunity to deepen the understanding of the decimal number system (Krauthausen, 2018; Reiss & Schmieder, 2007). The low solution frequencies of about 0.2 confirm personal teaching experiences of the author that suggest that this topic poses a big challenge to many student teachers who intend to teach at primary school (Voermanek & Schulz, 2023). It seems plausible that the difficulty does not result from the demands of the computational procedures themselves but rather from a lack of understanding of the principles of place value systems that underlie the procedures.

  2. The kinds of combinatorics problems that were presented in the assessment involved small numbers as results and had not been designed to be solved with the help of combinatorial models (permutation, variation, combination) that are common in secondary school or high school. Instead, the problems suggested strategies that are common in primary school such as, for instance, finding examples and counting them systematically by listing (English, 2005; Lockwood & Gibson, 2016). In principle, some of the problems could have been solved by applying combinatorial models, but this would have required the additional generation of a suitable situational model, followed by proper mathematization. Irrespective of the approach that the participants had opted for, however, the necessity to understand the situation, to create a corresponding situational model, and to apply a suitable strategy like listing or mathematization might explain the low solution frequencies between 0.2 and 0.37.

  3. It had not been expected that all multiplication and division tasks with natural numbers, fractions, and decimals would form a single group with difficulties between 0.43 and 0.69. Overall, division problems tended to be more difficult than multiplication problems, in particular when division involved fractions and decimals. All student teachers had completed secondary school or graduated from high school where the problems and operations are much more complex and usually require an extensive use of calculus. Against this background, it seems that at least some of the newly developed assessment tasks were perceived as novel problems so that the participants were not able to draw on experience and familiar patterns. As set forth in the theoretical analysis of the tasks, most of these problems can be solved elegantly without notable computational effort. Such an approach requires the availability of an adequate mental model for the operations, however, and makes it necessary to find a suitable strategy for applying the procedures flexibly, meaningfully, and creatively. Moreover, these components of mathematical knowledge and understanding have to be related to each other (Skemp, 2006). Problem solvers have to identify relationships between numbers and operations and make active use of them (Carpenter et al., 2005; Schulz, 2018).

  4. It had been expected that word problems, percentages, and the calculation of means would only pose a comparatively minor challenge to the student teachers because these types of tasks are common in school, at least as far as primary school and lower secondary school are concerned. As this need not necessarily be the case in higher secondary school or high school, however, it is plausible to assume that a long time had passed since the participants had last solved such tasks. Approximately a quarter of them had problems with these kinds of tasks. The theoretical and empirical analyses showed that solving them requires an understanding and mathematization of the situations as well as the application of strategies for transforming formulas or equations, which evidently proved challenging for some student teachers in the assessment context too.

The international state of research about (prospective) elementary teachers’ content knowledge includes findings that suggest that it tends to be rather limited (Thanheiser et al., 2014), for example, as regards rational numbers (Depaepe et al., 2015; Newton, 2008), combinatorics (Wasserman, 2013), and proportional and nonproportional problems (Rizvi & Lawson, 2007; Thompson & Bush, 2003; van Dooren et al., 2002) at the level that is targeted in the curriculum of upper elementary school. The present study extends these findings by providing specific results concerning prospective teachers’ knowledge of numbers and operations with natural and rational numbers, place value, computational strategies, estimation, (inverse) proportion, combinatorics, mathematization, and word problems. The newly devised instrument assessed procedural fluency and strategic competence in these domains by presenting tasks that require a flexible interrelation of mental models, strategies, and procedures if the problems are to be solved meaningfully, efficiently, and creatively.

Limitations

Although the assessment instrument already includes tasks from several important subdomains regarding procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers at the level of upper elementary school mathematics, the selection of problems and task formats inevitably remains limited and incomplete. The pool of tasks is therefore in need of extension. Furthermore, the sample of participants, prospective primary school teachers in their first semester at the Zurich University of Teacher Education, may limit the generalizability of the findings. At other universities, the composition of student teachers might be different, or the educational objectives for the target level to be taught may differ (Copur-Gencturk, Jacobson, & Rasiej, 2021).

Implications for teacher education

Procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers represent core components of mathematical proficiency of which elementary teachers need to have a deep understanding (Kilpatrick, 2001; National Council of Teachers of Mathematics, 2014). In summary, the findings underline the need for extended support in elementary mathematics for student teachers. The results suggest that during their initial education, many prospective teachers may not receive enough adequate opportunities to acquire in-depth knowledge of the mathematics that they will have to teach in their classrooms (Greenberg & Walsh, 2008; Masingila et al., 2012). One promising approach in this regard can consist in engaging student teachers in cognitively demanding and sufficiently challenging tasks without an obviously entailed solution path that focus on elementary mathematics and require complex and non-algorithmic thinking and, more generally, in providing them with opportunities to explore the nature of mathematical concepts, processes, or relationships (Aguilar & Telese, 2018; Auslander et al., 2019; Feldman et al., 2016; Stein et al., 2009).

Assessment tasks can point to aspects of teacher content knowledge that need to be acquired as part of professional preparation (Ball et al., 2008; Baumert et al., 2010; Döhrmann, Kaiser, & Blömeke, 2014; Phelps et al., 2020). Thus, an assessment such as the one that was implemented in the present study can be used for formative purposes too. It can provide first-semester student teachers with early feedback on the types of tasks or sub-tasks with which they may be struggling. In addition, comparisons of student teachers’ distinguishable solution pathways, monitored by teacher educators in specific courses, might develop content knowledge (CK) and pedagogical content knowledge (PCK) in tandem (Chamberlin et al., 2008; Norton, 2019). Such use of formative assessments in teacher education could provide opportunities not only to explore and deepen understanding of the concepts, processes, or relationships being assessed, but also to learn how comparing solution paths for tasks with multiple solutions can be an integral part of mathematics instruction and promote procedural fluency and strategic competence.

Experience with using the instrument for evaluation at the end of a program as well is currently being gathered at the Zurich University of Teacher Education. Future findings from using the assessment instrument for procedural fluency and strategic competence in operating and mathematizing with natural and rational numbers as both an initial diagnosis and a final test may illustrate, in later publications, the promising areas of application of the new instrument in the education of prospective teachers. In this way, the project presented here aims to make an innovative and relevant contribution to mathematics teacher education in the long term.