Evaluating mathematics lessons for cognitive demand: Applying a discursive lens to the process of achieving inter-rater reliability

Journal of Mathematics Teacher Education

Abstract

In this study, we examine “what went wrong” in our professional development program for encouraging cognitively demanding instruction, focusing on the difficulties we encountered in using an observational tool for evaluating this type of instruction and in reaching inter-rater reliability. We do so through the lens of a discursive theory of teaching and learning. Data consisted of 10 coders’ coding sheets, produced while they were learning to apply the Coding Rubric for Video Observations tool to a set of recorded mathematics lessons. We show that some discrepancies between novice coders and experts were found in narratives about valued actions relating to social aspects of teaching, such as teacher explanations and students’ struggle and discussion. These were relatively easy to detect from the written evaluations in the coding sheets. More difficult to pinpoint were discrepancies between novices’ and experts’ stories about the mathematical objects discussed. These required re-observing the recorded lessons. The analysis revealed that novice coders referred to objects about which the observed teacher mainly explained calculations, whereas the experts searched for (and often signaled the absence of) mathematical objects that invited exploration. We discuss these findings based on a theory of different Pedagogical Discourses that form the background for expert and novice coders’ interpretations of the coding manual. We conclude with practical implications for the process of achieving inter-rater reliability on observation tools for cognitively demanding instruction.

Notes

  1. The course was introduced to students as an opportunity to learn about cognitively demanding instruction, to view multiple “real life” lessons, and to get experience with research practices of coding according to observation protocols. Students could choose whether to take the class after the project was introduced, and the vast majority of them chose to do so.

  2. The coding workshop included mathematics lessons from both elementary school and middle school. The number of lessons reported in this section covers both elementary- and middle-school lessons. However, the codings of the elementary-school lessons are not included in the present study, because some difficulties in reliability around these lessons may have stemmed from the teachers not being sufficiently familiar with the elementary-school mathematics curriculum.

  3. “Bar agreement” was calculated by collapsing levels in the rubric into “high” and “low” and calculating reliability on these collapsed levels. This is in accordance with similar measures of reliability performed in Stein et al. (2017). See more on these reliability calculations in Heyd-Metzuyanim et al. (2020).

  4. The full coding scheme can be found in Stein et al. (2017), Appendix B, DS24. In the findings section and the appendix, we specify the rubrics that are relevant to our study.

  5. One set of coding sheets included “double coding” because there were two tasks in the lesson; thus, we in fact had 8 sheets.

  6. This problem was adapted from a lesson plan developed by the Institute for Learning, University of Pittsburgh (https://www.nctm.org/Conferences-and-Professional-Development/Principles-to-Actions-Toolkit/Resources/9-MS-Brovey-CalingPlans2-LessonGuide/).

  7. Hebrew: Hok ha-pilug. Literally: the rule of distribution.
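
The “bar agreement” calculation described in Note 3 can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that reliability is computed as simple percent agreement, that the rubric is a 1–5 ordinal scale, and that levels 4 and above count as “high”. The function names, the cut-off `high_from`, and the sample codes are all hypothetical, and special codes such as 99 or N/A are not handled.

```python
from typing import Sequence


def collapse(level: int, high_from: int = 4) -> str:
    """Collapse an ordinal rubric level into 'high' or 'low'.

    high_from is a hypothetical cut-off; the source does not state
    which levels were treated as 'high'.
    """
    return "high" if level >= high_from else "low"


def exact_agreement(coder_a: Sequence[int], coder_b: Sequence[int]) -> float:
    """Share of items on which both coders chose the same rubric level."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def bar_agreement(
    coder_a: Sequence[int], coder_b: Sequence[int], high_from: int = 4
) -> float:
    """Share of items on which both coders agree after collapsing to high/low."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    matches = sum(
        collapse(a, high_from) == collapse(b, high_from)
        for a, b in zip(coder_a, coder_b)
    )
    return matches / len(coder_a)


# Invented codes for five lessons on a 1-5 rubric (illustration only).
novice = [3, 4, 2, 5, 3]
expert = [2, 5, 2, 4, 4]

print(f"exact agreement: {exact_agreement(novice, expert):.2f}")  # 0.20
print(f"bar agreement:   {bar_agreement(novice, expert):.2f}")    # 0.80
```

In this invented example the coders rarely choose the same level but mostly land on the same side of the cut-off, which is the kind of difference a collapsed measure is meant to absorb.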

References

  • Berliner, D. C. (1994). The wonder of exemplary performances. Creating Powerful Thinking in Teachers and Students, 1–42.

  • Boston, M. D. (2012). Assessing instructional quality in mathematics. The Elementary School Journal, 113(1), 76–104.

  • Boston, M. D., & Candela, A. G. (2018). The instructional quality assessment as a tool for reflecting on instructional practice. ZDM - Mathematics Education, 50(3), 427–444. https://doi.org/10.1007/s11858-018-0916-6

  • Boston, M. D., & Smith, M. S. (2009). Transforming secondary mathematics teaching: Increasing the cognitive demands of instructional tasks used in teachers’ classrooms. Journal for Research in Mathematics Education, 40(2), 119–156. https://doi.org/10.2307/40539329

  • Casbergue, R. M., Bedford, A. W., & Burstein, K. (2014). CLASS reliability training as professional development for preschool teachers. Journal of Research in Childhood Education, 28(4), 426–440. https://doi.org/10.1080/02568543.2014.944724

  • Copur-Gencturk, Y. (2015). The effects of changes in mathematical knowledge on teaching: A longitudinal study of teachers’ knowledge and instruction. Journal for Research in Mathematics Education, 46(3), 280–330. https://doi.org/10.5951/jresematheduc.46.3.0280

  • Candela, A. G., & Boston, M. D. (2022). Centering professional development around the Instructional Quality Assessment rubrics. Mathematics Teacher Educator, 10(3), 204–222.

  • Cortina, K. S., Miller, K. F., McKenzie, R., & Epstein, A. (2015). Where low and high inference data converge: Validation of CLASS assessment of mathematics instruction using mobile eye tracking with expert and novice teachers. International Journal of Science and Mathematics Education, 13(2), 389–403. https://doi.org/10.1007/s10763-014-9610-5

  • Gee, J. (2011). An introduction to discourse analysis: Theory and method (3rd ed.). Routledge.

  • Gee, J. (2014). How to do discourse analysis—A toolkit (2nd ed.). Routledge.

  • Graham, M., Milanowski, A., & Miller, J. (2012). Measuring and promoting inter-rater agreement of teacher and principal performance ratings. Center for Educator Compensation Reform. In Online Submission (Issue February).

  • Heyd-Metzuyanim, E., & Shabtay, G. (2019). Narratives of “good” instruction: Teachers’ identities as drawing on exploration vs. acquisition pedagogical discourses. ZDM, 51(3), 541–554. https://doi.org/10.1007/s11858-018-01019-3

  • Heyd-Metzuyanim, E., Munter, C., & Greeno, J. (2018). Conflicting frames: A case of misalignment between professional development efforts and a teacher’s practice in a high school mathematics classroom. Educational Studies in Mathematics, 97, 21–37. https://doi.org/10.1007/s10649-017-9777-0

  • Heyd-Metzuyanim, E., Nachlieli, T., Weingarden, M., & Baor, R. (2020). Adapting a professional development program for cognitively demanding instruction across shifting contexts. Educational Studies in Mathematics, 104(3), 385–403. https://doi.org/10.1007/s10649-020-09967-y

  • Hill, H. C., Blazar, D., McGinn, D., Kraft, M. A., Beisiegel, M., Humez, A., Litke, E., & Lynch, K. (2012). Validating arguments for observational instruments: Attending to multiple sources of variation. Educational Assessment, 17, 1–19. https://doi.org/10.1080/10627197.2012.715019

  • Holland, D., Lachicotte, W., Skinner, D., & Cain, C. (1998). Identity and agency in cultural worlds. Harvard University Press.

  • Jackson, K., Garrison, A., Wilson, J., Gibbons, L., & Shahan, E. (2013). Exploring relationships between setting up complex tasks and opportunities to learn in concluding whole-class discussions in middle-grades mathematics instruction. Journal for Research in Mathematics Education, 44(4), 646–682.

  • Ma, J. Y., & Singer-Gabella, M. (2011). Learning to teach in the figured world of reform mathematics: Negotiating new models of identity. Journal of Teacher Education, 62(1), 8–22. https://doi.org/10.1177/0022487110378851

  • Munter, C., Stein, M. K., & Smith, M. S. (2015). Dialogic and direct instruction: Two distinct models of mathematics instruction and the debate(s) surrounding them. Teachers College Record, 117(11), 1–32.

  • Nachlieli, T., & Heyd-Metzuyanim, E. (2021). Commognitive conflicts as a learning mechanism towards explorative pedagogical discourse. Journal of Mathematics Teacher Education, 25(3), 347–369. https://doi.org/10.1007/s10857-021-09495-3

  • National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics.

  • Perel, H. (2012). Guidelines for planning and analyzing mathematics lessons according to the lesson study process. Ministry of Education, The State of Israel (In Hebrew). Accessed 8 May 2018.

  • Praetorius, A. K., & Charalambous, C. Y. (2018). Classroom observation frameworks for studying instructional quality: Looking back and looking forward. ZDM - Mathematics Education, 50(3), 535–553. https://doi.org/10.1007/s11858-018-0946-0

  • Resnick, L. B., Asterhan, C. S. C., & Clarke, S. N. (2018). Accountable talk: Instructional dialogue that builds the mind. Educational Practices Series. Geneva, Switzerland: The International Academy of Education (IAE) and the International Bureau of Education (IBE) of the United Nations Educational, Scientific and Cultural Organization (UNESCO).

  • Reinholz, D. L., Stone-Johnstone, A., & Shah, N. (2020). Walking the walk: Using classroom analytics to support instructors to address implicit bias in teaching. International Journal for Academic Development, 25(3), 259–272. https://doi.org/10.1080/1360144X.2019.1692211

  • Sabers, D. S., Cushing, K. S., & Berliner, D. C. (1991). Differences among teachers in a task. American Educational Research Journal, 28(1), 63–88.

  • Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics, 102(6), 245–253. https://doi.org/10.1111/j.1949-8594.2002.tb17883.x

  • Sfard, A. (2007). When the rules of discourse change, but nobody tells you: Making sense of mathematics learning from a commognitive standpoint. Journal of the Learning Sciences, 16(4), 565–613.

  • Sfard, A. (2008). Thinking as communicating. Cambridge University Press.

  • Sherin, M. G., & Han, S. Y. (2004). Teacher learning in the context of a video club. Teaching and Teacher Education, 20, 163–183. https://doi.org/10.1016/j.tate.2003.08.001

  • Smith, M. S., & Stein, M. K. (2011). 5 practices for orchestrating productive mathematics discussions. National Council of Teachers of Mathematics.

  • Star, J. R., & Strickland, S. K. (2008). Learning to observe: Using video to improve preservice mathematics teachers’ ability to notice. Journal of Mathematics Teacher Education, 11(2), 107–125. https://doi.org/10.1007/s10857-007-9063-7

  • Stein, M. K., Correnti, R., Moore, D., Russell, J. L. J., & Kelly, K. (2017). Using theory and measurement to sharpen conceptualizations of mathematics teaching in the Common Core era. AERA Open, 3(1), 1–20. https://doi.org/10.1177/2332858416680566

  • Stein, M. K., Engle, R. A., Smith, M. S., & Hughes, E. K. (2008). Orchestrating productive mathematical discussions: Five practices for helping teachers move beyond show and tell. Mathematical Thinking and Learning, 10(4), 313–340. https://doi.org/10.1080/10986060802229675

  • Sun, J., & van Es, E. A. (2015). An exploratory study of the influence that analyzing teaching has on preservice teachers’ classroom practice. Journal of Teacher Education, 66(3), 201–214. https://doi.org/10.1177/0022487115574103

  • Tong, F., Tang, S., Irby, B. J., Lara-Alecio, R., Guerrero, C., & Lopez, T. (2019). A process for establishing and maintaining inter-rater reliability for two observation instruments as a fidelity of implementation measure: A large-scale randomized controlled trial perspective. Studies in Educational Evaluation, 62(October 2018), 18–29. https://doi.org/10.1016/j.stueduc.2019.04.008

  • van Es, E. A., & Sherin, M. G. (2008). Mathematics teachers’ “learning to notice” in the context of a video club. Teaching and Teacher Education, 24(2), 244–276. https://doi.org/10.1016/j.tate.2006.11.005

  • Weingarden, M., & Heyd-Metzuyanim, E. (2023). What can the realization tree assessment tool reveal about Explorative classroom discussions? Journal for Research in Mathematics Education, 54(2), 97–117.

  • Zepeda, S., & Jimenez, A. (2019). Teacher observation and reliability: Additional insights gathered from inter-rater reliability analyses. Journal of Educational Supervision, 2(2), 11–26. https://doi.org/10.31045/jes.2.2.2

Author information

Corresponding author

Correspondence to Einat Heyd-Metzuyanim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Selected rubrics from the Coding Rubric for Video Observations tool (Stein et al., 2017)

The Enactment rubric.

Enactment: Identify the manner in which the task was enacted

5. Doing mathematics (e.g., framing problems, making conjectures, looking for patterns, examining constraints, determining whether an answer is valid or reasonable, knowing when a problem is solved, justifying, explaining, challenging)

4. Use of procedures with connection to meaning, concepts or understanding

3. Use of procedures without connection to meaning, concepts or understanding

2. Memorization

1. Little or no mathematical thinking occurred

99. Unsystematic and nonproductive exploration

The explicit attention to concepts (EAC) rubric.

Explicit attention to concepts (EAC)

4. One or more concepts are discussed and/or defined in depth. This entails explicit noting of the concept in a whole-class setting, including explanation and/or elaboration of critical features of the concept and connections to the larger web of mathematical ideas or procedures. Connections also include some allusion to the solution pathway’s applicability to other related problems

3. One or more concepts are discussed and/or defined in some detail. This entails explicit noting of the concept in a whole-class setting. Explanation and/or elaboration of critical features of the concept may be incomplete, and connections to the larger web of mathematical ideas or procedures are notably present, but weak

2. One or more concepts are referred to in a whole class setting, but critical features are NOT explained and/or elaborated or are not correctly discussed. Connections (other than those to real life) are absent or tenuous

1. No concepts are discussed or defined publicly OR are only named in passing

Students' opportunity to struggle (SOS) rubric.

Students' opportunity to struggle

5. Students are given opportunity to work individually, in pairs, or in small groups on complex tasks that they have not been shown how to solve. The tasks afford multiple solution strategies and/or ways of representing elements of the problem space. Teacher facilitates student thinking (e.g., through scaffolding, content support, provision of tools) but is very careful not to do the thinking for the students

4. Students are given opportunity to work individually, in pairs, or in small groups on complex tasks that they have not been shown how to solve. The tasks afford multiple solution strategies and/or ways of representing elements of the problem space. Teacher guides student thinking in more directive ways than in 5, but students still shoulder most of the responsibility for the thinking

3. Tasks that are designed to provide unstructured, divergent thinking followed by structured “telling”

2. Within a bounded space (created and constrained by the teacher or the task), students struggle to make sense of something about which the “correct process” or “correct answer” is not immediately evident. Usually happens near the end of the task (e.g., stretch or transfer problems after directive telling/demonstrating)

1. Students do not have the opportunity to struggle with important mathematics. The teacher or task tells them everything they need to know and do to solve the “problem.”

The Norms around classroom discourse rubric.

Norms around classroom discourse

4. Some students offer up ideas and strategies based on their own thinking and reasoning; the teacher and other students listen and respond to those ideas

3. Some students offer up ideas and strategies based on their own reasoning; the teacher listens and responds to those ideas and strategies

2. Students’ participation includes answers to the teacher’s questions plus the provision of explanations regarding how they did the problems

1. Students’ participation is limited to short answers to the teacher’s questions. (IRE discourse style predominates)

N/A. Silent work

The consolidation rubric.

Consolidation (student opportunity to compare and contrast various student-generated solution strategies/representations/ideas)

3. At least 2 student-generated different solution strategies/representations/ideas are publicly displayed to the whole class and their affordances and constraints (similarities and differences) are explicitly discussed with respect to illuminating critical features of the targeted concept

2. At least 2 student-generated different solution strategies/representations/ideas are publicly displayed to the whole class, but they are not substantively different and/or their affordances and constraints are not discussed in any depth

1. Only one student-generated solution strategy/representation/idea is publicly shared

N/A. Only answers are shared, multiple representations/strategies are not allowed for by the problem, no public discussion

Appendix 2: The full excerpt about angles

Turn. Speaker: What was said (what was done)

1. Teacher: Quiet… Pay attention to what Shir and Tania did. (…) he did a combination of geometry, where we learned (about) “corresponding and alternate angles between alternate lines” and they did (…) What is our given? Shir, mark the angle. OK. let’s look. Angles are 3 letters

2. Shir: It’s not an angle, it’s an angle bisector

3. Teacher: OK, so you write it this way. Notice, we haven’t yet learned to write this. (Walks to the board to write). Let’s write it this way. If I want to write, then I write it this way. We say EF bisects ADC. Good. Let’s see

[Between turns 3 and 13] Tal explains how he decides what the angles are through calculations. The teacher stops him from time to time, explaining to the students why these calculations are allowed, due to the properties of the corresponding and alternating angles

13. Teacher: Great. Good for you. Sit down (to the whole class). Look, please. You (all) see the angle that they calculated? We'll also learn that the sum of one-sided angles, we learned (previously) only (about) corresponding angles. There's another type of angles we have not talked about which is, actually, once there are parallel lines, so the sum of corresponding angles, the sum of one-sided (interior) angles between parallel lines is 180. That means that if this one is 100 we could immediately say that this one is 80. Wonderful, good work, excellent. Another group?

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Weingarden, M., Heyd-Metzuyanim, E. Evaluating mathematics lessons for cognitive demand: Applying a discursive lens to the process of achieving inter-rater reliability. J Math Teacher Educ 26, 609–634 (2023). https://doi.org/10.1007/s10857-023-09579-2

  • DOI: https://doi.org/10.1007/s10857-023-09579-2
