Abstract
In this study, we examine “what went wrong” in our professional development program for encouraging cognitively demanding instruction, focusing on the difficulties we encountered in using an observational tool for evaluating this type of instruction and in reaching inter-rater reliability. We do so through the lens of a discursive theory of teaching and learning. Data consisted of the coding sheets that 10 coders produced while learning to apply the Coding Rubric for Video Observations tool to a set of recorded mathematics lessons. We show that some discrepancies between novice coders and experts were found in narratives about valued actions relating to social aspects of teaching, such as teacher explanations and students’ struggle and discussion. These were relatively easy to detect from the written evaluations in the coding sheets. More difficult to pinpoint were discrepancies between novices’ and experts’ stories about the mathematical objects discussed. These required re-observing the recorded lessons. The analysis revealed that novice coders referred to objects about which the observed teacher mainly explained calculations, whereas the experts searched for (often signaling the absence of) mathematical objects that invited exploration. We discuss these findings based on a theory of different Pedagogical Discourses that form the background for expert and novice coders’ interpretations of the coding manual. We conclude with practical implications for the process of achieving inter-rater reliability on observation tools for cognitively demanding instruction.
Notes
The course was introduced to students as an opportunity to learn about cognitively demanding instruction, to view multiple “real life” lessons, and to get experience with research practices of coding according to observation protocols. Students could choose whether to take the class after the project was introduced, and the vast majority of them chose to do so.
The coding workshop included mathematics lessons from both elementary and middle school. The number of lessons reported in this section covers both elementary- and middle-school lessons. However, the codings of the elementary-school lessons are not included in the present study, because some of the reliability difficulties around these lessons may have stemmed from the teachers not being sufficiently familiar with the elementary-school mathematics curriculum.
“Bar agreement” was calculated by collapsing the levels in each rubric into “high” and “low” and calculating reliability on these collapsed levels. This is in accordance with similar measures of reliability reported in Stein et al. (2017). For more on these reliability calculations, see Heyd-Metzuyanim et al. (2020).
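The collapsing step described in this note can be illustrated with a minimal sketch. It assumes that reliability on the collapsed levels is taken as simple percent agreement and that the high/low cut-off sits at level 3; neither detail is stated here, so `HIGH_THRESHOLD`, `collapse`, `bar_agreement`, and the example scores are all hypothetical and serve only to show the idea.

```python
from typing import Sequence

# Hypothetical cut-off: rubric levels at or above this value count as "high".
# The actual split used for each rubric is an assumption, not taken from the study.
HIGH_THRESHOLD = 3


def collapse(level: int, threshold: int = HIGH_THRESHOLD) -> str:
    """Collapse a numeric rubric level into a 'high' or 'low' bar."""
    return "high" if level >= threshold else "low"


def bar_agreement(
    coder_a: Sequence[int],
    coder_b: Sequence[int],
    threshold: int = HIGH_THRESHOLD,
) -> float:
    """Share of items on which two coders land on the same collapsed bar."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must score the same non-empty set of items.")
    matches = sum(
        collapse(a, threshold) == collapse(b, threshold)
        for a, b in zip(coder_a, coder_b)
    )
    return matches / len(coder_a)


# Made-up scores for five rubric items (illustration only).
novice = [4, 2, 3, 1, 4]
expert = [3, 2, 2, 1, 4]

exact = sum(a == b for a, b in zip(novice, expert)) / len(novice)
print(f"exact agreement: {exact:.2f}")
print(f"bar agreement:   {bar_agreement(novice, expert):.2f}")
```

With these made-up scores, the coders match exactly on three of five items but fall on the same side of the bar on four of five, which shows why agreement on collapsed levels is a more lenient measure than exact agreement.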
The full coding scheme can be found in Stein et al. (2017), Appendix B, DS24. In the findings section and the appendix, we specify the rubrics that are relevant to our study.
One set of coding sheets was double-coded because the lesson contained two tasks; we therefore had 8 sheets in total.
This problem was adapted from a lesson plan developed by the Institute for Learning, University of Pittsburgh (https://www.nctm.org/Conferences-and-Professional-Development/Principles-to-Actions-Toolkit/Resources/9-MS-Brovey-CalingPlans2-LessonGuide/).
Hebrew: Hok ha-pilug. Literally: the rule of distribution.
References
Berliner, D. C. (1994). The wonder of exemplary performances. Creating Powerful Thinking in Teachers and Students, 1–42.
Boston, M. D. (2012). Assessing instructional quality in mathematics. The Elementary School Journal, 113(1), 76–104.
Boston, M. D., & Candela, A. G. (2018). The instructional quality assessment as a tool for reflecting on instructional practice. ZDM - Mathematics Education, 50(3), 427–444. https://doi.org/10.1007/s11858-018-0916-6
Boston, M. D., & Smith, M. S. (2009). Transforming secondary mathematics teaching: Increasing the cognitive demands of instructional tasks used in teachers’ classrooms. Journal for Research in Mathematics Education, 40(2), 119–156. https://doi.org/10.2307/40539329
Candela, A. G., & Boston, M. D. (2022). Centering professional development around the Instructional Quality Assessment rubrics. Mathematics Teacher Educator, 10(3), 204–222.
Casbergue, R. M., Bedford, A. W., & Burstein, K. (2014). CLASS reliability training as professional development for preschool teachers. Journal of Research in Childhood Education, 28(4), 426–440. https://doi.org/10.1080/02568543.2014.944724
Copur-Gencturk, Y. (2015). The effects of changes in mathematical knowledge on teaching: A longitudinal study of teachers’ knowledge and instruction. Journal for Research in Mathematics Education, 46(3), 280–330. https://doi.org/10.5951/jresematheduc.46.3.0280
Cortina, K. S., Miller, K. F., McKenzie, R., & Epstein, A. (2015). Where low and high inference data converge: Validation of CLASS assessment of mathematics instruction using mobile eye tracking with expert and novice teachers. International Journal of Science and Mathematics Education, 13(2), 389–403. https://doi.org/10.1007/s10763-014-9610-5
Gee, J. (2011). An introduction to discourse analysis: Theory and method (3rd ed.). Routledge.
Gee, J. (2014). How to do discourse analysis—A toolkit (2nd ed.). Routledge.
Graham, M., Milanowski, A., & Miller, J. (2012). Measuring and promoting inter-rater agreement of teacher and principal performance ratings. Center for Educator Compensation Reform.
Heyd-Metzuyanim, E., & Shabtay, G. (2019). Narratives of “good” instruction: Teachers’ identities as drawing on exploration vs. acquisition pedagogical discourses. ZDM, 51(3), 541–554. https://doi.org/10.1007/s11858-018-01019-3
Heyd-Metzuyanim, E., Munter, C., & Greeno, J. (2018). Conflicting frames: A case of misalignment between professional development efforts and a teacher’s practice in a high school mathematics classroom. Educational Studies in Mathematics, 97, 21–37. https://doi.org/10.1007/s10649-017-9777-0
Heyd-Metzuyanim, E., Nachlieli, T., Weingarden, M., & Baor, R. (2020). Adapting a professional development program for cognitively demanding instruction across shifting contexts. Educational Studies in Mathematics, 104(3), 385–403. https://doi.org/10.1007/s10649-020-09967-y
Hill, H. C., Blazar, D., McGinn, D., Kraft, M. A., Beisiegel, M., Humez, A., Litke, E., & Lynch, K. (2012). Validating arguments for observational instruments: Attending to multiple sources of variation. Educational Assessment, 17, 1–19. https://doi.org/10.1080/10627197.2012.715019
Holland, D., Lachicotte, W., Skinner, D., & Cain, C. (1998). Identity and agency in cultural worlds. Harvard University Press.
Jackson, K., Garrison, A., Wilson, J., Gibbons, L., & Shahan, E. (2013). Exploring relationships between setting up complex tasks and opportunities to learn in concluding whole-class discussions in middle-grades mathematics instruction. Journal for Research in Mathematics Education, 44(4), 646–682.
Ma, J. Y., & Singer-Gabella, M. (2011). Learning to teach in the figured world of reform mathematics: Negotiating new models of identity. Journal of Teacher Education, 62(1), 8–22. https://doi.org/10.1177/0022487110378851
Munter, C., Stein, M. K., & Smith, M. S. (2015). Dialogic and direct instruction: Two distinct models of mathematics instruction and the debate(s) surrounding them. Teachers College Record, 117(11), 1–32.
Nachlieli, T., & Heyd-Metzuyanim, E. (2021). Commognitive conflicts as a learning mechanism towards explorative pedagogical discourse. Journal of Mathematics Teacher Education, 25(3), 347–369. https://doi.org/10.1007/s10857-021-09495-3
National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics.
Perel, H. (2012). Guidelines for planning and analyzing mathematics lessons according to the lesson study process. Ministry of Education, The State of Israel (In Hebrew). Accessed 8 May 2018.
Praetorius, A. K., & Charalambous, C. Y. (2018). Classroom observation frameworks for studying instructional quality: Looking back and looking forward. ZDM - Mathematics Education, 50(3), 535–553. https://doi.org/10.1007/s11858-018-0946-0
Reinholz, D. L., Stone-Johnstone, A., & Shah, N. (2020). Walking the walk: Using classroom analytics to support instructors to address implicit bias in teaching. International Journal for Academic Development, 25(3), 259–272. https://doi.org/10.1080/1360144X.2019.1692211
Resnick, L. B., Asterhan, C. S. C., & Clarke, S. N. (2018). Accountable talk: Instructional dialogue that builds the mind. Educational Practices Series. Geneva, Switzerland: The International Academy of Education (IAE) and the International Bureau of Education (IBE) of the United Nations Educational, Scientific and Cultural Organization (UNESCO).
Sabers, D. S., Cushing, K. S., & Berliner, D. C. (1991). Differences among teachers in a task. American Educational Research Journal, 28(1), 63–88.
Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics Journal, 102(6), 245–253. https://doi.org/10.1111/j.1949-8594.2002.tb17883.x
Sfard, A. (2007). When the rules of discourse change, but nobody tells you: Making sense of mathematics learning from a commognitive standpoint. Journal of the Learning Sciences, 16(4), 565–613.
Sfard, A. (2008). Thinking as communicating. Cambridge University Press.
Sherin, M. G., & Han, S. Y. (2004). Teacher learning in the context of a video club. Teaching and Teacher Education, 20, 163–183. https://doi.org/10.1016/j.tate.2003.08.001
Smith, M. S., & Stein, M. K. (2011). 5 practices for orchestrating productive mathematics discussions. National Council of Teachers of Mathematics.
Star, J. R., & Strickland, S. K. (2008). Learning to observe: Using video to improve preservice mathematics teachers’ ability to notice. Journal of Mathematics Teacher Education, 11(2), 107–125. https://doi.org/10.1007/s10857-007-9063-7
Stein, M. K., Correnti, R., Moore, D., Russell, J. L., & Kelly, K. (2017). Using theory and measurement to sharpen conceptualizations of mathematics teaching in the Common Core era. AERA Open, 3(1), 1–20. https://doi.org/10.1177/2332858416680566
Stein, M. K., Engle, R. A., Smith, M. S., & Hughes, E. K. (2008). Orchestrating productive mathematical discussions: Five practices for helping teachers move beyond show and tell. Mathematical Thinking and Learning, 10(4), 313–340. https://doi.org/10.1080/10986060802229675
Sun, J., & van Es, E. A. (2015). An exploratory study of the influence that analyzing teaching has on preservice teachers’ classroom practice. Journal of Teacher Education, 66(3), 201–214. https://doi.org/10.1177/0022487115574103
Tong, F., Tang, S., Irby, B. J., Lara-Alecio, R., Guerrero, C., & Lopez, T. (2019). A process for establishing and maintaining inter-rater reliability for two observation instruments as a fidelity of implementation measure: A large-scale randomized controlled trial perspective. Studies in Educational Evaluation, 62, 18–29. https://doi.org/10.1016/j.stueduc.2019.04.008
van Es, E. A., & Sherin, M. G. (2008). Mathematics teachers’ “learning to notice” in the context of a video club. Teaching and Teacher Education, 24(2), 244–276. https://doi.org/10.1016/j.tate.2006.11.005
Weingarden, M., & Heyd-Metzuyanim, E. (2023). What can the realization tree assessment tool reveal about Explorative classroom discussions? Journal for Research in Mathematics Education, 54(2), 97–117.
Zepeda, S., & Jimenez, A. (2019). Teacher observation and reliability: Additional insights gathered from inter-rater reliability analyses. Journal of Educational Supervision, 2(2), 11–26. https://doi.org/10.31045/jes.2.2.2
Appendices
Appendix 1: Selected rubrics from the Coding Rubric for Video Observations tool (Stein et al., 2017)
The Enactment rubric.
Enactment: Identify the manner in which the task was enacted |
---|
5. Doing mathematics (e.g., framing problems, making conjectures, looking for patterns, examining constraints, determining whether an answer is valid or reasonable, knowing when a problem is solved, justifying, explaining, challenging) |
4. Use of procedures with connection to meaning, concepts or understanding |
3. Use of procedures without connection to meaning, concepts or understanding |
2. Memorization |
1. Little or no mathematical thinking occurred |
99. Unsystematic and nonproductive exploration |
The Explicit attention to concepts (EAC) rubric.
Explicit attention to concepts (EAC) |
---|
4. One or more concepts are discussed and/or defined in depth. This entails explicit noting of the concept in a whole-class setting, including explanation and/or elaboration of critical features of the concept and connections to the larger web of mathematical ideas or procedures. Connections also include some allusion to the solution pathway’s applicability to other related problems |
3. One or more concepts are discussed and/or defined in some detail. This entails explicit noting of the concept in a whole-class setting. Explanation and/or elaboration of critical features of the concept may be incomplete, and connections to the larger web of mathematical ideas or procedures are notably present, but weak |
2. One or more concepts are referred to in a whole class setting, but critical features are NOT explained and/or elaborated or are not correctly discussed. Connections (other than those to real life) are absent or tenuous |
1. No concepts are discussed or defined publicly OR are only named in passing |
Students' opportunity to struggle (SOS) rubric.
Students' opportunity to struggle |
---|
5. Students are given opportunity to work individually, in pairs, or in small groups on complex tasks that they have not been shown how to solve. The tasks afford multiple solution strategies and/or ways of representing elements of the problem space. Teacher facilitates student thinking (e.g., through scaffolding, content support, provision of tools) but is very careful not to do the thinking for the students |
4. Students are given opportunity to work individually, in pairs, or in small groups on complex tasks that they have not been shown how to solve. The tasks afford multiple solution strategies and/or ways of representing elements of the problem space. Teacher guides student thinking in more directive ways than in 5, but students still shoulder most of the responsibility for the thinking |
3. Tasks that are designed to provide unstructured, divergent thinking followed by structured “telling” |
2. Within a bounded space (created and constrained by the teacher or the task), students struggle to make sense of something about which the “correct process” or “correct answer” is not immediately evident. Usually happens near the end of the task (e.g., stretch or transfer problems after directive telling/demonstrating) |
1. Students do not have the opportunity to struggle with important mathematics. The teacher or task tells them everything they need to know and do to solve the “problem.” |
The Norms around classroom discourse rubric.
Norms around classroom discourse |
---|
4. Some students offer up ideas and strategies based on their own thinking and reasoning; the teacher and other students listen and respond to those ideas |
3. Some students offer up ideas and strategies based on their own reasoning; the teacher listens and responds to those ideas and strategies |
2. Students’ participation includes answers to the teacher’s questions plus the provision of explanations regarding how they did the problems |
1. Students’ participation is limited to short answers to the teacher’s questions. (IRE discourse style predominates) |
N/A. Silent work |
The consolidation rubric.
Consolidation (student opportunity to compare and contrast various student-generated solution strategies/representations/ideas) |
---|
3. At least 2 student-generated different solution strategies/representations/ideas are publicly displayed to the whole class and their affordances and constraints (similarities and differences) are explicitly discussed with respect to illuminating critical features of the targeted concept |
2. At least 2 student-generated different solution strategies/representations/ideas are publicly displayed to the whole class, but they are not substantively different and/or their affordances and constraints are not discussed in any depth |
1. Only one student-generated solution strategy/representation/idea is publicly shared |
N/A. Only answers are shared, multiple representations/strategies are not allowed for by the problem, no public discussion |
Appendix 2: The full excerpt about angles
Turn | Speaker | What was said (what was done) |
---|---|---|
1 | Teacher | Quiet… Pay attention to what Shir and Tania did. (…) he did a combination of geometry, where we learned (about) “corresponding and alternate angles between alternate lines” and they did (…) What is our given? Shir, mark the angle. OK. let’s look. Angles are 3 letters |
2 | Shir | It’s not an angle, it’s an angle bisector |
3 | Teacher | OK, so you write it this way. Notice, we haven’t yet learned to write this. (Walks to the board to write). Let’s write it this way. If I want to write, then I write it this way. We say EF bisects ADC. Good. Let’s see |
Tal explains how he decides what the angles are through calculations. The teacher stops him from time to time, explaining to the students why these calculations are allowed, due to the properties of the corresponding and alternating angles | ||
13 | Teacher | Great. Good for you. Sit down (to the whole class). Look, please. You (all) see the angle that they calculated? We'll also learn that the sum of one-sided angles, we learned (previously) only (about) corresponding angles. There's another type of angles we have not talked about which is, actually, once there are parallel lines, so the sum of corresponding angles, the sum of one-sided (interior) angles between parallel lines is 180. That means that if this one is 100 we could immediately say that this one is 80. Wonderful, good work, excellent. Another group? |
About this article
Cite this article
Weingarden, M., Heyd-Metzuyanim, E. Evaluating mathematics lessons for cognitive demand: Applying a discursive lens to the process of achieving inter-rater reliability. J Math Teacher Educ 26, 609–634 (2023). https://doi.org/10.1007/s10857-023-09579-2
DOI: https://doi.org/10.1007/s10857-023-09579-2