Statistics Day 2020

Supporting teachers new to teaching statistics

Posing investigative questions

What makes a good investigative question?

A good investigative question is one that: has both the variable(s) and population(s) clear; has the intention clear; is able to be answered with the data; is about the whole group; and is interesting (Arnold, 2013, p.135).

What are the underpinning concepts that are needed to support the teaching and learning around posing investigative questions?

Pedagogical aspects to consider include: the types of investigative questions to address and when; the point in the teaching and learning sequence at which the teacher will introduce posing investigative questions; and how, or whether, the teacher will let the students “discover” for themselves the criteria for what makes a good investigative question.

The underlying big concepts and the components of what makes a good investigative question include students needing: a sense of population; a sense of “tend” and “typical”; a sense of the variable(s); and an image of a hypothesised distribution(s) as they start posing their investigative question (Arnold, 2013, p.136).

Research papers

Critiquing investigative questions

Arnold, P., & Pfannkuch, M. (2018). Critiquing Investigative Questions. In International Conference on Teaching Statistics (ICOTS10, July, 2018). IASE.

Related online resources

Curriculum level 4 resource
See lesson 2 posing investigative questions, curriculum level 5 resource

See session 2: Problem for ideas on interrogating investigative questions


CODAP is free educational software for data analysis. This web-based data science tool is designed as a platform for developers and as an application for students in grades 6-14.

All posts related to using CODAP are here, including specific teaching activities and how to guides.

Describing the shape of statistical distributions

What descriptors do year 10 (ages 14–15) students intuitively use for distributional shape?

In response to the first question: the distribution shape descriptors that the students intuitively use have two components to them (Figure 8-14, Arnold, 2013, p.214).

The first component is symmetry of the graph, i.e. whether the graph is symmetric or asymmetric, and this gives five categories: uniform, bell-shaped or normal, other symmetric graphs, left skew and right skew. The second component is modality, i.e. whether the graph is unimodal, bimodal or has some other modality. The use of modality as part of the distributional shape descriptor is a new idea as previous descriptors such as uniform, normal, skew and bimodal (Bakker, 2004; delMas et al., 2005) did not necessarily attend to both components. Skew as a descriptor only attends to symmetry and bimodal only attends to modality, whereas the descriptors uniform and bell-shaped (i.e. normal) – special cases of symmetrical graphs – attend to both components. (By definition a uniform distribution has no mode and a bell-shaped or normal distribution has one mode.)
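The two-component view can be made concrete with a small lookup. The sketch below is illustrative only: the names `Shape`, `DESCRIPTORS` and `components_attended` are hypothetical, and the entries simply restate the paragraph above (skew attends only to symmetry, bimodal only to modality, uniform and bell-shaped to both).

```python
from typing import List, NamedTuple, Optional


class Shape(NamedTuple):
    """What a descriptor says about each component (None = says nothing)."""
    symmetry: Optional[str]
    modality: Optional[str]


# Hypothetical table restating the text; it is not taken from the thesis itself.
DESCRIPTORS = {
    "uniform": Shape(symmetry="symmetric", modality="no mode"),
    "bell-shaped": Shape(symmetry="symmetric", modality="unimodal"),
    "left skew": Shape(symmetry="asymmetric", modality=None),
    "right skew": Shape(symmetry="asymmetric", modality=None),
    "bimodal": Shape(symmetry=None, modality="bimodal"),
}


def components_attended(descriptor: str) -> List[str]:
    """List the components ('symmetry', 'modality') a descriptor attends to."""
    shape = DESCRIPTORS[descriptor]
    return [name for name, value in shape._asdict().items() if value is not None]
```

For example, `components_attended("right skew")` names only symmetry, which is why a fuller description pairs a symmetry descriptor with a modality descriptor.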

For students to successfully classify distributional shape from samples they need to be able to infer the shape. This is more easily done from dot plots (Bakker, 2004; Pfannkuch, 2006), but also becomes easier with experience. Experience is acquired over time, but in this research a deliberate act was made to build students’ experiences faster. This was through the development of a contextual library of variable and population shapes. Connected to this is predicting distributional shapes before the data are looked at or sourced. The prediction, too, is based on knowledge of the variable and population and this knowledge is built using previous experiences. (Arnold, 2013, pp.229-230)

What makes a good distribution description at level 5 (ages 13–15) in the New Zealand curriculum?

A good distribution description at level 5 (ages 13–15) includes a description of the overall shape of the distribution and at least two other features (Figure 8-16, Arnold, 2013, p.217), and links these features to the context through the variable, units and an acceptable population descriptor. A very good distribution description describes at least three features of the distribution in addition to the overall shape, connects the context throughout the description and may include some explanation or interpretation of the data in context. The features of distribution at level 5 (ages 13–15) in the New Zealand curriculum are specified in the distribution description framework (DDF) in Figure 8-16 (Arnold, 2013, p.217).

These features are organised by the overarching statistical concepts: contextual knowledge, distributional, graph comprehension, variability, and signal and noise. By attending to the different features, a good picture of the distribution can be built up.

The DDF was designed to support the development of the notion of distribution. The DDF for level 5 (ages 13–15) reflects the statistical knowledge related to distribution that students at this level have access to. Students at this level should be able to call upon any of the features of the DDF, depending on the context and the data available. (Arnold, 2013, p.230)

What distributional shapes and graphs do year 10 (ages 14–15) students predict when given the context?

The students appear to have an image of the distribution in mind and have sketched this with justification. The students have connected their visual image, analytical thinking and verbal description together to sketch a sensible graph with appropriate values for each of the scenarios. As students build their contextual shape library, they build the experiences they can call on to predict distributions, and their growing knowledge of the different features of distributions supports the detail in their prediction; for example, giving maximum and minimum values, where the peak(s) or mode(s) is, and an idea of the modal cluster location. The process of drawing a predicted distribution calls on the mental processes of visual, analytic and verbal descriptive thinking (Aspinwall, Haciomeroglu, & Presmeg, 2008). The visual thinking is captured by the construction of the basic distribution sketch; adding detail to the sketch (for example, age ranges or what the peak value might be) supports the analytical thinking; and justifying, either verbally or in written form, connects the image to analytic thinking and helps to sustain student thinking. (Arnold, 2013, p.231)

Research papers

Arnold, P., & Pfannkuch, M. (2012). The language of shape. Proceedings of the 12th International Congress on Mathematical Education (ICME-12, July, 2012), Seoul, South Korea (pp. 2446–2455). Seoul, South Korea: ICME-12.

Describing distributions

Arnold, P., & Pfannkuch, M. (2014). Describing distributions. Sustainability in statistics education. Flagstaff, AZ: International Statistical Institute. 

Making the call

What underpinning concepts do students need to support them to make a call at curriculum level 5 (ages 13-15)?

| | Probabilistic | Generalisation | Evidence from data |
| --- | --- | --- | --- |
| Verbalisations | Articulating the uncertainty embedded in an inference | Making a claim about the aggregate that goes beyond the data | Being explicit about the evidence used |
| Underpinning reasoning concepts | Sampling variability; Uncertainty; Sample size (not @ L5) | Population; Distribution | Shift; Overlap; Position of medians; Decision guide |

Statistical inference for comparison of two samples of quantitative data at New Zealand Curriculum level 5

Figure 7-10. Framework for thinking about statistical inference and sampling reasoning (Arnold, 2013, p.165). Note: Adapted from Makar & Rubin (2009, p. 85).

Can year 10 (ages 14–15) students consistently and coherently make a statistical inference?

Students can consistently and coherently make statistical inferences. The teaching sequence, with its focus on underpinning concepts, using physical and dynamic visual simulations and allowing students to “reinvent” the decision rules for making the call, makes a contribution to the pedagogical content knowledge base for the wider statistics education community (Arnold, 2013, p. 188).

The teaching sequence is on CensusAtSchool. The material is in three parts: making the call is specifically covered in part 3, and the previous two parts provide developmental concepts that support making the call.

Workshops for teachers to support the teaching sequence

Animations to support the teaching sequence

Sampling variation animations
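As a rough text-only companion to the idea of sampling variation, the sketch below draws repeated samples from one simulated population and records each sample median. Everything in it is an assumption made for illustration: the population (1000 simulated heights), the sample size of 30, the number of repeats, and the function name `sample_medians` are hypothetical and are not taken from the teaching sequence or the animations.

```python
import random
import statistics


def sample_medians(population, sample_size=30, repeats=200, seed=1):
    """Median of each of `repeats` random samples drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.median(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]


# Hypothetical population: 1000 simulated heights in cm.
_rng = random.Random(0)
population = [round(_rng.gauss(165, 8), 1) for _ in range(1000)]

medians = sample_medians(population)
print("population median:", statistics.median(population))
print("sample medians range from", min(medians), "to", max(medians))
```

The spread between the smallest and largest sample median is the variation that students need to allow for before claiming that one group tends to have bigger values than another back in the populations.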

What evidence do students use to make the call at curriculum level 5 (ages 13–15) given suitable learning experiences for developing criteria to make a call?

The two situations were designed to focus on one relevant aspect at a time (Bodemer, Ploetzner, Feuerlein, & Spada, 2004) and to extract principles (Bakker & Gravemeijer, 2004) relevant to the two different situations. The evidence that the students used to make the call included the amount of overlap between the two boxes (in box plots) and the position of the medians relative to the overlap. They used these two pieces of evidence from the data in conjunction with the curriculum level 5 (ages 13–15) decision rule to make the call (or not) that condition A tended to have bigger/faster/longer values than condition B back in the populations. As a result of this research, the specific evidence that students at this age need and can access was realised, and the findings allowed the researchers to focus more clearly on pertinent features and concepts. Progress has been made towards resolving the problematic situation that there was no consensus on how to make a call. Students have moved from calling on summary statistics being higher to using an appreciation of sampling variability and shift, overlap, and position of medians to make their call.

The learning emphasis was not on the “rule” for the decision criteria; rather, it was on developing the underpinning concepts, such as sample-to-population and sampling variability ideas, that are needed to use the rule with understanding. The students in this study seemed to understand how and why the use of the overlap and position of the medians relative to the overlap informed their use of the rule to consistently and coherently answer their comparative investigative question. (Arnold, 2013, p.188)
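The text above does not spell out the level 5 decision rule, so the following is only a sketch under a stated assumption: that the call is made when the median of one sample lies above the box (beyond the upper quartile) of the other sample. The function names `box`, `boxes_overlap` and `make_the_call` are hypothetical, and the quartile method used here is one of several in common use.

```python
import statistics
from typing import Optional, Sequence, Tuple


def box(sample: Sequence[float]) -> Tuple[float, float, float]:
    """Return (lower quartile, median, upper quartile) for a sample."""
    q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return q1, median, q3


def boxes_overlap(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if the middle-50% boxes of the two samples share any values."""
    q1_a, _, q3_a = box(a)
    q1_b, _, q3_b = box(b)
    return q1_a <= q3_b and q1_b <= q3_a


def make_the_call(a: Sequence[float], b: Sequence[float]) -> Optional[str]:
    """Assumed rule: call the group whose median sits above the other's box.

    Returns "A" or "B" for the group that tends to have bigger values back
    in the populations, or None when no call can be made.
    """
    _, med_a, q3_a = box(a)
    _, med_b, q3_b = box(b)
    if med_a > q3_b:
        return "A"
    if med_b > q3_a:
        return "B"
    return None
```

With `a = list(range(1, 10))` and `b = list(range(6, 15))` the boxes do not overlap and the sketch calls `"B"`. With `c = list(range(3, 12))` the boxes of `a` and `c` overlap and neither median clears the other box, so no call is made.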

CODAP doc for Making the call Karekare College

Research papers

Building students’ inferential reasoning: Statistics curriculum Levels 5 and 6

TLRI research initiative

Pfannkuch, M., Arnold, P., & Wild, C. (2011). Building students’ inferential reasoning: Statistics curriculum levels 5 and 6: Statistics: It’s reasoning, not calculating. Wellington: Teaching and Learning Research Initiative.

Enhancing Students’ Inferential Reasoning: From Hands-On to “Movies”

Arnold, P., Pfannkuch, M., Wild, C. J., Regan, M., & Budgett, S. (2011). Enhancing students’ inferential reasoning: from hands-on to “movies”. Journal of Statistics Education, 19(2).

Enhancing students’ inferential reasoning: from hands on to movie snapshots

Arnold, P., & Pfannkuch, M. (2010, July). Enhancing students’ inferential reasoning: From hands on to “movie snapshots”. In Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics, Ljubljana, Slovenia.


Arnold, P. (2013). Statistical investigative questions. An enquiry into posing and answering investigative questions from existing data. Unpublished doctoral thesis, The University of Auckland.

Aspinwall, L., Haciomeroglu, E., & Presmeg, N. (2008). Students’ verbal descriptions that support visual and analytic thinking in calculus. In O. Figueras, J. Cortina, S. Alatorre, T. Rojano, & A. Sepúlveda (Eds.), Proceedings of the Joint Meeting of PME 32 and PME-NA 30 (July, 2008) (Vol. 2, pp. 97–104). Morelia, Mexico: Cinvestav-UMSNH.

Bakker, A. (2004). Design research in statistics education: On symbolizing and computer tools. Utrecht, The Netherlands: Freudenthal Institute.

Bakker, A., & Gravemeijer, K. (2004). Learning to reason about distribution. In D. Ben-Zvi & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 147–168). Dordrecht, The Netherlands: Kluwer.

Bodemer, D., Ploetzner, R., Feuerlein, I., & Spada, H. (2004). The active integration of information during learning with dynamic and interactive visualisations. Learning and Instruction, 14, 325–341. doi: 10.1016/j.learninstruc.2004.06.006

delMas, R. (2004). A comparison of mathematical and statistical reasoning. In D. Ben-Zvi & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 79–95). Dordrecht, The Netherlands: Kluwer.

Makar, K., & Rubin, A. (2009). A framework to support research on informal inferential reasoning. Statistics Education Research Journal, 8(1), 82–105.

Pfannkuch, M. (2006). Comparing box plot distributions: A teacher’s reasoning. Statistics Education Research Journal, 5(2), 27–45.