Issue 5 – July 2012

Special Issue on Summative Assessment

Assessment is always centre stage in educational politics but traditional approaches to test design are rarely called into doubt. US testing has long emphasised psychometric efficiency, minimising statistical error but with little or no concern for systematic error, so the performances actually assessed cover only a small part of the declared learning goals. UK examinations show greater concern for the variety of tasks but the emphasis in mathematics remains on fragments of knowledge that reflect the detailed content level criteria of the National Curriculum, while development methods effectively preclude the non-routine tasks that assessing "process" strategies and skills requires. While other countries, including the Netherlands, have happier stories to tell, the problems are generic.

This Special Issue addresses the design and development challenges of producing tests that reflect the broad spectrum of performance goals that international standards set out. It is timely. In the US the Common Core State Standards set out a vision that integrates mathematical practices and content; two interstate assessment consortia are funded to realise that vision in the assessment they offer. The signs on the quality of the likely outcomes are mixed, as they are in the UK where a change of government has called into question a similar move towards broad spectrum examinations.

The four articles in this issue look at the design challenges from different perspectives. The lead is a report from a distinguished working group of ISDDE, which takes a comprehensive look at the design and implementation issues. Daniel Pead expands on the strengths and weaknesses of computer technology in assessment, exposing the naivete of those who think it is capable of assessing substantial chains of reasoning but showing ways in which it can improve the quality of assessment. Betsy Taleporos’ piece looks at how the periodic tests that are being introduced may be used to yield some diagnostic information, a potentially formative use of summative assessment. The final piece, in the "designers speak" series, sets out the assessment design principles that Malcolm Swan and the Shell Centre team have developed over the last 30 years.

As ever, reaction pieces to any of the above would be welcomed by the editors.

A future special issue on formative assessment is being planned. We invite suggestions for contributions.

Hugh Burkhardt
Editor of this special issue.

High-stakes Examinations to Support Policy:
Design, development and implementation

Paul Black, Hugh Burkhardt, Phil Daro, Ian Jones, Glenda Lappan, Daniel Pead, and Max Stephens:
for the ISDDE Working Group on Examinations and Policy

How can we help policy makers choose better exams? This question was the focus of the Assessment Working Group at the 2010 ISDDE Conference in Oxford. The group brought together high-level international expertise in assessment. It tackled issues that are central to policy makers looking for tests that, at reasonable cost, deliver valid, reliable assessments of students’ performance in mathematics and science – with results that inform students, teachers, and school systems.

This paper describes the analysis and recommendations from the group’s discussions, with references that provide further detail. It has contributed to discussions, in the US and elsewhere, on “how to do better”. We hope it will continue to be useful both to policy makers and to assessment designers.

ISDDE (2012) Black, P., Burkhardt, H., Daro, P., Jones, I., Lappan, G., Pead, D., Stephens, M.
High-stakes Examinations to Support Policy. Educational Designer, 2(5).
Retrieved from:

Periodic Assessments and Diagnostic Reports:
Case Studies in Mathematics and Literacy Intervention Program

Betsy Taleporos

This paper discusses the formative use of periodic assessments as they were developed and are in use by America’s Choice Pearson in its mathematics and language arts intervention programs. It is a practical case study of the use of design principles in creating assessments that are useful for classroom teachers and, by the nature of their design, provide diagnostic information that is instructionally relevant. The use of these measures varies with the program but all of them are designed to highlight misconceptions or common error patterns. It is important to recognize that misconceptions occur in both content domains, as they do in other domains. Uncovering misconceptions or error patterns offers tremendous insight into a formative use of assessments, since the reasons behind answering a question incorrectly can directly inform instructional practice. This approach is also underscored by some of the suggestions in the lead article in this issue of ED.

Taleporos, E. (2012) Periodic Assessments and Diagnostic Reports. Educational Designer, 2(5).
Retrieved from:

World Class Tests:
Summative Assessment of Problem-solving Using Technology

Daniel Pead

This article considers how the principled design of interactive, computer-delivered tasks can enable the assessment of problem solving and process skills in ways that would not be possible in a conventional test. The case studied is World Class Tests, a project started by the UK government in 1999, which set out to produce and deliver summative assessment tests that would reveal “submerged talent” in 9- and 13-year-old students who were not being challenged by the regular curriculum. There were two subjects: “Mathematics” and “Problem-solving in Mathematics, Science and Technology”; 50% of the test for each subject was delivered on computer. This article describes the design and development of the computer-based tests in problem-solving, and discusses some implications for the current effort to increase the emphasis on problem-solving and process skills in assessment. The author was the lead designer for the project strand working on computer-based problem-solving tasks.

Pead, D. (2012) World Class Tests: Summative Assessment of Problem-solving Using Technology. Educational Designer, 2(5).
Retrieved from:

A Designer Speaks: Malcolm Swan and Hugh Burkhardt

Curricula that value mathematical practices will only be implemented effectively when high-stakes assessments recognise and reward these aspects of performance across a range of contexts and content. In this paper we discuss the challenge of designing such tests, a set of principles for doing so well, and strategies and tactics for turning those principles into tasks and tests that work well in practice.

Swan, M., Burkhardt, H. (2012) A Designer Speaks. Educational Designer, 2(5).
Retrieved from:
ISSN 1759-1325