Assessing the performance of schools: limits and league tables

(Goldstein, H. (1997). Value added tables: the less-than-holy grail. Managing Schools Today 6: 18-19.)

by Harvey Goldstein
Institute of Education, London, WC1H 0AL
email: h.goldstein@ioe.ac.uk

Schools are increasingly judged by the performances of their students. Examination results from GCSE and A levels, National Curriculum test scores, and findings from international studies are all used to make judgements about schools and teachers. Annual league tables of examination results for all schools and colleges in England and Wales are published and are intended to be used, by parents and others, as criteria for choosing among schools. In this article I shall argue that such uses have little justification. I shall argue that league tables may have a role as screening devices: that they can help to identify the relatively few institutions which (by definition) appear to have highly untypical patterns of performance, and I shall discuss how this limited role is best realised in practice. When discussing the use of test or examination results we should, of course, remember that educational institutions have a responsibility for encouraging children's learning and development across a much wider range of areas than can reasonably be tested.

School league tables

The systematic publication of performance tables for public examination results, begun in 1992, is now an established feature of the educational system in England and Wales. At GCSE the most prominent feature is the presentation, for each school, of the percentage of students who achieve five or more passes at grades A-C; at A level an average grade point score is produced for each school. The national and local press are encouraged to present school and college results ranked in terms of these percentages and averages, and the Parents' Charter encourages people to use these tables in choosing schools and colleges. While the Tory government is responsible for introducing this system, the Labour party has provided scant criticism of it and has given little indication that it would introduce substantial amendments.

The principal argument against examination league tables is that the performance of a school is determined largely by the pre-existing achievements of the students when they enter it. Since schools differ markedly in this respect, for example some schools are highly selective, it is impossible to judge the quality of the education within a school solely in terms of such outputs. More recently the Government has accepted the inadequacy of using such crude rankings and has accepted the argument that what is required are so-called value added tables, in which there is a proper allowance for pre-existing achievements (DFE, 1995); inconsistently, it continues to promote the use of the existing unadjusted tables.

There are also, however, problems which apply to value added tables, and I will show how initial expectations that these could provide a more sensitive indicator of school performance have failed to materialise.

The first problem with reporting only a single figure, such as the overall percentage of high GCSE grades, is that schools may be differentially effective. Thus, for example, two schools may perform equally well on average, but one may have poor performance in mathematics and good performance in English, and vice versa for the other. Likewise, where value added tables are concerned, some schools may exhibit relatively good performance for initially (on intake) poorly achieving students and relatively weak performance for initially highly achieving students, and vice versa for another school (Goldstein et al., 1993).

A second problem, with both raw and value added tables, is that the percentages or scores produced for each school typically have a large margin of error or uncertainty associated with them. This problem is even more acute when individual subjects or departments within schools are the focus of interest, since the sometimes small numbers of students involved mean that very little can be said about any individual department's performance with reasonable accuracy. In the extreme case, for some A level subjects there may be only two or three students involved, and any generalisation from such small numbers, even over a number of years, is extremely hazardous.

The following figure illustrates this general problem. It is taken from a survey of some 400 schools and colleges with A level results, where value added scores are calculated by adjusting for the GCSE performance of the candidates (Goldstein and Thomas, 1996). The lines represent ranges of statistical uncertainty such that it is possible to judge two schools or colleges as truly having different value added scores only when the lines do not overlap. In this figure, for some three quarters of all possible comparisons of pairs of institutions, it is not possible to make such a separation. In other words, finely graded value added comparisons are of limited value since in most cases we will find no difference.

[Figure: A level scores: pairwise (95%) uncertainty intervals for a random sample of schools and colleges, for students in the middle (50%) GCSE score band.]
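To make the overlap criterion concrete, the following is a minimal sketch in Python of the kind of calculation behind such a figure. It uses simulated data rather than the survey analysed by Goldstein and Thomas (1996), and a simple single-level regression adjustment in place of their multilevel model; the school sizes, effect sizes and the 1.96/sqrt(2) interval multiplier are illustrative assumptions, not figures from the original study.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated data standing in for the real survey: each school has an
    # underlying "value added" effect, and each student's A level score is
    # partly explained by his or her GCSE (intake) score.
    n_schools, n_students = 60, 30
    school_effect = rng.normal(0.0, 0.15, n_schools)
    gcse = rng.normal(0.0, 1.0, (n_schools, n_students))
    a_level = 0.7 * gcse + school_effect[:, None] + rng.normal(0.0, 0.5, gcse.shape)

    # Adjust for intake: regress A level scores on GCSE scores across all
    # students, then treat each school's mean residual as its value added.
    slope, intercept = np.polyfit(gcse.ravel(), a_level.ravel(), 1)
    residuals = a_level - (intercept + slope * gcse)
    value_added = residuals.mean(axis=1)
    std_err = residuals.std(axis=1, ddof=1) / np.sqrt(n_students)

    # Pairwise 95% uncertainty intervals: half-width of about 1.96/sqrt(2)
    # standard errors, so that two schools can be judged truly different only
    # when their intervals fail to overlap (assuming roughly equal precision).
    half_width = (1.96 / np.sqrt(2)) * std_err
    lower, upper = value_added - half_width, value_added + half_width

    overlapping, pairs = 0, 0
    for i in range(n_schools):
        for j in range(i + 1, n_schools):
            pairs += 1
            if lower[i] <= upper[j] and lower[j] <= upper[i]:
                overlapping += 1

    print(f"{overlapping / pairs:.0%} of school pairs cannot be separated")

With intervals of this width most pairwise comparisons are inconclusive, which is precisely the point the figure makes: a finely graded ranking implies distinctions that the data cannot support.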

Another problem with all of these tables is that inevitably they refer to a cohort of students who began their education at those institutions many years earlier. Thus, the GCSE results published in November 1996 refer to a cohort starting at their secondary schools some five years previously; given that schools can change markedly over time, there will be additional uncertainty over the use of those results to predict the performance of future cohorts.

A further problem arises from recent research (Goldstein and Sammons, 1996) which shows that the primary school attended by a child exerts an important influence on GCSE performance and that this should therefore be taken into account when producing value added tables. There are also other factors, such as sex, ethnic origin and social class background, all of which are known to be associated with performance and progress throughout secondary schooling and which will therefore affect the interpretation of any rankings.

Finally, there are several practical problems associated with producing performance tables, perhaps the most important being that during the course of a period of schooling, say from 11 to 16 years, many students will change schools. To ignore such students is likely to introduce considerable biases into any comparisons, yet to include them properly would require enormous efforts at tracing them and recording their examination and test results.

Taking all these caveats together shows that attempts to rank educational institutions are fraught with difficulty. Even with extensive and good quality information, there are some inherent limitations which preclude the use of rankings other than as initial screening instruments to isolate possibly high or low achieving institutions or departments which can then be further investigated, bearing in mind that the information is historical. These caveats refer not only to the public presentation of comparative tables but also to the use of such information for internal purposes by individual schools, as is currently being proposed by the School Curriculum and Assessment Authority (SCAA) for Key Stage test results, where the problem of student mobility threatens to undermine the enterprise. At the very least, if comparisons among schools are to be attempted, it is very important to provide users with careful descriptions of all the limitations. If this were done, it may well be that many users would find little use for league tables.

In addition, the existence of league tables within a competitive marketplace has invested them with an extra importance. To have a high rank, or to be labelled as improving, is seen as a competitive advantage, and there will be pressure for schools and colleges to modify their behaviour to secure such an advantage. Thus, for example, a key statistic in reporting GCSE results is the percentage of subjects passed with grades A-C. By concentrating efforts on those students predicted to obtain GCSE subject grades around the C/D borderline, a school may hope to increase the proportion of its grade A-C passes, but only at the cost of relative neglect of the very low achieving or the very high achieving students. Whether intended or not, such distortions of education are hardly welcome, yet they are an inevitable consequence of such a high stakes accountability system.
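A deliberately simple illustration of that incentive, using invented numbers and the five-or-more A-C headline measure rather than any real school's figures:

    # Hypothetical school: 200 GCSE candidates, 90 of whom already reach the
    # headline threshold of five or more passes at grades A-C.
    total_candidates = 200
    meeting_threshold = 90
    print(f"Headline figure before: {meeting_threshold / total_candidates:.0%}")  # 45%

    # Coaching ten students predicted a grade D up to a grade C moves the
    # published percentage by five points...
    improved = meeting_threshold + 10
    print(f"After targeting the C/D borderline: {improved / total_candidates:.0%}")  # 50%

    # ...whereas the same effort spent on students far below, or already well
    # above, the threshold leaves the headline figure completely unchanged.

The headline statistic rewards effort only where it crosses the reporting threshold, which is why the borderline group attracts disproportionate attention.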
Likewise, OFSTED inspections are required to take account of test scores and examination results, generally without proper value added information: it is perhaps not surprising that the majority of 'failures' are schools serving educationally disadvantaged populations. Even in OFSTED's own research studies, the need to take account both of intake and of the uncertainty surrounding any inferences based upon test scores and examination results is poorly understood (Mortimore and Goldstein, 1996).

As one source of information about school performance, league tables can have value, assuming that they are properly contextualised, at the very least by adjusting for intake achievement. They may indicate to LEAs, for example, where there are potential problems, or examples of highly successful schools or departments which could usefully be followed up. They may be able to indicate, over time, where improvements or deteriorations are taking place, and they can form part of continuing research activities studying factors associated with performance. It would be unfortunate if such positive uses were to become obscured by the public promotion of league tables, value added or otherwise, as valid tools per se for judging schools and colleges.

Finally, despite the many abuses associated with league tables, there is much to be learnt from research into institutional and system differences. We need to know more about the factors associated with the success and failure of both students and institutions, but this is a painstaking, long term and complex process. Unfortunately, we appear to be passing through a phase of our culture in which those in authority, or who wish to be in authority, have little taste for confronting the complexities of the real world, preferring oversimple interpretations instead. If such interpretations are not challenged, they may distort and degrade the systems they are supposed to support and describe.

Acknowledgements

I am most grateful to Barbara Goldstein and Kate Myers for comments on an early draft.

References

DFE (1995). GCSE to GCE A/AS value added: briefing for schools and colleges. London: Department for Education.

Goldstein, H. and Thomas, S. (1996). Using examination results as indicators of school and college performance. Journal of the Royal Statistical Society, Series A, 159: 149-163.

Goldstein, H. and Sammons, P. (1996). The influence of secondary and junior schools on sixteen year examination performance. School Effectiveness and School Improvement (to appear).

Goldstein, H., Rasbash, J., Yang, M., Woodhouse, G., Pan, H., Nuttall, D. and Thomas, S. (1993). A multilevel analysis of school examination results. Oxford Review of Education, 19: 425-433.

Mortimore, P. and Goldstein, H. (1996). The teaching of reading in 45 Inner London primary schools: a critical examination of OFSTED research. London: Institute of Education. (http://www.ioe.ac.uk/publications/ofs-crit.html)