How research-informed practice stood up to the pseudo-science of inspection: defending an ungraded approach to the evaluation of teachers


This post tells the story of a university partnership of teacher educators’ experience of an Ofsted inspection of its Initial Teacher Education (ITE) provision in March 2013. Building on a position paper that was written at the time of the inspection, this post outlines how we defended our position on not grading our student teachers and shares some of the underpinning principles of our philosophy. Given the recent shift in Ofsted policy to remove the grading of individual lesson observations from school inspections, this post is very timely as it discusses some of the challenges faced by a department that has not only never used the Ofsted 4-point scale to assess its student teachers during observations, but resisted the use of numerical grading scales across its programmes as a whole.

Few areas of practice have caused as much debate and unrest amongst teachers in recent years as that of lesson observation, particularly graded observations and the way in which they have been used as summative assessments to rank teachers’ classroom performance against the Ofsted 4-point scale. Recent research in the field has described how graded lesson observations have become normalised, highlighting Ofsted’s hegemonic influence and control over education policy and practice (e.g. O’Leary 2013). At the same time, they have been critiqued for embodying a pseudo-scientific approach to measuring performance, as well as giving rise to a range of counterproductive consequences that ultimately militate against professional learning and teacher improvement (e.g. O’Leary and Gewessler 2014; UCU 2013). 


Unlike the vast majority of other university ITE providers in England, the post-compulsory education (PCE) department at the University of Wolverhampton has never used graded observations on its programmes. The underpinning rationale for adopting an ungraded approach to the assessment of our student teachers did not emerge arbitrarily but was developed collaboratively over a sustained period of time. This approach was underpinned by a core set of principles and shared understandings about the purpose and value of our ITE programmes, as well as being informed by empirical research into the use and impact of lesson observations in the Further Education (FE) sector and on-going discussions with our partners and student teachers. Given that our approach went against the grain of normalised models of observation, we knew that our programmes would be subject to heightened scrutiny and interrogation by Ofsted when it was announced that all the university’s ITE programmes would be inspected in March 2013.

The tone was set soon after the arrival of the inspection team on the first day when the lead inspector asked the PCE management team to rate the quality of its provision against Ofsted’s 4-point scale. This was despite the fact that the team had chosen not to apply this grading scale in its self-evaluation document (SED), which all providers were required to complete and submit at the end of each year and to which Ofsted had access before the inspection. But why did the partnership adopt this stance? It is important to emphasise that our resistance to embracing Ofsted’s ‘dominant discourses’ (Foucault 1980) and normalised practice was not based on any wilful refusal to comply or obey their authority as the regulators of quality for ITE provision, but driven by more fundamental concerns regarding the legitimacy and reliability of its assessment framework and the impact of that on teachers in training. Needless to say this epistemological positioning did not sit easily with the inspection team as it presented them with certain challenges that they were unaccustomed to, some of which are discussed further below.

Evaluating performance

It was a strongly held view across our partnership that the use of a metrics-based approach was neither the most appropriate nor the most effective means of fostering our student teachers’ development, nor indeed of measuring the level of performance required to meet the ‘pass’ threshold criteria of our programmes. Our partnership staff comprised largely experienced teacher educators who were comfortable with, and confident in, making judgements about the progress and performance of their students against the pass/fail assessment framework used on the programmes. In some ways this was akin to the notion of ‘fitness to practise’ used by other professions such as health. This ‘fitness to practise’ was initially mapped against the professional standards in use at the time in the FE sector (LLUK 2006) and more recently against the Education and Training Foundation’s (ETF) revised standards (ETF 2014). As the PCE partnership had been actively engaged with these standards through year-on-year collaborative work to revise and refine their application to its ITE programmes, there was a shared ownership of the assessment by those working on the programmes. In contrast, we were not convinced that the Ofsted 4-point scale could be applied with the same rigour, reliability and appropriateness to assess students’ attainment as our existing assessment framework and criteria, whereby students were judged either to have satisfied the criteria or not. In other words, whilst all the teacher educators working on the programmes were clear as to what constituted a pass/fail and were confident in applying these criteria accurately and consistently, the same could not be said about the interpretation and application of Ofsted’s 4-point scale.

In their study of the grading of student teachers on teaching practice placements in Scotland, Cope et al. (2003: 682) found that the success of such practice depended on ‘a clearly reliable and valid system of assessment of the practice of teaching’ and concluded that ‘the evidence available suggests that this does not currently exist’. This is not a phenomenon specific to observation as a method of assessment, but reflects widely held beliefs among key researchers in the field of assessment such as Gipps (1994: 167), who argued that ‘assessment is not an exact science and we must stop presenting it as such.’ The danger, of course, is that the inherent limitations of practices such as numerically grading performance are often overlooked and the resulting judgements are given far more weight and authority than they can realistically claim to have or indeed deserve.

Prioritising teacher development

Our ITE programmes are built on a developmental philosophy in which the student teacher’s growth is prioritised. Staff working on the programmes are committed to helping their students to develop their pedagogic skills and subject knowledge base. It was therefore their belief that judging students against a performative, numerical grading scale of 1-4 would compromise that commitment and jeopardise the supportive focus of the teacher educators’ and mentors’ relationships with their students. The partnership also benefitted from being involved in and discussing the latest research into lesson observation, as one of the university’s members of staff specialised in this particular area.

As mentioned above, recent research into the use of graded observation in FE reveals how it has become normalised as a performative tool of managerialist systems fixated on attempting to measure teacher performance rather than actually improving it (e.g. O’Leary 2012). The teacher educators and mentors in the PCE partnership saw their primary responsibility as that of helping to nurture their student teachers as effective practitioners rather than having to rank their performance according to a series of judgemental labels (e.g. ‘outstanding’, ‘inadequate’). These labels were principally designed to satisfy the needs of external agencies such as Ofsted within the marketised FE landscape, and they carried with them absolutist judgements that were inappropriate to the isolated, episodic nature of the observations on which they were based. This emphasis on measuring teacher performance was also seen as responsible for what Ball (2003) refers to as ‘inauthenticity’ in teacher behaviour and classroom performance during assessed observations. This is typically manifested in the delivery of the rehearsed or showcase lesson, as the high-stakes nature of such observations results in a reluctance to take risks for fear of being given a low grade. Teachers are thus aware of the need to ‘play the game’, which can result in them following a collective template of good practice during observation. Yet being prepared to experiment with new ways of doing things in the classroom and taking risks in one’s teaching is widely acknowledged as an important constituent of the development of both the novice and experienced teacher.

Furthermore, findings from two separate studies on observation in FE (O’Leary 2011; UCU 2013) have revealed some of the distorting and counterproductive consequences of grading on in-service teachers’ identity and professionalism. Staff in the PCE partnership, many of whom are FE teachers themselves, were determined to protect their student teachers from such consequences during their time on the programme. This did not mean, however, that they avoided discussing the practice of grading teacher performance with them or confronting some of the challenging themes and issues associated with it. On the contrary, this was a topic that was addressed explicitly through professional development modules and wider discussions about assessment and professionalism as part of the on-going critically reflective dialogues that occurred between teacher educators, mentors and students throughout the programme.

Developing critically reflective teachers

The university’s PCE ITE programmes are underpinned by the notion of critical reflection. Brookfield (1995) argues that what makes critically reflective teaching ‘critical’ is an understanding of the concept of power in a wider socio-educational context and recognition of the hegemonic assumptions that influence and shape a teacher’s practices. The PCE partnership viewed the use of graded observations as an example of one such hegemonic assumption: the perceived or intended outcomes of graded observations (e.g. improving the quality of teaching and learning, promoting a culture of continuous improvement amongst staff) were not always the actual outcomes as experienced by those involved in the observation process. And then, of course, there was the thorny issue of measurement.

The ongoing fixation with attempting to measure teacher performance is symptomatic of a wider neoliberal obsession with trying to quantify and measure all forms of human activity, epitomised in the oft-quoted saying that ‘you can’t manage what you can’t measure’, a maxim that has its roots in a marketised approach to educational improvement and one which seems to shape Ofsted’s inspection framework. During the inspection, it became apparent that the PCE partnership’s ungraded approach was problematic for Ofsted. When I asked the lead inspector directly at a feedback meeting whether the use of a grading scale was considered an essential feature of being able to measure teachers’ progress and attainment, he categorically stated that this was NOT the case and that Ofsted did not prescribe such a policy. Yet he later contradicted this in his final report by maintaining that, as the partnership did not grade, it was ‘difficult to measure student progress from year to year or the value that the training added in each cohort’. In spite of the presentation of interwoven sources of qualitative evidence (tutor/mentor/peer evaluations, self-evaluations, integrated action/development plans, critically reflective accounts etc.) illustrating these student teachers’ journeys throughout their programmes of study, the inspection team was reluctant or even unable to conceptualise the notion of improvement unless the outcome was expressed in the form of a number. And why is that? Because, of course, reading such qualitative accounts is more time-consuming and ‘messier’ than the reductive simplicity of allocating a number to something, however spurious that number might be. This reveals the extent to which ‘managerialist positivism’ (Smith and O’Leary 2013) has become an orthodoxy and Ofsted its agent of enforcement.

Despite that, the partnership team defended its practice and emphasised how the broad range of evidence captured in the combination of formative and summative assessments provided a rich tapestry of these student teachers’ progress and attainment throughout the programme, and ultimately one that was more meaningful than the allocation of a reductive number.


Ball, S. (2003) The teacher’s soul and the terrors of performativity, Journal of Education Policy, 18(2), pp. 215-228.

Brookfield, S. D. (1995) Becoming a Critically Reflective Teacher. San Francisco, CA: Jossey-Bass.  

Cope, P., Bruce, A., McNally, J. and Wilson, G. (2003) Grading the practice of teaching: an unholy union of incompatibles. Assessment & Evaluation in Higher Education, 28(6), pp. 673-684.

Education and Training Foundation (ETF) (2014) Professional Standards for Teachers and Trainers in Education and Training – England. Available at:

Foucault, M. (1980) Power/Knowledge – Selected Interviews and Other Writings 1972-1977. Brighton: The Harvester Press.

Gipps, C. (1994) Beyond Testing: Towards a Theory of Educational Assessment. London: Falmer Press.

Lifelong Learning UK (LLUK) (2006) New overarching professional standards for teachers, tutors and trainers in the lifelong learning sector. London: LLUK.

O’Leary, M. (2011) The Role of Lesson Observation in Shaping Professional Identity, Learning and Development in Further Education Colleges in the West Midlands, unpublished PhD Thesis, University of Warwick, September 2011.

O’Leary, M. (2012) Exploring the role of lesson observation in the English education system: a review of methods, models and meanings. Professional Development in Education, 38(5), pp. 791-810.

O’Leary, M. (2013) Surveillance, performativity and normalised practice: the use and impact of graded lesson observations in Further Education Colleges. Journal of Further and Higher Education, 37(5), pp. 694-714.

O’Leary, M. & Gewessler, A. (2014) Changing the culture: beyond graded lesson observations. Adults Learning, Spring 2014, 25, pp. 38-41.

Smith, R. & O’Leary, M. (2013) New Public Management in an age of austerity: knowledge and experience in further education, Journal of Educational Administration and History, 45(3), pp. 244-266.

University and College Union (UCU) (2013) Developing a National Framework for the Effective Use of Lesson Observation in Further Education. Project report, November 2013. Available at:


Published by


I work as a Reader in Education at Birmingham City University. Prior to this I was the co-founder of the Centre for Research and Development in Lifelong Education (CRADLE) and a principal lecturer in post-compulsory education at the University of Wolverhampton. I have worked as a teacher, teacher educator, head of department and educational researcher for over 20 years in colleges, schools and universities in England, Mexico and Spain. Much of my work and research is rooted in the field of teacher education, particularly exploring the relationship between education policy and the continuous professional development of teachers. I am well known for my work on classroom observation and am regarded as one of the first educational researchers in the UK to investigate and critique the practice of graded lesson observations. I am also the author of 'Classroom Observation: A Guide to the Effective Observation of Teaching and Learning' (Abingdon: Routledge 2014). Qualifications: PhD in Education (University of Warwick); MA in Applied Linguistics & ELT (King's College London); PGCE in Spanish & ESL (UCE Birmingham); RSA DipTEFLA (King's College London & British Council Mexico City); BA (Hons) in French & Spanish (University of Southampton).
