Beyond Levels Part Three: developing a mixed constitution
Assessment is always going to be an imperfect tool. Gaming the system will always be possible; students can cram for tests and then forget things that they really should remember; a test, necessarily, tells us only part of the bigger picture. Part of the problem with assessment, I think, is that we aim for the one perfect system: National Curriculum levels, at least in their deployment, are supposed to provide us with a ‘cradle-to-grave’ (well, not quite) system that we can use to assess pupils throughout their schooling.
Aristotle famously wrote on this problem, though in the context of politics and not education. The ideal form of government may well be that of monarchy in which a virtuous, benevolent individual rules in the best interests of his people. The problem, according to Aristotle, is that this model is easily corrupted into tyranny, where one individual exploits his power for his own gain. No form of government can be ideal and so our best option is to hedge our bets and adopt a mixed constitution. In politics, for Aristotle, this involved a mix of monarchy, aristocracy and constitutional government.
I would suggest the same is true in assessment. The problems of trying to have one, all-encompassing, ideal assessment system are well studied, though I think the key point to bring out is that assessment necessarily serves multiple purposes. We want, as teachers, to give helpful feedback to pupils that allows them to get better at the thing we are teaching them. We want, as parents, to know how well our children are getting on (particularly in comparison to other children!) We want, as schools, to identify pupils who are falling behind so that some kind of intervention can be made. We want, as senior managers, to use data to make judgements about teacher competence. We want, as inspectors or the government, to hold schools to account. We want, as a society, to be able to make decisions (Should I employ this person? Should I let them into university?) based on prior assessments. I simplify on all these fronts, but it is well recognised that assessment gets dragged in multiple directions and that this demands modes of assessment that are not always compatible with one another.
I think the only answer to this is to ditch completely the idea that we might have one, all-encompassing assessment system. National Curriculum levels, for example, worked tolerably well as end of Key Stage assessments that might help with school accountability measures, but they are hopeless for giving formative feedback or providing parents with a sense of how pupils are getting on (so they were a Level 5 at the end of Year 7, they were a Level 5 last term, and they’re Level 5 this term…) There is a great deal of discussion at the moment as to what will replace levels: to my mind, another version of levels would be completely inappropriate. We need something else.
What can this be? I have written a few posts on this now. In one I argued that we need to decouple formative from summative assessment. In another, I argued that we need to use our subject expertise, and not a mark scheme, to give formative feedback. In another, I argued that we need to use task specific mark schemes for marking individual pieces of work. How might this all look in a model? I don’t have the answer here, but I am, every day, getting a clearer sense as to what this might look like, and I rather suspect the kinds of people who could drive this forwards are those reading this post. So here’s my first attempt: if you would like to help me make it better, then drop me an email or, even better, start a conversation on Twitter or in the comments so that everyone else can join in.
A mixed assessment constitution
This is a slight development of my model in my article for the Historical Association.
Mode 1: frequent, low-stakes, testing of chronological knowledge
I speak here for history and I’m not sure how it works in other subjects, but I think we should have regular quizzes, timeline tests and so on as part and parcel of our teaching practice. Importantly, these tests should not just cover what was done in the previous lesson or week, but should test pieces of information learnt throughout schooling. Such tests are quick to do and easy to mark.
Purpose: diagnosing where pupils are ‘chronologically lost’.
Data produced: weekly scores out of ten.
Useful for: teacher (to diagnose ‘holes’ and possibly to plan interventions); pupils (what they need to revise).
Mode 2: milestone pieces of work at the end of a sequence of lessons
In history these are typically essays but can involve a number of other pieces of work as well. These should be marked using task-specific mark schemes. The piece can be given a summative mark (see this blog post on task-specific mark schemes) but it should be understood that a mark in this task is not connected to a mark in the previous half-term or next half-term. The marking can also be norm-referenced (i.e. how does a pupil’s work compare to others in this year and previous years).
Data produced: mark (e.g. Pass, Merit, Distinction) for a particular performance (e.g. an essay).
Useful for: parents (how is Jimmy getting on); teacher (what was not understood in the previous scheme of work); pupils (how well can I answer that particular question)
Mode 3: end of year exams
I have been having a number of discussions recently about the model of music exams. The comparison is not perfect, but I think the model is good. There is, for example, no understanding that if a pupil gets a distinction at Grade 2 piano then they will necessarily get a distinction at Grade 4. Importantly, too, music exams test more than what was covered in one year: an examiner might well ask someone to perform a simple scale in a Grade 8 exam. Music exams are in themselves mixed constitutions: pupils perform scales, aural and oral tasks, sight-reading and practised performances. I think this is a model that could work well for history, though I (and those I have been talking to) have yet to work out the fine details.
Data produced: end of year mark (e.g. Pass, Merit, Distinction)
Useful for: schools (particularly in identifying the pupils who fail so that additional support can be provided); parents (a summary of performance at the end of a year). Teachers and pupils probably do not find out anything more from this exercise in addition to what they already knew from Modes 1 and 2.
That’s the model so far. You will immediately notice that I have not included public exams in this. In part this is because I think public exams have dominated what goes on in classrooms for too long: we have ridiculous situations where children begin studying GCSE History (and its narrow range of topics) at the age of 13 for three years before taking the exam, with those three years very heavily focused on exam performance. I do not think it would be too difficult to have the system above with no public exams up to the age of 14.
The problem with my model, I think, is that it does not provide an obvious means for accountability. My end of year exams are measures of attainment and not achievement: just because someone gets a Merit in Year 8 does not mean they should necessarily get a Merit in Year 9. If this were to be turned into an accountability measure (e.g. what percentage of pupils get a distinction) then we would be back with a measure that favoured schools with socially and academically selective intakes. If someone can solve this problem for me, then do let me know.
I should add, too, that I am not an expert on the systems used to measure ‘two levels of progress’. I have never been a senior leader, or an inspector, or a data manager. I do have serious concerns about the idea that, because a pupil gets a ‘Level 5’ in Year 6, then they should be getting a ‘Level 7’ in Year 9 and an A at GCSE: at least for history, I cannot see how a meaningful linear progression model could be created which would make this possible. Again, if someone can enlighten me on this, then I’d be most grateful.
I think that providing data for accountability without a public exam is fraught with difficulty because criteria provided for assessing are only meaningful when they are task specific. It isn’t like levels actually ‘worked’ in history but people did have a sense of level even if it was an illusion to think the NC levels were a good description of that sense.
Possibly sample pieces of work for typical tasks could make this sense of level more formal, but to hook that into meaningful accountability would require external moderation, and that creates all the problems with over-guiding we see with GCSE coursework. I can only really see a return of KS3 SATs as the answer.
On the plus side the systems you describe could be used for tracking progress against the cohort well enough. I know that because my school uses data from termly ‘exams’ for this purpose.
BTW I am really enjoying these posts and think they are enormously useful – thanks!
Yes, I was thinking something similar. Public exams at end of key stage. I suspect that these don’t work unless there’s a common curriculum – but then I think there should be that anyway. I think the key to defining progress probably does rest in asking children to use what they learnt earlier in school alongside what they do now (tricky in history). Comparative questions one way forwards (e.g. How did British Empire differ from Ottoman Empire?) but obvious problems with that approach. Alternatively questions like What is an empire? force children to use range of knowledge – but at risk of becoming generic! So much to work out…
Like the idea of a mixed constitution and agree on importance of avoiding straitjacket of artificially narrowed/linear definition of progression. As I don’t think GCSEs, and perhaps not even A-levels, really measure ‘good’ history successfully, I worry about the idea of more external exams as an even worse straitjacket. The opportunity at KS3 to use our professional expertise to design curriculum and assessment to make our students better at History in a way that works for them and us in each school context is a vital opportunity before external assessment skews it all, I think!
Main thing I’m pondering is that if we can define and see so clearly what makes one student’s work ‘better history’ than another’s, which we can, and must if we are to plan and teach effectively, and can see the same student ‘getting better’ as they go through KS3, albeit in a messy kind of way, it is a shame not to be able to communicate that to the student, the parent, senior leaders, in some sort of meaningful way. I worry about the student who consistently gets a ‘pass’, or who gets a merit on one task but a pass on the next but is still making progress, as you said. A ‘distinction’ is clearly something to celebrate. With progression planned into our enquiries and outcome tasks, as it is and would be, a pass would still represent progress. But would it feel like it? Or to the student and the parent does it just look like underachievement relative to the norms set by the rest of the cohort? I’d like to find a way of expressing that sense of progress but perhaps you think (and perhaps I agree) that can only be done through the formative feedback we give about the student’s work as history?
I’ve been musing on your last comment and I do wonder if it would be less helpful to measure progress by asking students to use what they have learnt before (unless we are measuring progress in terms of cumulative acquisition of knowledge rather than its use.) Of course it is good to compare the British Empire with the Ottoman Empire but I agree there are problems. I struggle with some of the new textbooks which have quite thematic approaches with fairly bite sized chunks of detail on each area to be compared. It is not the student actually making the comparisons and I think Willingham’s argument about ‘deep structure’ is really important here. It takes real expertise in history to identify the common themes between two periods of history and I am not convinced students are making progress simply because a comparison has been drawn to their attention, especially if there is no reason to think their understanding of the new topic is any greater than of the previous.
I see clear progress between my students studying the causes of WW1 in year 9 and those looking at the same topic as part of their AS course. The reason the same student that I taught the basics in year 9 is now able to produce a better essay is partly a better understanding of how to structure essays but largely down to their increased capacity to understand and absorb complex information. However, they gain this greater capacity from general ‘academic maturation’ that comes from much more than their history teaching.
For example, I currently teach a student for AS that did not do history at GCSE. The rather depressing conclusion I have been forced to reach is that it is not really a big deal to do A level history without having done GCSE. My biggest concern for this student is that he has missed out on knowledge of the key themes and concepts (empire, dictatorship etc) he would have grasped if he had done GCSE but missing two years of history teaching has not been a disaster! I also teach A2 Politics to a very bright girl who did not study history after year 8. It is gaps in knowledge that cause her noticeable problems. If she began A level history tomorrow she would have no problems writing good essays, despite four years without history teaching. I seem to have come full circle towards the conclusion that our biggest direct impact is in the knowledge we teach rather than improvement in skills… hmmm.
The mixed constitution end of year exams and potential inspiration from music got me thinking about the model used in Denmark. Although they have reformed the way they structure subject learning at Secondary level (from what I understand it is very interdisciplinary), I believe they have kept their mixed approach to examinations, which involves a written exam and an oral exam. This has the potential to move beyond the topics the students have been studying that particular year, allow them to draw on their knowledge from previous years if they feel it is helpful to the discussion (like performing a scale learned for Grade 1 in a Grade 8 exam), and create a space for the discussion of concepts which may be hard to prompt in a written exam. Audio recording these conversations each year would be extremely interesting in terms of assessing progress across a KS.
I am not sure how the Danish exams are structured in terms of the questions and prompts the teacher presumably has prepared, or what the assessment structure looks like, but it would be interesting to look into it. Perhaps something to investigate as part of my 1c musings… I’ll keep you posted!