Unpacking the Primary Assessment and Accountability Reforms

This post examines the Government response to consultation on primary assessment and accountability.

It sets out exactly what is planned, what further steps will be necessary to make these plans viable and the implementation timetable.

It is part of a sequence of posts I have devoted to this topic.

Earlier posts in the series include The Removal of National Curriculum Levels and the Implications for Able Pupils’ Progression (June 2012) and Whither National Curriculum Assessment Without Levels? (February 2013).

The consultation response contrives to be both minimal and dense. It is necessary to unpick each element carefully, to consider its implications for the package as a whole and to reflect on how that package fits in the context of wider education reform.

I have organised the post so that it considers sequentially:

  • The case for change, including the aims and core principles, to establish the policy frame for the planned reforms.
  • The impact on the assessment experience of children aged 2-11 and how that is likely to change.
  • The introduction of baseline assessment in Year R.
  • The future shape of end of KS1 and end of KS2 assessment respectively.
  • How the new assessment outcomes will be derived, reported and published.
  • The impact on floor standards.

Towards the end of the post I have also provided a composite ‘to do’ list containing all the declared further steps necessary to make the plan viable, with a suggested deadline for each.

And the post concludes with an overall judgement on the plans, in the form of a summary of key issues and unanswered questions arising from the earlier commentary. Impatient readers may wish to jump straight to that section.

I am indebted to Warwick Mansell for his previous post on this topic. I shall try hard not to parrot the important points he has already made, though there is inevitably some overlap.

Readers should also look to Michael Tidd for more information about the shape and content of the new tests.

What has been published?

The original consultation document ‘Primary assessment and accountability under the new national curriculum’ was published on 17 July 2013 with a deadline for response of 17 October 2013. At that stage the Government’s response was due ‘in autumn 2013’.

The response was finally published on 27 March 2014, some four months later than planned and only five months prior to the introduction of the revised national curriculum which these arrangements are designed to support.

It is likely that the Government will have decided that 31 March was the latest feasible date to issue the response, so they were right up against the wire.

It was accompanied by:

  • A press release which focused on the full range of assessment reforms – for primary, secondary and post-16.

Shortly before the response was published, the reply to a Parliamentary question asked on 17 March explained that test frameworks were expected to be included within it:

‘Guidance on the nature of the revised key stage 1 and key stage 2 tests, including mathematics, will be published by the Standards and Testing Agency in the form of test framework documents. The frameworks are due to be released as part of the Government’s response to the primary assessment and accountability consultation. In addition, some example test questions will be made available to schools this summer and a full sample test will be made available in the summer of 2015.’ (Col 383W)

In the event, these documents – seven in all – did not appear until 31 March and there was no reference to any of the three commitments above in what appeared on 27 March.

Finally, the Standards and Testing Agency published on 3 April a guidance page on national curriculum tests from 2016. At present it contains very little information but further material will be added as and when it is published.

Partly because the initial consultation document was extremely ‘drafty’, the reaction of many key external respondents to the consultation was largely negative. One imagines that much of the period since 17 October has been devoted to finding the common ground.

Policy makers will have had to do most of their work after the consultation document was issued because they were not ready beforehand.

But the length of the delay in issuing the response would suggest that they also encountered significant dissent amongst internal stakeholders – and that the eventual outcome is likely to be a compromise of sorts between these competing interests.

Such compromises tend to have observable weaknesses and/or put off problematic issues for another day.

A brief summary of consultation responses is included within the Government’s response. I will refer to this at relevant points during the discussion below.

The Case for Change

Aims

The consultation response begins – as did the original consultation document – with a section setting out the case for reform.

It provides a framework of aims and principles intended to underpin the changes that are being set in place.

The aims are:

  • The most important outcome of primary education is to ‘give as many pupils as possible the knowledge and skills to flourish in the later phases of education’. This is a broader restatement of the ‘secondary ready’ concept adopted in the original consultation document.
  • The primary national curriculum and accountability reforms ‘set high expectations so that all children can reach their potential and are well prepared for secondary school’. Here the ‘secondary ready’ hurdle is more baldly stated. The parallel notion is that all children should do as well as they can – and that they may well achieve different levels of performance. (‘Reach their potential’ is disliked by some because it is considered to imply a fixed ceiling for each child and fixed mindset thinking.)
  • To raise current threshold expectations. These are set too low, since too few learners (47%) with KS2 level 4C in both English and maths go on to achieve five or more GCSE grades A*-C including English and maths, while 72% of those with KS2 level 4B do so. So the new KS2 bar will be set at this higher level, but with the expectation that 85% of learners per school will jump it, 13% more than the current national figure. Meanwhile the KS4 outcome will also change, to achievement across eight GCSEs rather than five, quite probably at a more demanding level than the present C grade. In the true sense, this is a moving target.
  • ‘No child should be allowed to fall behind’. This is a reference to the notion of ‘mastery’ in its crudest sense, though the model proposed will not deliver this outcome. We have noted already a reference to ‘as many children as possible’ and the school-level target – initially at least – will be set at 85%. In reality, a significant minority of learners will progress more slowly and will fall short of the threshold at the end of KS2.
  • The new system ‘will set a higher bar’ but ‘almost all pupils should leave primary school well-placed to succeed in the next phase of their education’. Another nuanced version of ‘secondary ready’ is introduced. This marks a recognition that some learners will not jump over the higher bar. In the light of subsequent references to 85%, ‘almost all’ is rather over-optimistic.
  • ‘We also want to celebrate the progress that pupils make in schools with more challenging intakes’. Getting ‘nearly all pupils to meet this standard…’ (the standard of secondary readiness?) ‘…is very demanding, at least in the short term’. There will therefore be recognition of progress ‘from a low starting point’ – even though these learners have, by definition, been allowed to fall behind and will continue to do so.

So there is something of a muddle here, no doubt engendered by a spirit of compromise.

The black and white distinction of ‘secondary-readiness’ has been replaced by various verbal approximations, but the bottom line is that there will be a defined threshold denoting preparedness that is pitched higher than the current threshold.

And the proportion likely to fall short is downplayed – there is apparent unwillingness at this stage to acknowledge the norm that up to 15% of learners in each school will undershoot the threshold – substantially more in schools with ‘challenging intakes’.

What this boils down to is a desire that all will achieve the new higher hurdle – and that all will be encouraged to exceed it if they can – tempered by recognition that this is presently impossible. No child should be allowed to fall behind but many inevitably will do so.

It might have been better to express these aims in the form of future aspirations – and our collective efforts to bridge the gap between present reality and those ambitious aspirations.

Principles

The section concludes with a new set of principles governing pedagogy, assessment and accountability:

  • ‘Ongoing, teacher-led assessment is a crucial part of effective teaching;
  • Schools should have the freedom to decide how to teach their curriculum and how to track the progress that pupils make;
  • Both summative teacher assessment and external testing are important;
  • Accountability is key to a successful school system, and therefore must be fair and transparent;
  • Measures of both progress and attainment are important for understanding school performance; and
  • A broad range of information should be published to help parents and the wider public know how well schools are performing.’

These are generic ‘motherhood and apple pie’ statements and so largely uncontroversial. I might have added a seventh – that schools’ in-house assessment and reporting systems must complement summative assessment and testing, including by predicting for parents the anticipated outcomes of the latter.

Perhaps interestingly, there is no repetition of the defence for the removal of national curriculum levels. Instead, the response concentrates on the support available to schools.

It mentions discussion with an ‘expert group on assessment’ about ‘how to support schools to make best use of the new assessment freedoms’. We are not told the membership of this group (which, as far as I know, has not been made public) or the nature of its remit.

There is also a link to information about the Assessment Innovation Fund, which will provide up to 10 grants of up to £10,000 which schools and organisations can use to develop packages that share their innovative practice with others.

 

Children’s experience of assessment up to the end of KS2

The response mentions the full range of national assessments that will impact on children between the ages of two and 11:

  • The statutory progress check at two years of age.
  • A new baseline assessment undertaken within a few weeks of the start of Year R, introduced from September 2015.
  • An Early Years Foundation Stage Profile undertaken in the final term of the year in which children reach the age of five. A revised profile was introduced from September 2012. It is currently compulsory but will be optional from September 2016. The original consultation document said that the profile would no longer be moderated and data would no longer be collected. Neither of those commitments is repeated here.
  • The Phonics Screening Check, normally undertaken in Year 1. The possibility of making these assessments non-statutory for all-through primary schools, suggested in the consultation document, has not been pursued: 53% of respondents opposed this idea, whereas 32% supported it.
  • End of KS1 assessment and
  • End of KS2 assessment.

So a total of six assessments are in place between the ages of two and 11. At least four – and possibly five – will be undertaken between ages two and seven.

It is likely that early years’ professionals will baulk at this amount of assessment, no matter how sensitively it is designed. But the cost and inefficiency of the model are also open to criticism.

The Reception Baseline

Approach

The original consultation document asked whether:

  • KS1 assessment should be retained as a baseline – 45% supported this and 41% were opposed.
  • A baseline check should be introduced at the start of Reception – 51% supported this and 34% were opposed.
  • Such a baseline check should be optional – 68% agreed and 19% disagreed.
  • Schools should be allowed to choose from a range of commercially available materials for this baseline check – 73% said no and only 15% said yes.

So, whereas views were mixed on where the baseline should be set, there were substantial majorities in favour of any Year R baseline check being optional and following a single, standard national format.

The response argues that Year R is the most sensible point at which to position the baseline since that is:

‘…the earliest point that nearly all children are in school’.

What happens in respect of children who are not in school at this point is not discussed.

There is no explanation of why the Government has disregarded the clear majority of respondents by choosing to permit a range of assessment approaches. In the absence of a stated rationale, one can only speculate about the motive.

The response says ‘most’ are likely to be administered by teaching staff, leaving open the possibility that some options will be administered externally.

Design

Such assessments will need to be:

‘…strong predictors of key stage 1 and key stage 2 attainment, whilst reflecting the age and abilities of children in Reception’.

Presumably this means predictors of attainment in each of the three core subjects – English, maths and science – rather than any broader notion of attainment. The challenge inherent in securing a reasonable predictor of attainment across these domains seven years further on in a child’s development should not be under-estimated.

The response points out that such assessment tools are already available for use in Year R, some are used widely and some schools have long experience of using them. But there is no information about how many of these are deemed to meet already the description above.

In any case, new criteria need to be devised which all such assessments must meet. Some degree of modification will be necessary for all existing products and new products will be launched to compete in the market.

There is an opportunity to use this process to ratchet up the Year R Baseline beyond current expectations, so matching the corresponding process at the end of KS2. The consultation response says nothing about whether this is on the cards.

Interestingly, in his subsequent ‘Unsure start’ speech about early years inspection, HMCI refers to:

‘…the government’s announcement last week that they will be introducing a readiness-for-school test at age four. This is an ideal opportunity to improve accountability. But I think it should go further.

I hope that the published outcomes of these tests will be detailed enough to show parents how their own child has performed. I fear that an overall school grade will fail to illuminate the progress of poor children. I ask government to think again about this issue.’

The terminology – ‘readiness for school’ is markedly blunter than the references to a reception baseline in the consultation response. There is nothing in the response about the outcomes of these tests being published, nor anything about ‘an overall school grade’.

Does this suggest that decisions have already been made that were not communicated in the consultation response?

Timeline, options, questions

Several pieces of further work are required in short order to inform schools and providers about what will be required – and to enable both to prepare for introduction of the assessments from September 2015. All these should feature in the ‘to do’ list below.

One might reasonably have hoped that – especially given the long delay – some attempt might have been made to publish suggested draft criteria for the baseline alongside the consultation response. The fact that even preliminary research into existing practice has not been undertaken is a cause for concern.

Although the baseline will be introduced from September 2015, there is a one-year interim measure which can only apply to all-through primary schools:

  • They can opt out of the Year R baseline measure entirely, relying instead on KS1 outcomes as their baseline; or
  • They can use an approved Year R baseline assessment and have this cohort’s progress measured at the end of KS2 (which will be in 2022) by either the Year R or the KS1 baseline, whichever demonstrates the most progress.

In the period up to and including 2021, progress will continue to be measured from the end of KS1. So learners who complete KS2 in 2021 for example will be assessed on progress since their KS1 tests in 2017.
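The cohort arithmetic behind these dates can be checked with a minimal sketch. The function name and dictionary keys are illustrative only, and the sketch assumes a child enters Year R in September and moves up one year group each year without repeating.

```python
def cohort_milestones(reception_start_year: int) -> dict[str, int]:
    """Calendar years of the key assessment points for one cohort.

    Assumes entry to Year R in September of `reception_start_year`
    and one year group per school year (an illustrative simplification).
    """
    return {
        "year_r_baseline": reception_start_year,  # autumn term of Year R
        "end_of_ks1": reception_start_year + 3,   # summer term of Year 2
        "end_of_ks2": reception_start_year + 7,   # summer term of Year 6
    }


first_baseline_cohort = cohort_milestones(2015)
last_ks1_only_cohort = cohort_milestones(2014)

print(first_baseline_cohort["end_of_ks2"])  # 2022
print(last_ks1_only_cohort["end_of_ks1"], last_ks1_only_cohort["end_of_ks2"])  # 2017 2021
```

So the first cohort with a Year R baseline reaches the end of KS2 in 2022, while the cohort before it is measured from KS1 in 2017 to KS2 in 2021.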

Junior and middle schools will also continue to use a KS1 baseline.

Arrangements for infant and first schools are still to be determined, another rather worrying omission at this stage in proceedings.

It is also clear that all-through primary schools (and infant/first schools?) will continue to be able to opt out from the Year R baseline from September 2016 onwards, since the response says:

‘Schools that choose not to use an approved baseline assessment from 2016 will be judged on an attainment floor standard alone’.

Hence the Year R baseline check is entirely optional and a majority of schools could choose not to undertake it.

However, they would need to be confident of meeting the demanding 85% attainment threshold in the floor standard.
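The choice facing such schools can be sketched as a simple decision rule. This is a hedged illustration rather than the Government's definition: it assumes that a school using an approved baseline clears the floor on either attainment or progress, that 'meeting' the threshold means 85% or more, and it treats 'sufficient progress' as a yes/no input because the response has not yet defined it. All names are hypothetical.

```python
ATTAINMENT_THRESHOLD = 85.0  # per cent of pupils at the expected standard


def above_floor(pct_at_expected_standard: float,
                used_approved_baseline: bool,
                made_sufficient_progress: bool = False) -> bool:
    """Return True if the school is above the floor under the assumed rule."""
    meets_attainment = pct_at_expected_standard >= ATTAINMENT_THRESHOLD
    if not used_approved_baseline:
        # No baseline means no progress measure: attainment alone decides.
        return meets_attainment
    # Assumption: with a baseline, either measure is enough.
    return meets_attainment or made_sufficient_progress


print(above_floor(80.0, used_approved_baseline=False))  # False
print(above_floor(80.0, used_approved_baseline=True,
                  made_sufficient_progress=True))       # True
```

On these assumptions, opting out only makes sense for a school that expects to stay at or above 85% every year.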

They might be wise to postpone that decision until the pitch of the progress expectation is determined. For neither the Year R baseline nor the amount of progress that learners are expected to make from their starting point in Year R is yet defined.

This latter point applies at the average school level (for the purposes of the floor standard) and in respect of the individual learner. For example, if a four year-old is particularly precocious in, say, maths, what scaled scores must they register seven years later to be judged to have made sufficient progress?

There are several associated questions that follow on from this.

Will it be in schools’ interests to acknowledge that they have precocious four year-olds at all? Will the Year R baseline reinforce the tendency to use Reception to bring all children to the same starting point in readiness for Year 1, regardless of their precocity?

Will the moderation arrangements be hard-edged enough to stop all-through primary schools gaming the system by artificially depressing their baseline outcomes?

Who will undertake this moderation and how much will it cost? Will not the decision to permit schools to choose from a range of measures unnecessarily complicate the moderation process and add to the expense?

The consultation response neither poses these questions nor supplies answers.

The future shape of end KS1 and end KS2 assessment

What assessment will take place?

At KS1 learners will be assessed in:

  • Reading – test plus teacher assessment
  • Writing – test (of grammar, punctuation and spelling) plus teacher assessment
  • Speaking and listening – teacher assessment
  • Maths – test plus teacher assessment
  • Science – teacher assessment

The new test of grammar, punctuation and spelling did not feature in the original consultation and has presumably been introduced to strengthen the marker of progress to which four year-olds should aspire at age seven.

The draft test specifications for the KS1 tests in reading, GPS and maths outline the requirements placed on the test developers, so it is straightforward to compare the specifications for reading and maths with the current tests.

The GPS test will include a 20 minute written grammar and punctuation task; a 20 minute test comprising short grammar, punctuation and vocabulary questions; and a 15 minute spelling task.

There is a passing reference to further work on KS1 moderation which is included in the ‘to do’ list below.

At KS2 learners will be assessed in:

  • Reading – test plus teacher assessment
  • Writing – test (of grammar, spelling and punctuation) plus teacher assessment
  • Maths – test plus teacher assessment
  • Science – teacher assessment plus a science sampling test.

Once again, the draft test specifications – reading, GPS, maths and science sampling – describe the shape of each test and the content they are expected to assess.

I will leave it to experts to comment on the content of the tests.

Academies and free schools

It is important to note that the framing of this content – by means of detailed ‘performance descriptors’ – means that the freedom academies and free schools enjoy in departing from the national curriculum will be largely illusory.

I raised this issue back in February 2013:

  • ‘We know that there will be a new grading system in the core subjects at the end of KS2. If this were to be based on the ATs as drafted, it could only reflect whether or not learners can demonstrate that they know, can apply and understand ‘the matters, skills and processes specified’ in the PoS as a whole. Since there is no provision for ATs that reflect sub-elements of the PoS – such as reading, writing, spelling – grades will have to be awarded on the basis of separate syllabuses for end of KS2 tests associated with these sub-elements.
  • This grading system must anyway be applied universally if it is to inform the publication of performance tables. Since some schools are exempt from National Curriculum requirements, it follows that grading cannot be derived directly from the ATs and/or the PoS, but must be independent of them. So this once more points to end of KS2 tests based on entirely separate syllabuses which nevertheless reflect the relevant part of the draft PoS. The KS2 arrangements are therefore very similar to those planned at KS4.’

I have more to say about the ‘performance descriptors’ below.

Single tests for all learners

A critical point I want to emphasise at this juncture – not mentioned at all in the consultation document or the response – is the test development challenge inherent in producing single papers suitable for all learners, regardless of their attainment.

We know from the response that the P-scales will be retained for those who are unable to access the end of key stage tests. (Incidentally, the content of the P-scales will remain unchanged so they will not be aligned with the revised national curriculum, as suggested in the consultation document.)

There will also be provision for pupils who are working ‘above the P-scales but below the level of the test’.

Now the P-scales are for learners working below level 1 (in old currency). This is the first indication I have seen that the tests may not cater for the full range from Level 1-equivalent to Level 6-equivalent and above. But no further information is provided.

It may be that this is a reference to learners who are working towards level 1 (in old currency) but do not have SEN.

The 2014 KS2 ARA booklet notes:

‘Children working towards level 1 of the national curriculum who do not have a special educational need should be reported to STA as ‘W’ (Working below the level). This includes children who are working towards level 1 solely because they have English as an additional language. Schools should use the code ‘NOTSEN’ to explain why a child working towards level 1 does not have P scales reported. ‘NOTSEN’ replaces the code ‘EAL’ that was used in previous years.’

The consultation document said:

‘We do not propose to develop an equivalent to the current level 6 tests, which are used to challenge the highest-attaining pupils. Key stage 2 national curriculum tests will include challenging material (at least of the standard of the current level 6 test) which all pupils will have the opportunity to answer, without the need for a separate test’.

The draft test specifications make it clear that the tests should:

‘provide a suitable challenge for all children and give every child the opportunity to achieve as high a standard…as possible.’

Moreover:

‘In order to improve general accessibility for all children, where possible, questions will be placed in order of difficulty.’

The development of single tests covering this span of attainment – from level 1 to above level 6 – tests in which the questions are posed in order of difficulty and even the highest attainers must answer all questions – seems to me to be a very tall order, especially in maths.

More than that, I urgently need persuading that this is not a waste of high attainers’ time and poor assessment practice.

How assessment outcomes will be derived, reported and published

Deriving assessment outcomes

One of the reasons cited for replacing national curriculum levels was the complexity of the system and the difficulty parents experienced in understanding it.

The Ministerial response to the original report from the National Curriculum Expert Panel said:

‘As you rightly identified, the current system is confusing for parents and restrictive for teachers. I agree with your recommendation that there should be a direct relationship between what children are taught and what is assessed. We will therefore describe subject content in a way which makes clear both what should be taught and what pupils should know and be able to do as a result.’

The consultation document glossed the same point thus:

‘Schools will be able to focus their teaching, assessment and reporting not on a set of opaque level descriptions, but on the essential knowledge that all pupils should learn.’

However, the consultation response introduces for the first time the concept of a ‘performance descriptor’.

This term is defined in the glossaries at the end of each draft test specification:

‘Description of the typical characteristics of children working at a particular standard. For these tests, the performance descriptor will characterise the minimum performance required to be working at the appropriate standard for the end of the key stage.’

Essentially this is a collective term for something very similar to old-style level descriptions.

Except that, in the case of the tests, they are all describing the same level of performance.

They have been rendered necessary by the odd decision to provide only a single generic attainment target for each programme of study. But, as noted back in February 2013, the test developers need a more sophisticated framework on which to base their assessments.

According to the draft test specifications they will also be used:

‘By a panel of teachers to set the standards on the new tests following their first administration in May 2016’.

When it comes to teacher assessment, the consultation response says:

‘New performance descriptors will be introduced to inform the statutory teacher assessments at the end of key stage one [and]…key stage two.’

But there are two models in play simultaneously.

In four cases – science at KS1 and reading, maths and science at KS2 – there will be ‘a single performance descriptor of the new expected standard’, in the same way as there are in the test specifications.

But in five cases – reading, writing, speaking and listening and maths at KS1; and writing at KS2:

‘teachers will assess pupils as meeting one of several performance descriptors’.

These are old-style level descriptors by another name. They perform exactly the same function.

The response says that the KS1 teacher assessment performance descriptors will be drafted by an expert group for introduction in autumn 2014. It does not mention whether KS2 teacher assessment performance descriptors will be devised in the same way and to the same timetable.

Reporting assessment outcomes to parents

When it comes to reporting to parents, there will be three different arrangements in play at both KS1 and KS2:

  • Test results will be reported by means of scaled scores (of which more in a moment).
  • One set of teacher assessments will be reported by selecting from a set of differentiated performance descriptors.
  • A second set of teacher assessments will be reported according to whether learners have achieved a single threshold performance descriptor.

This is already significantly more complex than the previous system, which applied the same framework of national curriculum levels across the piece.

It seems that KS1 test outcomes will be reported as straightforward scaled scores (though this is only mentioned on page 8 of the main text of the response and not in Annex B, which compares the new arrangements with those currently in place).

But, in the case of KS2:

‘Parents will be provided with their child’s score alongside the average for their school, the local area and nationally. In the light of the consultation responses, we will not give parents a decile ranking for their child due to concerns about whether decile rankings are meaningful and their reliability at individual pupil level.’

The consultation document proposed a tripartite reporting system comprising:

  • A scaled score for each KS2 test, derived from raw test marks and built around a ‘secondary readiness standard’. This standard would be set at a scaled score of 100, which would remain unchanged. It was suggested for illustrative purposes that a scale based on the current national curriculum tests might run from 80 to 130.
  • An average scaled score in each test for other pupils nationally with the same prior attainment at the baseline. Comparison of a learner’s scaled score with the average scaled score would show whether they had made more or less progress than the national average.
  • A national ranking in each test – expressed in terms of deciles – showing how a learner’s scaled score compared with the range of performance nationally.

The last of these has been dispensed with, given that 35% of consultation respondents disagreed with it, but there were clearly technical reservations too.

In its place, the ‘value added’ progress measure has been expanded so that there is a comparison with other pupils in the learner’s own school and the ‘local area’ (which presumably means local authority). This beefs up the progression element in reporting at the expense of information about the attainment level achieved.

So at the end of KS2 parents will receive scaled scores and three average scaled scores for each of reading, writing and maths – twelve scores in all – plus four performance descriptors, of which three will be singleton threshold descriptors (reading, maths and science) and one will be selected from a differentiated series (writing). That makes sixteen assessment outcomes altogether, provided in four different formats.
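The tally can be laid out explicitly. The sketch below simply enumerates the items described in the paragraph above; the function name and the labels are mine, not the response's.

```python
def ks2_report_tally() -> dict[str, int]:
    """Count the separate outcomes a parent receives at the end of KS2."""
    tested_subjects = ["reading", "writing", "maths"]
    # Each tested subject yields the child's own score plus three averages.
    score_types = ["pupil", "school average", "local average", "national average"]
    scores = [(subject, kind) for subject in tested_subjects for kind in score_types]

    single_threshold = ["reading", "maths", "science"]  # one descriptor: met or not
    differentiated = ["writing"]                        # one chosen from a series
    descriptors = single_threshold + differentiated

    formats = {"scaled score", "average scaled score",
               "single threshold descriptor", "differentiated descriptor"}
    return {
        "scores": len(scores),
        "descriptors": len(descriptors),
        "total": len(scores) + len(descriptors),
        "formats": len(formats),
    }


print(ks2_report_tally())
# {'scores': 12, 'descriptors': 4, 'total': 16, 'formats': 4}
```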

The consultation response tells us nothing more about the range of the scale that will be used to provide scaled scores. We do not even know if it will be the same for each test.

The draft test specifications say that:

‘The exact scale for the scaled scores will be determined following further analysis of trialling data. This will include a full review of the reporting of confidence intervals for scaled scores.’

But they also contain this worrying statement:

‘The provision of a scaled score will aid in the interpretation of children’s performance over time as the scaled score which represents the expected standard will be the same year on year. However, at the extremes of the scaled score distribution, as is standard practice, the scores will be truncated such that above and below a certain point, all children will be awarded the same scaled score in order to minimise the effect for children at the ends of the distribution where the test is not measuring optimally.’

This appears to suggest that scaled scores will not accurately describe performance at the extremes of the distribution, because the tests will not accurately measure such performance. This might be describing a statistical truism, but it again raises the question whether the highest attainers are being short-changed by the selected approach.
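The effect of truncation is easy to illustrate. In the sketch below the cut-off points of 80 and 130 are borrowed from the purely illustrative scale in the consultation document; the real truncation points have not been published, and the function name is hypothetical.

```python
def reported_scaled_score(untruncated: float,
                          lower: int = 80,
                          upper: int = 130) -> int:
    """Clamp a scaled score to the reporting range.

    `lower` and `upper` are placeholder values, not published figures.
    """
    return int(min(max(round(untruncated), lower), upper))


for score in (76, 104, 134, 141):
    print(score, "->", reported_scaled_score(score))
# 76 -> 80
# 104 -> 104
# 134 -> 130
# 141 -> 130
```

Two learners whose underlying performance differs markedly at the top of the range (134 and 141 here) would receive the same reported score, which is the concern for the highest attainers.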


Publication of assessment outcomes

The response introduces the idea that ‘a suite of indicators’ will be published on each school’s own website in a standard format. These are:

  • The average progress made by pupils in reading, writing and maths. (This is presumably relevant to both KS1 and KS2 and to both tests and teacher assessment.)
  • The percentage of pupils reaching the expected standard in reading, writing and mathematics at the end of key stage 2. (This is presumably relevant to both tests and teacher assessment.)
  • The average score of pupils in their end of key stage 2 assessments. (The final word suggests teacher assessment as well as tests, even though there will not be a score from the former.)
  • The percentage of pupils who achieve a high score in all areas at the end of key stage 2. (Does ‘all areas’ imply something more than statutory tests and teacher assessments? Does it mean treating each area separately, or providing details only of those who have achieved high scores across all areas?)

The last of these is the only reference to high attainers in the entire response. It does not give any indication of what will count as a high score for these purposes. Will it be designed to catch the top third of attainers or something more demanding, perhaps equivalent to the top decile?

A decision has been taken not to report the outcomes of assessment against the P-scales because the need to contextualise such information is perceived to be relatively greater.

And, as noted above, HMCI let slip the fact that the outcomes of reception baselines would also be published, but apparently in the form of a single overall grade.

We are not told when these requirements will be introduced, but presumably they must be in place to report the outcomes of assessments undertaken in spring 2016.

Additionally:

‘So that parents can make comparisons between schools, we would like to show each school’s position in the country on these measures and present these results in a manner that is clear for all audiences to understand. We will discuss how best to do so with stakeholders, to ensure that the presentation of the data is clear, fair and statistically robust.’

This suggests inclusion in the 2016 School Performance Tables, but this is not stated explicitly.

Indeed, apart from references to the publication of progress measures in the 2022 Performance Tables, there is no explicit coverage of their contribution in the response, nor any reference to the planned supporting data portal, or how data will be distributed between the Tables and the portal.

The original consultation document gave several commitments on the future content of performance tables. They included:

  • How many of a school’s pupils are amongst the highest attaining nationally, by showing the percentage of pupils achieving a high scaled score in each subject.
  • Measures to show the attainment and progress of learners attracting the Pupil Premium.
  • Comparison of each school’s performance with that of schools with similar intakes.

None are mentioned here, nor are any of the suggestions advanced by respondents taken up.

Floor standards

Changes are proposed to the floor standards with effect from September 2016.

This section of the response begins by committing to:

‘…a new floor standard that holds schools to account both on the progress that they make and on how well their pupils achieve.’

But the plans set out subsequently do not meet this description.

The progress element of the current floor standard relates to any of reading, writing or mathematics but, under the new floor standard, it will relate to all three of these together.

An all-through primary school must demonstrate that:

‘…pupils make sufficient progress at key stage 2 from their starting point…’

As we have noted above, all-through primaries can opt to use the KS1 baseline or the Year R baseline in 2015. Moreover, from 2016 they can choose not to use the Year R baseline and be assessed solely on the attainment measure in the floor standards (see below).

Junior and middle schools obviously apply the KS1 baseline, while arrangements for infant and first schools have yet to be finalised.

What constitutes ‘sufficient progress’ is not defined. Annex C of the response says:

‘For 2016 we will set the precise extent of progress required once key stage 2 tests have been sat for the first time.’

Presumably this will be progress from KS1 to KS2, since progress from the Year R baseline will not be introduced until 2023.

The attainment element of the new floor standards is for schools to have 85% or more of pupils meeting the new, higher threshold standard at the end of KS2 in all of reading, writing and maths. The text says explicitly that this threshold is ‘similar to a level 4b under the current system’.

Annex C clarifies that this will be judged by the achievement of a scaled score of 100 or more in each of the reading and maths tests, plus teacher assessment that learners have reached the expected standard in writing (so the GPS test does not count in the same way, simply informing the teacher assessment).

As noted above, this is a far bigger ask than the current reference to 65% of learners meeting the expected (and lower 4c) standard. The summary at the beginning of the response refers to it as ‘a challenging aspiration’:

‘Over time we expect more and more schools to achieve this standard.’

The statement in the first paragraph of this section of the response led us to believe that these two requirements – for progress and attainment respectively – would be combined, so that schools would be held to account for both (unless, presumably, they exercised their right to opt out of the Year R baseline assessment).

But this is not the case. Schools need only achieve one or the other.

It follows that schools with a very high performing intake may exceed the floor standards on the basis of all-round high attainment alone, regardless of the progress made by their learners.
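The either/or logic can be sketched as follows. This is my own illustration rather than anything in the response: the 85% threshold comes from the text, but the progress figures, the `sufficient` cut-off and the reading that progress must be sufficient in each of the three subjects separately are all assumptions, because ‘sufficient progress’ has not been defined.

```python
from dataclasses import dataclass

ATTAINMENT_THRESHOLD = 85.0  # % of pupils meeting the expected standard (from the response)
SUBJECTS = ("reading", "writing", "maths")


@dataclass
class SchoolResults:
    pct_meeting_standard: float  # % meeting the standard in all three subjects
    progress: dict               # subject -> progress measure (HYPOTHETICAL units)


def meets_attainment(school):
    return school.pct_meeting_standard >= ATTAINMENT_THRESHOLD


def meets_progress(school, sufficient):
    # ASSUMPTION: progress must reach the cut-off in every one of the three subjects.
    return all(school.progress[s] >= sufficient for s in SUBJECTS)


def above_floor(school, sufficient):
    # Schools need only achieve one element or the other, not both.
    return meets_attainment(school) or meets_progress(school, sufficient)


# Invented figures: a high-attaining intake making poor progress...
high_intake = SchoolResults(92.0, {"reading": -1.5, "writing": -0.8, "maths": -2.0})
# ...and a lower-attaining school that falls short on progress in one subject.
one_short = SchoolResults(70.0, {"reading": 0.4, "writing": 0.1, "maths": -0.2})

print(above_floor(high_intake, sufficient=0.0))  # True
print(above_floor(one_short, sufficient=0.0))    # False
```

On these invented figures the first school clears the floor on attainment alone despite weak progress in every subject, while the second misses it because of a shortfall in a single subject.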

The reason for this provision is unclear, though one suspects that schools with an extremely high attaining intake, whether at Reception or Year 3, will be harder pressed to achieve sufficient progress, presumably because some ceiling effects come into play at the end of KS2.

This in turn might suggest that the planned tests do not have sufficient headroom for the highest attainers, even though they are supposed to provide similar challenge to level 6 and potentially extend beyond it.

Meanwhile, schools with less than stellar attainment results will be obliged to follow the progress route to jump the floor standard. This too will be demanding because all three domains will be in play.

There will have been some internal modelling undertaken to judge how many schools would be likely to fall short of the floor standards given these arrangements and it would be very useful to know these estimates, however unreliable they prove to be.

In their absence, one suspects that the majority of schools will be below the floor standards, at least initially. That of course materially changes the nature and purpose of the standards.

To Do List

The response and the draft specifications together contain a long list of work to be carried out over the next two years or so. I have included below my best guess as to the latest possible date for each decision to be completed and communicated:

  • Decide how progress will be measured for infants and first schools between the Year R baseline and the end of KS1 (April 2014)
  • Make available to schools a ‘small number’ of sample test questions for each key stage and subject (Summer 2014)
  • Work with experts to establish the criteria for the Year R baseline (September 2014)
  • KS1 [and KS2?] teacher assessment performance descriptors to be drafted by an expert group (September 2014)
  • Complete and report outcomes of a study with schools that already use Year R baseline assessments (December 2014)
  • Decide how Year R baseline assessments will be moderated (December 2014)
  • Publish a list of assessments that meet the Year R baseline criteria (March 2015)
  • Decide how Year R baseline results will be communicated to parents and to Ofsted (March 2015)
  • Make available to schools a full set of sample materials including tests and mark schemes for all KS1 and KS2 tests (September 2015)
  • Complete work with Ofsted and teachers to improve KS1 moderation (September 2015)
  • Provide further information to enable teachers to assess pupils at the end of KS1 and KS2 who are ‘working above the P-scales but below the level of the test’ (September 2015)
  • Decide whether to move to external moderation of P-scale teacher assessment (September 2015)
  • Agree with stakeholders how to compare schools’ performance on a suite of assessment outcomes published in a standard format (September 2015)
  • Publish all final test frameworks (Autumn 2015)
  • Introduce new requirements for schools to publish a suite of assessment outcomes in a standard format (Spring 2016)
  • Panels of teachers use level descriptors to set the standards on the new tests following their first administration in May 2016 (Summer 2016)
  • Define what counts as sufficient progress from the Year R baseline to end KS1 and end KS2 respectively (Summer 2016)

Conclusion

Overall the response is rather more cogent and coherent than the original consultation document, though there are several inconsistencies and many sins of omission.

Drawing together the key issues emerging from the commentary above, I would highlight twelve key points:

  • The declared aims express the policy direction clumsily and without conviction. The ultimate aspirations are universal ‘secondary readiness’ (though expressed in broader terms), ‘no child left behind’ and ‘every child fulfilling their potential’ but there is no real effort to reconcile these potentially conflicting notions into a consensual vision of what primary education is for. Moreover, an inconvenient truth lurks behind these statements. By raising expectations so significantly – 4b equivalent rather than 4c; 85% over the attainment threshold rather than 65%; ‘sufficient progress’ rather than median progress and across three domains rather than one – there will be much more failure in the short to medium term. More learners will fall behind and fall short of the thresholds; many more schools are likely to undershoot the floor standards. It may also prove harder for some learners to demonstrate their potential. It might have been better to acknowledge this reality and to frame the vision in terms of creating the conditions necessary for subsequent progress towards the ultimate aspirations.
  • Younger children are increasingly caught in the crossbeam from the twin searchlights of assessment and accountability. HMCI’s subsequent intervention has raised the stakes still further. This creates obvious tensions in the sector which can be traced back to disagreements over the respective purposes of early years and primary provision and how they relate to each other. (HMCI’s notion of ‘school readiness’ is no doubt as narrow to early years practitioners as ‘secondary readiness’ is to primary educators.) But this is not just a theoretical point. Additional demands for focused inspection, moderation and publication of outcomes all carry a significant price tag. It must be open to question whether the sheer weight of assessment activity is optimal and delivers value for money. Should a radical future Government – probably with a cost-cutting remit – have rationalisation in mind?
  • Giving schools the freedom to choose from a range of Year R baseline assessment tools also seems inherently inefficient and flies in the face of the clear majority of consultation responses. We are told nothing of the perceived quality of existing services, none of which can – by definition – satisfy these new expectations without significant adjustment. It will not be straightforward to construct a universal and child-friendly instrument that is a sufficiently strong predictor of Level 4b-equivalent performance in KS2 reading, writing and maths assessments undertaken seven years later. Moreover, there will be a strong temptation for the Government to pitch the baseline higher than current expectations, so matching the realignment at the other end of the process. Making the Reception baseline assessment optional – albeit with strings attached – seems rather half-hearted, almost an insurance against failure. Effective (and expensive) moderation may protect against widespread gaming, but the risk remains that Reception teachers will be even more predisposed to prioritise universal school readiness over stretching their more precocious four-year-olds.
  • The task of designing an effective test for all levels of prior attainment at the end of key stage 2 is equally fraught with difficulty. The P-scales will be retained (in their existing format, unaligned with the revised national curriculum) for learners with special needs working below the equivalent of what is currently level 1. There will also be undefined provision ‘for those working above the level of the P-scales but below the level of the test’, even though the draft test development frameworks say:

‘All eligible children who are registered at maintained schools, special schools, or academies (including free schools) in England and are at the end of key stage 2 will be required to take the…test, unless they have taken it in the past.’

And this applies to all learners other than those in the exempted categories set out in the ARA booklets. The draft specifications add that test questions will be placed in order of difficulty. I have grave difficulty in understanding how such assessments can be optimal for high attainers and fear that this is bad assessment practice.

  • On top of this there is the worrying statement in the test development frameworks that scaled scores will be ‘truncated’ at the extremes of the distribution. This does not fill one with confidence that the highest and lowest attainers will have their test performance properly recognised and reported.
  • The necessary invention of ‘performance descriptors’ removes any lingering illusion that academies and free schools have significant freedom to depart from the national curriculum, at least as far as the core subjects are concerned. It is hard to understand why these descriptors could not have been published alongside the programmes of study within the national curriculum.
  • The ‘performance descriptors’ in the draft test specifications carry all sorts of health warnings that they are inappropriate for teacher assessment because they cover only material that can be assessed in a written test. But there will be significant overlap between the test and teacher assessment versions, particularly in those that describe threshold performance at the equivalent of level 4b. For we know now that there will also be hierarchies of performance descriptors – aka level descriptors – for KS1 teacher assessment in reading, writing, speaking and listening and maths, as well as for KS2 teacher assessment in writing. Levels were so problematic that it has been necessary to reinvent them!
  • What with scaled scores, average scaled scores, threshold performance descriptors and ‘levelled’ performance descriptors, schools face an uphill battle in convincing parents that the reporting of test outcomes under this system will be simpler and more understandable. At the end of KS2 they will receive 16 different assessments in four different formats. (Remember that parents will also need to cope with schools’ approaches to internal assessment, which may or may not align with these arrangements.)
  • We are told about new requirements to be placed on schools to publish assessment outcomes, but the description is infuriatingly vague. We do not know whether certain requirements apply to both KS1 and 2, and/or to both tests and teacher assessment. The reference to ‘the percentage of pupils who achieve a high score in all areas at the end of key stage 2’ is additionally vague because it is unclear whether it applies to performance in each assessment, or across all assessments combined. Nor is the pitch of the high score explained. This is the only reference to high attainers in the entire response and it raises more questions than it answers.
  • We also have negligible information about what will appear in the school performance tables and what will be relegated to the accompanying data portal. We know there is an intention to compare schools’ performance on the measures they are required to publish and that is all. Much of the further detail in the original consultation document may or may not have fallen by the wayside.
  • The new floor standards have all the characteristics of a last-minute compromise hastily stitched together. The consultation document was explicit that floor standards would:

‘…focus on threshold attainment measures and value-added progress measures’

It anticipated that the progress measure would require average scaled scores of between 98.5 and 99.0, adding:

‘Our modelling suggests that a progress measure set at this level, combined with the 85% threshold attainment measure, would result in a similar number of schools falling below the floor as at present.’

But the analysis of responses fails to report at all on the question ‘Do you have any comments about these proposals for the Department’s floor standards?’ It does include the response to a subsequent question about including an average point score attainment measure in the floor standards (39% of respondents were in favour and 31% against). But the main text does not discuss this option at all. It begins by stating that both an attainment and a progress dimension are in play, but then describes a system in which schools can choose one or the other. There is no attempt to quantify ‘sufficient progress’ and no revised modelling of the impact of standards set at this level. We are left with the suspicion that a very significant proportion of schools will not exceed the floor. There is also a potential perverse incentive for schools with very high attaining intakes not to bother about progress at all.

  • Finally, the ‘to do’ list is substantial. Several of those with the tightest deadlines ought really to have been completed ahead of the consultation response, especially given the significant delay. There is nothing about the interaction between this work programme and that proposed by NAHT’s Commission on Assessment. Much of this work would need to take place on the other side of a General Election, while the lead time for assessing KS2 progress against a Year R baseline is a full nine years. This makes the project as a whole particularly vulnerable to the whims of future governments.

I’m struggling to find the right description for the overall package. I don’t think it’s quite substantial or messy enough to count as a dog’s breakfast. But, like a poorly airbrushed portrait, it flatters to deceive. Seen from a distance it appears convincing but, on closer inspection, there are too many wrinkles that have not been properly smoothed out.

GP

April 2014


4 thoughts on “Unpacking the Primary Assessment and Accountability Reforms”

  1. Excellent and thorough as usual.
    One note: Annabel Burns of the DfE told an Optimus conference that the P-levels would be updated to reflect the new curriculum content. No further detail, but definitely stated… So watch that space, maybe?

  2. That’s interesting Michael. I’d noted from a 2013 PQ that they were considering ‘refreshing the P-scales’ (Col344W) but then they republished the P-scales guidance unchanged, cross-referencing to the 2013 National Curriculum Orders (see final para of covering page). I inferred that they’d had second thoughts.

    On a related matter, was any further light thrown on the question which learners will be exempted from the new tests? Conceivably they could exempt all below the equivalent of L3, which would narrow the range somewhat and reduce the difficulties faced by the test developers (though even accommodating L3 equivalent to L6 equivalent will be a tall order in my opinion). I’m really surprised there hasn’t been more concern expressed about this aspect of the plan.

    GP
