This post examines the comparative performance of high achievers in recent international comparison studies, principally the 2011 TIMSS and PIRLS assessments.
More specifically, it compares:
- The proportion of learners in selected countries who achieve the highest ‘advanced’ benchmarks in TIMSS 2011 maths and science assessments at Grades 4 and 8 respectively and in the PIRLS 2011 reading assessment at Grade 4;
- How selected countries have performed on each of these measures over the period in which TIMSS and PIRLS have been administered, identifying positive and negative trends and drawing inferences about current relative priorities in different countries;
- Selected countries’ overall ranking on each of these TIMSS and PIRLS assessments (based on the average score achieved across all learners undertaking the appropriate assessment), contrasted with their ranking for the proportion of learners achieving the highest ‘advanced’ and the lowest ‘low’ benchmarks, considering the associated implications for their national education policies; and
- The results from TIMSS and PIRLS 2011 with those from PISA 2009, exploring whether these different studies provide a consistent picture of countries’ relative strength in educating their highest achievers and, to the extent that there are inconsistencies, how those might be explained.
The post also reviews recent publications and speeches about England’s performance in TIMSS and PIRLS 2011, with a particular focus on the aspects set out above and the high achievers’ perspective. Finally, it draws together some significant recent contributions which ask interesting questions about the nature of these assessments and their outcomes.
This is therefore a companion piece to my December 2010 post ‘PISA 2009: International Comparisons of Gifted High Achievers’ Performance’.
There is limited reference within it to the relative strengths and weaknesses of international comparisons studies of this kind. Some time ago I published the first part of a separate post on that subject.
For the purposes of this publication my pragmatic assumption is that, while such studies have significant shortcomings and should on no account be used as the sole source of evidence for educational policy-making, they do provide useful steers which, when combined with other sources of quantitative and qualitative evidence, can offer a useful guide to current strengths and weaknesses and potential future priorities.
This is therefore a ‘health warning’: some of my conclusions below do need to be treated with a degree of caution. They are broad indicators rather than incontrovertible statements of fact.
History and Development of TIMSS and PIRLS Assessments
The Trends in International Mathematics and Science Study (TIMSS) has provided assessments of national achievement in these subjects since 1995, focused principally on two cohorts: Grade 4 (age 9/10) and Grade 8 (age 13/14).
Its companion exercise, the Progress in International Reading Literacy Study (PIRLS), was introduced in 2001 to assess reading comprehension at Grade 4.
There is a parallel TIMSS Advanced assessment of maths and physics achievement in the final year of secondary school. This was undertaken in 1995 and 2008 and is scheduled for 2015. A less difficult PrePIRLS study, providing assessment for those not yet reading confidently, was introduced for the first time in 2011.
The main TIMSS assessment has been repeated on a four-year cycle and PIRLS on a five-year cycle, making 2011 the first year in which both studies were conducted together.
- In 1995, TIMSS was undertaken for the first time, featuring assessment at five different Grades (3, 4, 7, 8 and the final year of secondary education through the Advanced study). Altogether there were forty-five participating countries.
- In 1999 TIMSS was repeated at Grade 8 only, with thirty-eight countries participating, twenty-six of them participants in the original 1995 cycle.
- In 2003 the number of TIMSS participants increased to forty-nine, all but one of which undertook the Grade 8 assessments (though only twenty-six completed the Grade 4 assessments).
TIMSS 2011 lists sixty-three and PIRLS 2011 lists forty-eight participating countries. (I have excluded from these figures those countries and parts of countries participating solely for benchmarking purposes.) Altogether, around 600,000 learners participated in TIMSS and about half as many in PIRLS.
Assessment Frameworks and Benchmarks
The separate assessment frameworks for Maths and Science within TIMSS are similarly constructed. Each has:
- A Grade-specific content dimension specifying the subject matter to be assessed eg algebra, physics, geometry, chemistry; and
- A cognitive dimension, capturing the knowing, applying and reasoning processes that are deployed by the learner.
The assessment framework for reading within PIRLS is slightly different. The focus of the assessment is described as ‘reading literacy’, defined thus:
‘The ability to understand and use those written language forms required by society and/or valued by the individual. Young readers can construct meaning from a variety of texts. They read to learn, to participate in communities of readers in school and everyday life, and for enjoyment.’
Two principal aspects are assessed:
- Two purposes for reading – for literary experience and to acquire and use information; and
- Four comprehension processes – focus on and retrieve explicitly stated information; make straightforward inferences; interpret and integrate ideas and information; and examine and evaluate content, language and textual elements.
Both TIMSS and PIRLS use achievement scales with a range from 0 to 1000, though most learners score between 300 and 700. The midpoint of the scales, 500, remains constant across different cycles, so trend-related data is relatively reliable.
Four points on this scale are specified as international benchmarks: Advanced at 625, High at 550, Intermediate at 475 and Low at 400. These benchmarks are defined differently for each subject and Grade.
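The four cut points can be expressed as a simple lookup. The sketch below is illustrative only: the function names and the sample scores are my own assumptions, and the real studies estimate these percentages from weighted samples rather than from a plain list of scores. It treats the benchmarks as cumulative, so a learner reaching Advanced has also reached High, Intermediate and Low.

```python
from bisect import bisect_right
from typing import Optional, Sequence

# International benchmark cut points on the TIMSS/PIRLS scale, as listed above.
BENCHMARKS = [(400, "Low"), (475, "Intermediate"), (550, "High"), (625, "Advanced")]


def highest_benchmark(score: float) -> Optional[str]:
    """Return the highest benchmark a scale score reaches, or None if below Low."""
    cut_points = [cut for cut, _ in BENCHMARKS]
    index = bisect_right(cut_points, score)  # number of cut points at or below score
    return BENCHMARKS[index - 1][1] if index else None


def percent_reaching(scores: Sequence[float], cut: float) -> float:
    """Percentage of scores at or above a cut point (unweighted, for illustration)."""
    return 100 * sum(score >= cut for score in scores) / len(scores)


# Hypothetical scale scores, invented purely to exercise the functions.
sample_scores = [380, 410, 500, 560, 630, 700, 455, 610]
for score in sample_scores:
    print(score, highest_benchmark(score))
print("Percent reaching Advanced:", percent_reaching(sample_scores, 625))
```

Every percentage quoted against the Advanced benchmark later in this post is a figure of the second kind: the share of a country's assessed learners scoring 625 or above.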
The Advanced benchmark definitions are as follows:
- Maths Grade 4:
‘Students can apply their understanding and knowledge in a variety of relatively complex situations and explain their reasoning. They can solve a variety of multi-step word problems involving whole numbers, including proportions. Students at this level show an increasing understanding of fractions and decimals. Students can apply geometric knowledge of a range of two- and three-dimensional shapes in a variety of situations. They can draw a conclusion from data in a table and justify their conclusion.’
- Science Grade 4:
‘Students apply knowledge and understanding of scientific processes and relationships and show some knowledge of the process of scientific inquiry. Students communicate their understanding of characteristics and life processes of organisms, reproduction and development, ecosystems and organisms’ interactions with the environment, and factors relating to human health. They demonstrate understanding of properties of light and relationships among physical properties of materials, apply and communicate their understanding of electricity and energy in practical contexts, and demonstrate an understanding of magnetic and gravitational forces and motion. Students communicate their understanding of the solar system and of Earth’s structure, physical characteristics, resources, processes, cycles, and history. They have a beginning ability to interpret results in the context of a simple experiment, reason and draw conclusions from descriptions and diagrams, and evaluate and support an argument.’
- Reading Grade 4:
‘When reading Literary Texts, students can:
- Integrate ideas and evidence across a text to appreciate overall themes
- Interpret story events and character actions to provide reasons, motivations, feelings, and character traits with full text-based support
When reading Informational Texts, students can:
- Distinguish and interpret complex information from different parts of text, and provide full text-based support
- Integrate information across a text to provide explanations, interpret significance, and sequence activities
- Evaluate visual and textual features to explain their function.’
- Maths Grade 8:
‘Students can reason with information, draw conclusions, make generalizations, and solve linear equations. Students can solve a variety of fraction, proportion, and percent problems and justify their conclusions. Students can express generalisations algebraically and model situations. They can solve a variety of problems involving equations, formulas, and functions. Students can reason with geometric figures to solve problems. Students can reason with data from several sources or unfamiliar representations to solve multi-step problems.’
- Science Grade 8:
‘Students communicate an understanding of complex and abstract concepts in biology, chemistry, physics, and earth science. Students demonstrate some conceptual knowledge about cells and the characteristics, classification, and life processes of organisms. They communicate an understanding of the complexity of ecosystems and adaptations of organisms, and apply an understanding of life cycles and heredity. Students also communicate an understanding of the structure of matter and physical and chemical properties and changes and apply knowledge of forces, pressure, motion, sound, and light. They reason about electrical circuits and properties of magnets. Students apply knowledge and communicate understanding of the solar system and Earth’s processes, structures, and physical features. They understand basic features of scientific investigation. They also combine information from several sources to solve problems and draw conclusions, and they provide written explanations to communicate scientific knowledge.’
PISA Frameworks and Benchmarks
PISA is a triennial study of 15 year-olds’ performance (so Grade 9) also in maths, science and reading. A different subject is the main focus in each cycle – in 2009 it was reading. Sixty-five countries took part in PISA 2009.
There is significant overlap with TIMSS/PIRLS participants – some 40 countries involved in TIMSS 2011 also undertook PISA 2009 – but a significant proportion of countries undertake one or the other.
My previous post sets out the definitions of Reading, Mathematical and Scientific Literacy used in the PISA 2009 study and I will not repeat them here.
PISA divides student performance into six different proficiency levels. The highest (Level 6) is defined in terms of the tasks which learners successfully perform, or the skills and competences they must display.
It is interesting to compare the emphases in these descriptions with those in the parallel TIMSS/PIRLS definitions above.
- In reading, Level 6 tasks:
‘typically require the reader to make multiple inferences, comparisons and contrasts that are both detailed and precise. They require demonstration of a full and detailed understanding of one or more texts and may involve integrating information from more than one text. Tasks may require the reader to deal with unfamiliar ideas, in the presence of prominent competing information, and to generate abstract categories for interpretations. Reflect and evaluate tasks may require the reader to hypothesise about or critically evaluate a complex text on an unfamiliar topic, taking into account multiple criteria or perspectives, and applying sophisticated understandings from beyond the text. A salient condition for access and retrieve tasks at this level is precision of analysis and fine attention to detail that is inconspicuous in the texts.’
- In maths Level 6 learners can:
‘conceptualise, generalise, and utilise information based on their investigations and modelling of complex problem situations. They can link different information sources and representations and flexibly translate among them. Students at this level are capable of advanced mathematical thinking and reasoning. These students can apply this insight and understandings along with a mastery of symbolic and formal mathematical operations and relationships to develop new approaches and strategies for attacking novel situations. Students at this level can formulate and precisely communicate their actions and reflections regarding their findings, interpretations, arguments, and the appropriateness of these to the original situations.’
- And in science, they can:
‘consistently identify, explain and apply scientific knowledge and knowledge about science in a variety of complex life situations. They can link different information sources and explanations and use evidence from those sources to justify decisions. They clearly and consistently demonstrate advanced scientific thinking and reasoning, and they use their scientific understanding in support of solutions to unfamiliar scientific and technological situations. Students at this level can use scientific knowledge and develop arguments in support of recommendations and decisions that centre on personal, social or global situations.’
This 2011 IPPR Report on Benchmarking the English School System explains the somewhat different approaches of these two assessment suites:
‘PISA puts less emphasis on whether a student can reproduce content, and focuses more on their ability to apply knowledge to solve tasks…
…TIMSS…focuses on curriculum and as a result tends to test pupil’s content knowledge rather than their ability to apply it…
…PIRLS…assesses…knowledge and content of the curriculum.’
In a recent paper on the PISA 2009 results, Jerrim marks the distinction between PISA and TIMSS in slightly different terms:
‘Whereas TIMSS focuses on children’s ability to meet an internationally agreed curriculum, PISA examines functional ability – how well young people can use the skills in “real life” situations. The format of the test items also varies, including the extent to which they rely on questions that are “multiple choice”. Yet despite these differences, the two surveys summarise children’s achievement in similar ways…
…This results in a measure of children’s achievement that (in both studies) has a mean of 500 and a standard deviation of 100. However, even though the two surveys appear (at face value) to share the same scale, figures are not directly comparable (eg a mean score of 500 in PISA is not the same as a mean score of 500 in TIMSS). This is because the two surveys contain a different pool of countries upon which these achievement scores are based…Hence one is not able to directly compare results in these two surveys (and change over time) by simply looking at the raw scores.’
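Jerrim's point about the shared scale can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that a scale is fixed by setting the mean of a reference pool to 500 and its standard deviation to 100. The real studies use far more elaborate scaling models, and the function name and all the numbers below are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def rescale(score: float, pool_scores: Sequence[float],
            target_mean: float = 500.0, target_sd: float = 100.0) -> float:
    """Express `score` on a scale where `pool_scores` has the target mean and SD."""
    pool_mean = mean(pool_scores)
    pool_sd = pstdev(pool_scores)
    return target_mean + target_sd * (score - pool_mean) / pool_sd


# Hypothetical raw results for two different pools of participants.
pool_a = [40, 50, 60]        # one study's reference pool
pool_b = [30, 40, 50, 60]    # another study's pool, including a weaker participant

# The same underlying performance (a raw score of 50) in both studies.
print(round(rescale(50, pool_a), 1))
print(round(rescale(50, pool_b), 1))
```

The same raw performance sits exactly at 500 against the first pool but well above 500 against the second, simply because the second pool includes a lower-scoring participant. That is the sense in which a 500 in one study is not a 500 in the other.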
With these similarities and distinctions in mind, let us turn to analysis of the data.
High Performance At Advanced Benchmarks in TIMSS and PIRLS 2011
Table One below shows the top ten countries in each of the five TIMSS and PIRLS assessments at the Advanced benchmark of 625: Maths Grade 4, Science Grade 4, Reading Grade 4, Maths Grade 8 and Science Grade 8. I have also included some countries of interest that fell outside one or more of the ‘top tens’.
|Rank||Maths 4||%||Science 4||%||Reading 4||%||Maths 8||%||Science 8||%|
|3||Hong Kong||37||Finland||20||N Ireland||19||Korea||47||Korea||20|
|6||N Ireland||24||US||15||Hong Kong||18||Russia||14||England||14|
|Hong Kong||9||Hong Kong||9|
|N Zealand||4||N Zealand||5||N Zealand||5||N Zealand||9|
Table One: Top Ten Countries at Advanced Benchmarks, TIMSS and PIRLS 2011
Several important points can be drawn from this initial analysis.
- Singapore is by some margin the most successful country in terms of the percentage of its pupils achieving the Advanced benchmark. It tops the rankings in all but Maths Grade 8, where it is a close second to Taiwan. In the remaining assessments it has a 4 or 5 percentage point lead over its nearest rival, except in Science Grade 8, where its lead is an astonishing 16 percentage points.
- But the proportion of Singaporean learners achieving the Advanced benchmark varies significantly, from just under a quarter in Reading to just under half in Maths Grade 8. Singapore is much closer to the PIRLS median (+16%) in Reading so, arguably, that is a relative weakness at this level.
- Other outstanding performers include: Korea and Japan (apart from Reading, which they did not undertake); Hong Kong (apart from Science at Grades 4 and 8, where it was outside the top 10); Taiwan (though it was outside the top 10 for Reading); Finland (though it was let down in the Maths Grade 8 assessment); Russia; and England.
- The top-ranked countries in TIMSS – Singapore, Korea, Hong Kong, Taiwan, Japan – typically secure a significantly higher proportion of Advanced level achievers in Maths than in Science. The reverse is broadly true in a second group of countries including Finland, the US, Russia and New Zealand. England and Australia are significantly atypical, in that Maths leads the way at Grade 4 while Science is in the ascendant at Grade 8.
- When PIRLS is factored in, it is clear that a group of countries including Finland, Russia, the US, New Zealand and Israel secure larger proportions at the Advanced benchmark in Reading than in both Maths and Science. The same is almost true of England, though the percentages are equal for Reading and Maths at Grade 4. Unsurprisingly, the outstanding Asian TIMSS performers tend to achieve a significantly lower level in Reading. The relative reading difficulty of native languages is bound to have an impact here.
- Interestingly, England outscored or equalled Finland on all but one assessment (Science Grade 4). It exceeded the median comfortably on all five assessments: Maths Grade 4 (+14%); Science Grade 4 (+6%), Reading (+10%); Maths Grade 8 (+5%); and Science Grade 8 (+10%). (It was however outscored by Northern Ireland on Maths Grade 4 and Reading.)
- On the basis of these differentials, Science Grade 4 and Maths Grade 8 are England’s areas of relative weakness amongst high achievers though, if the analysis is undertaken on the basis of the gap between England and the world leader for each assessment, the incontrovertible priority is Maths Grade 8, where there is a 41 percentage point chasm between England and Taiwan.
Trends Over Time in Performance Against TIMSS and PIRLS Advanced Benchmarks
Tables 2A to 2E below show how the percentage achieving the Advanced benchmark has changed over time in each country within the top 10 in each assessment in 2011 (excluding those for which there is insufficient data).
Where the percentage has declined between cycles of the assessment, the figure is emboldened. Each table also shows for each country the percentage change between the first assessment and that undertaken in 2011.
Table 2A: TIMSS Maths Grade 4 – Trend in Percentage Achieving Advanced Benchmark
Table 2B: TIMSS Science Grade 4 – Trend in Percentage Achieving Advanced Benchmark
|Country||2001||2006||2011||Improvement Since 2001|
Table 2C: PIRLS Reading – Trend in Percentage Achieving Advanced Benchmark
|Country||1995||1999||2003||2007||2011||Improvement Since 1995|
Table 2D: TIMSS Maths Grade 8 – Trend in Percentage Achieving Advanced Benchmark
|Country||1995||1999||2003||2007||2011||Improvement Since 1995|
Table 2E: TIMSS Science Grade 8 – Trend in Percentage Achieving Advanced Benchmark
This trend-based data throws a different complexion on the performance of several leading countries.
- Though Singapore has managed impressive double-digit improvements in four of the five assessments, its improvement in Grade 4 Maths is far less spectacular, at a mere 5%. Moreover, Singapore’s performance actually declined on both Grade 8 Maths and Science in 2007, though it has reversed that trend in 2011 (and quite spectacularly so in Science).
- The rate of improvement in some other countries has exceeded that of Singapore. At Grade 4 in Maths, Hong Kong, Taiwan, Korea, England and Japan are all improving at a significantly faster rate. The same is true of Korea, Taiwan and Hong Kong at Grade 8. Singapore has comfortably the fastest rate of improvement in Grade 4 and Grade 8 Science. In Reading though, Russia and Hong Kong outscore Singapore on this metric.
- There have also been some significant declines in performance over the period that these assessments have been conducted. Both England and the United States have suffered a decline of four percentage points in Grade 4 Science, while Taiwan’s Grade 8 Science result has fallen by three percentage points and Bulgaria’s Reading score by six percentage points.
- Within TIMSS, most of the leading countries – including Korea, Hong Kong, Taiwan, England, Russia and the US – have improved significantly more on Maths than they have on Science. However, the reverse is true in Singapore (perhaps suggesting that Singapore science is a potentially stronger export than Singapore maths). Japan is also atypical in that there has been an improvement in Maths at Grade 4 but in all other assessments there has been no improvement or a slight decline.
- Where countries have achieved improvements within TIMSS assessments, these are typically stronger at Grade 4 than Grade 8, though the reverse is true in Maths in Singapore and Korea, while both Russia and the US present a more balanced scorecard in this respect.
- When PIRLS is factored in, one notices that improvements in Reading tend to be less strong than in each country’s fastest improving TIMSS subject but stronger than in its slower improving TIMSS subject. Russia is the obvious outlier, with outstanding improvement in Reading relative to both Maths and Science. In England the decline in Reading is similar to that in Science.
- Considered from this perspective, Singapore should be prioritising Grade 4 Maths, while Korea and Hong Kong should concentrate on Grade 8 Science. The US must look at Grade 4 Science and, to a lesser extent, Grade 8 Science. England’s priorities would also be Grade 4 and Grade 8 Science plus Reading. Maths is strong at Grade 4, though relatively less so at Grade 8.
Overall Rankings Compared With Rankings for Achievement of Advanced and Low Benchmarks
The next set of Tables examines how countries’ rankings differ for the overall assessment (based on the average score of learners from that country), the percentage achieving the highest ‘Advanced’ benchmark and the percentage achieving the lowest ‘Low’ benchmark.
This provides an indicator of whether each country’s highest achievers are outperforming the average achievers in comparative terms – and to what extent (if at all) the lowest achievers are lagging behind.
To make this manageable I have again confined the analysis to the top ten countries in each assessment against the ‘Advanced’ Benchmark.
Table 3A: TIMSS Grade 4 Maths – Comparison of Rank for Achievement of Advanced Benchmark, Overall and for Achievement of Low Benchmark
Table 3B: TIMSS Grade 4 Science – Comparison of Rank for Achievement of Advanced Benchmark, Overall and for Achievement of Low Benchmark
Table 3C: PIRLS Reading – Comparison of Rank for Achievement of Advanced Benchmark, Overall and for Achievement of Low Benchmark
Table 3D: TIMSS Grade 8 Maths – Comparison of Rank for Achievement of Advanced Benchmark, Overall and for Achievement of Low Benchmark
Table 3E: TIMSS Grade 8 Science – Comparison of Rank for Achievement of Advanced Benchmark, Overall and for Achievement of Low Benchmark
These Tables show that, particularly at the top end of the distribution, there is a very close correlation between ranking on the basis of average score and on the basis of the proportion achieving the Advanced benchmark.
There is also a fairly close correlation with the proportion achieving the Low benchmark, but this is not quite so pronounced and there are some outliers with relatively ‘long tails’ of low achievement.
- In Maths at Grade 4 the top five countries get very high percentages of pupils past the Low Benchmark, but the next five are relatively less successful and, of these, England is least successful. It has a relatively ‘long tail’, while its highest achievers do comparatively better than the overall measure. The latter is also true of Russia and the United States, but the reverse is the case in Finland. This is arguably evidence that England, Russia and the US should prioritise the lower end of the distribution while Finland should pay more attention to the top end.
- In Maths at Grade 8 the pattern is broadly similar though, with the exception of Israel, the ‘long tail’ for the countries just below the top rank is not quite so pronounced. This might suggest that earlier efforts to bring younger low achievers up to a higher standard – and to narrow national achievement gaps – have been at least partly successful.
- In Science at Grade 4 these variations are once again more substantial, while tending to narrow at Grade 8, so giving a similar pattern. Singapore’s rankings suggest relatively greater priority is required at the lower end of the achievement distribution. Romania is clearly the worst in this respect, though England and Hungary are not too far behind. The ‘ranking gap’ in England is broadly similar for Maths and Science at Grade 4 and at Grade 8 respectively.
- In Science at Grade 8, Israel again has the longest tail, comparable with the situation in Maths Grade 8. Finland is again remarkable for bucking the general trend, suggesting perhaps that it is too much focused on lifting everyone up to a relatively high standard and too little focused on stretching those at the top.
- In Reading there is relatively more volatility throughout the table, at the top as much as the bottom of the top ten. Russia, Finland and the United States have relatively ‘flat profiles’, while Hong Kong assumes the ‘reverse profile’ more typically associated with Finland in respect of Maths and Science. Several countries have a pronounced tail, including Singapore, Northern Ireland, England, Ireland, Israel and New Zealand. The latter two have the biggest issue in this respect. There is clearly an issue here for Singapore to address.
Broad Comparisons Between TIMSS and PIRLS 2011 and PISA 2009
Finally in this data analysis section, it is worthwhile to compare the top-ranking countries in terms of the proportions achieving the most demanding benchmarks, to identify broad similarities and differences.
Of course the results are not strictly comparable because the assessments are substantively different, the assessed learners are older on PISA, and the cohort of countries competing with each other is not the same.
Nevertheless, the exercise is instructive.
For the purpose of the comparison I have used the Grade 8 Maths and Science assessments (because the learners taking them are almost the same age as those undertaking PISA), but I have also included PIRLS, as the only comparison available for reading.
On this occasion, however, I have included the top 20 ranked countries in each assessment.
Table 4: Top 20 Rankings for Highest Benchmark in TIMSS, PIRLS and PISA
In PISA results are reported for the UK as a whole, but the figures for Level 6 achievement in England are almost identical (only in maths is there a noticeable difference, with England’s result 0.1% lower than that reported for the UK).
England is ranked 29th on PISA Maths, the only column in the table in which neither England nor the UK appears.
The rankings show that a handful of the ‘usual suspects’ are highly placed on both TIMSS/PIRLS and PISA. Singapore is ubiquitous.
Some countries perform relatively better on the PISA side of the equation – New Zealand is an obvious example – while England is a comparatively better performer on TIMSS/PIRLS, as is the United States.
It is interesting to hypothesise whether these differences reflect different strengths in national education systems. Other things being equal, do those countries performing best on PISA pay relatively more attention to their high achievers’ problem-solving and the application of content knowledge? Do those performing better on TIMSS/PIRLS emphasise content knowledge above ‘real life’ problem-solving?
Perhaps high-achieving learners in countries more successful in PISA are simply more familiar with assessment instruments that feature such problem-solving. Or perhaps much of the difference is explainable by more mundane variations in the assessment process. There are likely to be several different factors in play.
The countries that appear most frequently on these lists are amongst the global leaders in educating high-achieving learners. Whether there is a significant correlation with the scope and efficacy of their gifted education programmes is less certain.
We know from previous posts on this Blog that Singapore, Korea and Hong Kong have some of the best developed gifted education programmes in the world. Israel also falls into this category, as did England in the period up to 2011.
It would be a reasonable hypothesis that their investment at the top end of the ability range is having a positive effect in terms of educational outcomes as measured by these assessments, but I am not aware of any research that attempts to establish such causality.
And it is important to note that the percentages achieving the highest benchmarks in PISA/TIMSS and PIRLS vastly exceed the proportions admitted into leading countries’ gifted education programmes whereas, in England, the proportion achieving the highest benchmarks is significantly lower than the percentage in the former national gifted education programme.
|Assessment||Leading country||% at highest benchmark in leading country||% at highest benchmark in England||% at highest benchmark: average for assessment*|
|TIMSS Maths G4||Singapore||43||18||4|
|TIMSS Maths G8||Taiwan||49||8||5|
|TIMSS Science G4||Singapore||33||11||3|
|TIMSS Science G8||Singapore||40||14||4|
*Averages for PISA are OECD countries only
Table 5: Percentage Achieving Highest Benchmark in TIMSS, PIRLS and PISA – Comparison of Leading Country and England
Table 5 shows that the gaps between England and the leading country can be highly variable between assessments.
- In TIMSS Grade 4 Maths, Singapore achieves more than twice as many as England at the highest benchmark but, at Grade 8, Taiwan manages over six times as many.
- In PISA Maths the difference between Shanghai and England is enormous – over 15 times as many Shanghai learners achieve the benchmark.
- In TIMSS Grade 4 Science, Singapore has exactly three times as many at the highest benchmark while, at Grade 8, it has slightly less than that.
- In PISA Science, slightly more than twice as many Singaporean learners achieve the highest benchmark.
- In PIRLS reading the difference is much smaller, with Singapore only 25% ahead; but
- In PISA Reading, the gap between England and New Zealand is once again close to a multiple of three.
So, while the majority of assessments show the international leader having a two- or threefold greater proportion achieving the highest benchmark, there are three conspicuous outliers: TIMSS Grade 8 Maths and especially PISA Maths (where England performs significantly worse); and PIRLS Reading (where England scores significantly better).
At the same time though, England is significantly ahead of the average for each assessment, with the sole exception of PISA maths.
While there is a significant gap between England and the world’s leaders on all these assessments, its performance is comparatively respectable in all but PISA Maths/TIMSS Grade 8 Maths. This suggests a particular problem with secondary maths for the highest achievers in England.
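The multiples quoted above for the TIMSS assessments can be checked directly from the rows shown in Table 5. The short sketch below does so, using only those four rows; the PISA and PIRLS figures are not reproduced because they do not appear in the table as shown. The `gap` helper is a hypothetical name of my own. It reports both the ratio between the two percentages and the difference in percentage points, since the two measures can tell rather different stories.

```python
from typing import Tuple

# (assessment, leading country, % in leading country, % in England), from Table 5.
TABLE_5 = [
    ("TIMSS Maths G4", "Singapore", 43, 18),
    ("TIMSS Maths G8", "Taiwan", 49, 8),
    ("TIMSS Science G4", "Singapore", 33, 11),
    ("TIMSS Science G8", "Singapore", 40, 14),
]


def gap(leader_pct: float, england_pct: float) -> Tuple[float, float]:
    """Return (ratio, percentage-point difference) between the leader and England."""
    return leader_pct / england_pct, leader_pct - england_pct


for assessment, leader, leader_pct, england_pct in TABLE_5:
    ratio, points = gap(leader_pct, england_pct)
    print(f"{assessment}: {leader} has {ratio:.1f} times England's share "
          f"({points} percentage points more)")
```

On this evidence Grade 8 Maths stands out on the ratio measure, with Taiwan's share more than six times England's, whereas the gap in percentage points at Grade 8 Science (26) is of much the same size as that at Grade 4 Maths (25).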
Domestic Analysis of England’s Performance in TIMSS and PIRLS 2011
- In all four TIMSS assessments, the attainment difference between the highest and lowest performing learners was just short of 300 TIMSS scale points.
- The best-performing countries typically have similar or smaller ranges of attainment, though there were exceptions (Taiwan for Grade 8 maths and Singapore for Grade 4 and Grade 8 Science). The variation tends to be greater for those below average than for those above.
- Whereas at Grade 4 in Maths England’s performance can be seen as at the low end of the highest performing countries, at Grade 8 it ‘has more in common with the performance of the majority of countries than with the highest performing countries’.
- At Grade 4 in Science ‘England is in a group of countries with relatively low proportions of pupils at the advanced benchmark’ and, despite the good showing in the rankings, the profile at Grade 8 ‘differs from those of the highest scoring countries’.
- In the PIRLS Reading assessment ‘the most able readers [in England] were among the best readers in the survey’. They reached levels similar to Singapore’s high achievers and ‘higher than the most able readers in the three top performing countries (Hong Kong, the Russian Federation and Finland)’.
The TES ran a story in which Andreas Schleicher of the OECD – the man responsible for PISA – took an idiosyncratic position, arguing that good results in TIMSS and PIRLS would actually be bad news, because:
‘Pisa – which suggests a recent decline in England’s international standing – tests children at an older age than Timss and Pirls. Mr Schleicher claimed that a good performance from England in the latter two tests, after its fall from grace in Pisa, would therefore suggest that the performance of pupils is actually deteriorating as they progress through school.
“If you put the three surveys together – I don’t think you can strictly compare them, but if you sort of use them as approximations – in my view it makes the picture a lot more worrying,” he said. “Because the message you get is that the earlier the year in school that you test kids in the UK, the better the performance internationally.
“In other words, parents and society do a great job in children getting to school but then year after year the schools system adds less value than we see across (other) countries.”…
…“It is probably true that the UK system is actually quite good in primary education, in the early years, but then afterwards it peters out – you can see the high dropout, you can see the 14-18 problem and so on,” Mr Schleicher said. “If you look at the three surveys together you don’t get a very encouraging picture. It is a more worrying picture than if you look at them one by one.”’
This statement rather ignores the fact that only a single year separates PISA participants from those undertaking the TIMSS 8th Grade studies. Nor, on the evidence above, is his claim consistently borne out by performance at the highest benchmarks, especially in Science.
There are likely to be several different factors responsible for England’s relatively better performance on TIMSS/PIRLS (including in the 8th Grade assessments).
Many have been identified through research studies, the majority of them associated with technical differences in the nature of the assessments. There will also be factors associated with the systems being assessed, but I have seen no substantive evidence to back up Schleicher’s claim.
On 11 December 2012, Education Minister Elizabeth Truss gave a speech about the evidence from TIMSS and PIRLS. Towards the beginning, she advanced the oft-repeated truism (not entirely borne out by the evidence above) that:
‘In the past, and still today, this country has excelled at educating a small minority of its children to the very highest level.’
In fact, the minority is relatively large compared with most other countries.
Strangely, although the speech concentrates on the raft of reforms being introduced to improve performance in reading, maths and science, there is no reference at all to those which specifically benefit the highest achievers: the introduction of Level 6 assessment at Key Stage 2 and the development of a cadre of selective specialist 16-19 maths and science free schools.
The timing of these assessments was problematic for a Government elected to power in 2010. This BBC story includes a grudging reaction to the mixed bag of results from a Government spokesman:
‘These tests reflect progress between 2006 and 2011 and were taken only a year after the election.
So to the limited extent the results reflect the effect of political leadership, Labour deserves the praise for the small improvement in reading and the blame for the stagnation in maths and the decline in science. The tests say nothing, good or bad, about what we have done.’
Meanwhile the Opposition spokesman says:
‘These results show schools in England are some of the best in Europe – thanks to the hard work of teachers and pupils. The Labour government’s reforms saw reading results improve thanks to better teaching, smaller class sizes and Labour’s National Literacy Strategy.
However, we need to understand why East Asian countries outperform us in key skills – particularly science and maths.’
This post aims to show how careful analysis of performance against the highest benchmarks in TIMSS, PIRLS and PISA assessments can offer broad indicators of the comparative strengths and weaknesses of education systems as far as their high achievers are concerned.
It acknowledges the significant weaknesses of an evidence base derived entirely from international benchmarking studies, although it does not address directly the problems associated with such studies which tend to call the findings into question.
It does not draw out the implications for each country – readers can do that for themselves – but I hope it does reveal that even the most celebrated international examples cannot afford to rest on their laurels. To take just three national examples:
- Singapore tops almost every assessment but it performs less well on PIRLS Reading than on the four TIMSS studies. Other countries are improving their Reading performance at the Advanced benchmark at a much faster rate, while there has also been limited improvement over time in Maths, especially at Grade 4. Perhaps Singapore is beginning to approach a maths ‘ceiling’, preventing the proportion of high achievers from being much further improved. In both Reading and Science there is evidence to suggest that the lower end of the achievement distribution requires somewhat greater attention.
- Despite its stellar performance in PISA 2009 and strong showing in the overall TIMSS/PIRLS rankings, Finland is not amongst the world leaders in maximising the proportion of high achievers in these studies. It outperformed England only on Science at Grade 4, probably England’s main area of weakness. While Finland may have made strong progress in eradicating ‘long tails of low achievement’, there is evidence here to suggest that it is falling behind at the top end.
- England’s outperformance of Finland – so often held up as the model for us to emulate – deserves to be more widely known and celebrated. The situation is nowhere near as bad as the Sutton Trust’s recent report on the Highly Able might suggest. But there is no room for complacency. There are still big gaps to make up in Maths at Grade 4 and in Science at Grade 8. The trend over time is disappointing in Science at Grades 4 and 8 and also in Reading. While attention is clearly needed to shorten ‘long tails’ in Reading, Maths and Science (especially at Grade 4), this must not be at the expense of the high achievers, or England risks falling into the Finnish trap.