
Modelling in the Public Sector: A level Algorithm

Modelling in the public sector presents unique challenges. The recent disruption to A level examinations in the UK, and the subsequent attempt to model student grades, reveals the need for greater transparency and understanding of the models that could shape our futures.

The reversal of Ofqual's A level grade predictions this August, back to the teacher-submitted predicted grade for each student (the centre assessed grade, or CAG, set by schools), has led to a great deal of discourse around algorithmic bias and public trust in models. Statistical and algorithmic models have great power to aid decision making, and exploring Ofqual's model in detail raises some important considerations for the use of statistics within the public sector.

A model, in simple terms, takes in some inputs and gives out some hopefully useful and accurate outputs. The inputs to Ofqual's model were the historical performance of each school, the prior attainment of the students, and additionally the CAGs and a rank ordering of every single student in the school, both of which were submitted by teachers. The model then returns some outputs, which in this case were the predicted A level grades for each student, intended to match what the student would have achieved had they taken the exams as usual [3, p. 5].

One important detail of the final model used by Ofqual, called the Direct Centre Performance model (DCP), is that it did not directly predict individual students' A level grades. Instead, it predicted the cumulative percentage grade distribution for a given school. The rank order provided by each school was then overlaid so that the proportion of students awarded each grade within the school closely matched the predicted distribution [3, p. 7] [3, p. 93].
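The overlay step described above can be sketched in a few lines of code. This is a simplified illustration of the idea, not Ofqual's actual implementation: the student names, the grade distribution, and the rounding rule are all assumptions made for the example.

```python
# Hypothetical sketch of the DCP rank-overlay step: the model predicts a
# grade distribution for the school, and the teacher-submitted rank order
# then determines which student receives which grade. All names and
# numbers here are illustrative, not Ofqual's actual data or formula.

def assign_grades(ranked_students, grade_distribution):
    """Assign grades to students (strongest rank first) so the share of
    each grade matches the predicted distribution as closely as possible."""
    n = len(ranked_students)
    results = {}
    cumulative = 0.0
    start = 0
    for grade, share in grade_distribution:
        cumulative += share
        end = round(cumulative * n)  # boundary of this grade band
        for student in ranked_students[start:end]:
            results[student] = grade
        start = end
    return results

# Example: 10 students, predicted distribution 20% A, 50% B, 30% C.
ranked = [f"student_{i}" for i in range(1, 11)]  # rank 1 = strongest
dist = [("A", 0.2), ("B", 0.5), ("C", 0.3)]
grades = assign_grades(ranked, dist)
```

Here the top two students receive an A, the next five a B, and the final three a C: the individual CAGs play no role once the distribution and the ranking are fixed.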

The DCP algorithm placed more weight on the historical performance of each school along with the ranking of students, and less weight on the CAGs submitted. The reasoning was that CAGs overestimate students' grades, and that the rank order submitted by each school, being an ordinal judgement, is less likely to lead to grade inflation. However, there were exceptions: the model put greater weight on the CAGs for schools and colleges with smaller cohort sizes (fewer than 15 students). Ofqual noted that small teaching groups are more common for AS and A level classes than for GCSE, and given that CAGs tend to be on the optimistic side, this means that outcomes in some AS and A level subjects would be higher this year [3, p. 7]. Despite pointing this out, they failed to consider that independent schools also tend to have smaller teaching groups than state schools, and hence would receive more optimistic predictions. Additionally, although the ranking may indeed be more accurate than the CAGs submitted by the teachers, there is no information regarding the spread of grades within a given ranking [1].
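The cohort-size exception can be made concrete with a small sketch. The thresholds and the linear taper below are assumptions chosen for illustration; Ofqual's report gives the fewer-than-15 cut-off, but the exact blending formula here is hypothetical.

```python
# Hypothetical illustration of the cohort-size exception: very small
# cohorts rely on the teacher-submitted CAGs, large cohorts on the
# statistical prediction, with a blend in between. The SMALL_COHORT
# cutoff and the linear taper are assumptions, not Ofqual's exact rule;
# only the fewer-than-15 threshold comes from the report [3, p. 7].

SMALL_COHORT = 5    # at or below this size: CAGs used directly (assumed cutoff)
LARGE_COHORT = 15   # at or above this size: statistical model dominates

def cag_weight(cohort_size):
    """Return the weight (0 to 1) placed on teacher-submitted CAGs."""
    if cohort_size <= SMALL_COHORT:
        return 1.0
    if cohort_size >= LARGE_COHORT:
        return 0.0
    # linear taper between the two thresholds
    return (LARGE_COHORT - cohort_size) / (LARGE_COHORT - SMALL_COHORT)
```

Under any rule of this shape, a school whose A level classes routinely have five or six entrants is graded largely on its (optimistic) CAGs, while a large state sixth form is graded almost entirely by the statistical model.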

One of the most critiqued elements of Ofqual's model was the testing and validation process; this is a vital part of modelling, determining the accuracy of the model you have created. The testing of a statistical or algorithmic model is ideally done with an independent data set, so that you can compare your model's predictions against the real observed values. Ofqual evaluated accuracy by applying the model to 2019 data and seeing how well its predictions matched the exam grades actually awarded that year. However, the 2019 exam data naturally does not contain a teacher-submitted ranking of the students in each school. To tackle this, Ofqual used as a replacement the actual rank order within each centre based on the grades achieved in 2019 [3, p. 49]. This essentially means that Ofqual used the 2019 observed grade data to predict the 2019 grade data, which would almost certainly have led to an overestimation of the model's accuracy [1]. Additionally, as mentioned above, the final algorithm was in fact a mixture: some schools were predicted using the grade distribution with student ranking, and others using the CAGs submitted; however, the report does not comment on what happens to the accuracy when this component is added in.
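A toy simulation makes the circularity vivid. All data below is randomly generated, not Ofqual's; the point is simply that if the rank order fed into the overlay step is derived from the very grades being predicted, and the predicted distribution matches the observed one, the model reproduces those grades exactly.

```python
import random
from collections import Counter

random.seed(0)

# Simulated "observed 2019" grades for a school of 30 students.
true_grades = random.choices(["A", "B", "C", "D"], k=30)

# Circular validation: rank order derived from the observed grades
# themselves ("A" < "B" < ... lexicographically, so best grades first),
# and predicted distribution taken as the observed grade counts.
ranked = sorted(range(30), key=lambda i: true_grades[i])
counts = Counter(true_grades)

# The overlay step: hand out grades down the ranking per the distribution.
predicted_in_rank_order = []
for grade in ["A", "B", "C", "D"]:
    predicted_in_rank_order += [grade] * counts[grade]

result = [None] * 30
for position, student in enumerate(ranked):
    result[student] = predicted_in_rank_order[position]

accuracy = sum(r == t for r, t in zip(result, true_grades)) / 30
print(accuracy)  # 1.0 — perfect "accuracy", purely an artifact of circularity
```

The 100% accuracy here tells us nothing about how the model would perform with genuine teacher-submitted rankings, which is exactly the concern raised in [1].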

There is no question that this was an enormous challenge for any group to undertake. Unlike in other industries, the modelling for calculating grades was brand new, undertaken with urgency and with students' futures at stake. The algorithms did not have the chance to be rigorously tested and refined over many years [4]. However, many of the issues that came up could have been quickly noticed by statistical and modelling experts had Ofqual been more transparent and inclusive during their process [2]. This could have led to more scepticism about whether Ofqual's statutory objective to maintain standards over time was the correct approach [3, p. 21]. Ethical considerations in modelling, and in technology generally, are also being increasingly adopted, with the University of Oxford recently announcing a new Institute for Ethics in AI. If models are ever to be trusted by the public, there needs to be much greater transparency: statisticians, social scientists, and domain experts (such as the highly trained and experienced teachers in this instance) should be invited and included in discussions throughout the modelling process.

Author: Varsha Ramineni

MSc Statistical Science

Keble College, Oxford








References:

[1] Sophie Bennett. On A levels, Ofqual and algorithms, August 2020. URL https://www.sophieheloisebennett.com/posts/a-levels-2020/.

[2] Elliot Jones and Cansu Safak. Can algorithms ever make the grade?, August 2020. URL https://www.adalovelaceinstitute.org/can-algorithms-ever-make-the-grade/.

[3] Ofqual. Awarding GCSE, AS, A level, advanced extension awards and extended project qualifications in summer 2020: interim report, 2020.

[4] Rob. In defence of algorithms, August 2020. URL https://medium.com/@rob_francis/in-defence-of-algorithms-44164c71a2ee.
