Close but no cigar in the pre-Christmas fixtures as the model missed four results by just one goal. We got the games at Swansea, Liverpool and Palace correct, but the controversial calls of Man U to draw and Man City to struggle against Fulham turned out to be inaccurate rather than clever. Fulham did give Man City a brief scare and, as the model suggested they would, scored more than once, but it’s results that count.
I’ve been using Christmas train journeys to make some improvements to the way that the model works. There’s a big list of changes that I’d like to try and I’m slowly working through it one step at a time. This week’s change is a fairly major one and, on last weekend’s fixtures alone, it would have switched the model’s predictions to get the Man U and Spurs results right, though obviously I validate changes over a lot more than one week’s fixtures!
Passing is the core of my EPL model. Players move the ball around between each other and occasionally shoot, so modelling those passes as accurately as possible is absolutely critical. The first step is easy. Say Rooney has the ball for Manchester United and the model says, based on what he does most often in real life, that he should pass. What happens next? The model checks to see if it’s a successful pass, using Rooney’s Opta stats, which say that 79.2% of his passes find a team-mate. If it’s not a successful pass, we give the ball to the opposition. If it’s successful, we then need to work out who he passed to, and this is the algorithm that I’ve been improving.
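As a rough illustration of that first step, here is a minimal Python sketch of the success check. It is not the model's actual code: the function name `resolve_pass` and the seeded random generator are assumptions made for the example, and only the 79.2% completion rate comes from the text above.

```python
import random


def resolve_pass(completion_rate, rng):
    """Decide who has the ball after a pass attempt.

    completion_rate is the passer's share of passes that find a team-mate,
    e.g. 0.792 for the Rooney example. Hypothetical helper, for illustration.
    """
    # rng.random() is uniform on [0, 1), so the pass succeeds with
    # probability equal to completion_rate.
    if rng.random() < completion_rate:
        return "team_mate"
    return "opposition"


# Simulate a large number of passes and check the long-run completion share.
rng = random.Random(1)
outcomes = [resolve_pass(0.792, rng) for _ in range(10_000)]
completed_share = outcomes.count("team_mate") / len(outcomes)
print(f"share of passes completed: {completed_share:.3f}")
```

Over enough simulated passes the completed share settles close to the input rate, which is all this step needs to do. The interesting part is what happens after a successful pass.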
In an ideal world, we’d have a matrix of data available on which players pass to each other most often, but we don’t have one of those.
The previous version of the model only used data on how often each player sees the ball across a match. It would say that the most likely thing to happen to Rooney’s pass is that it went to Michael Carrick (assuming he’s playing) because Carrick sees the ball a lot. On average this works, but it doesn’t adjust very well for the opposition and the fact that players change their passing direction depending on who they’re playing against. Sometimes you just can’t get the ball to your strikers and the model was never very good at that, which partly explains odd predictions like Fulham to beat Man City. The model couldn’t see that against Man City, Fulham’s strikers would see very little of the ball.
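The old approach can be sketched as a weighted random draw, where each team-mate's chance of receiving the ball is proportional to how often he sees it. This is only an illustration under assumptions: `pick_recipient` is a hypothetical name, the touch counts are made-up numbers rather than real Opta figures, and "Full-back" and "Striker" are placeholder players.

```python
import random
from collections import Counter


def pick_recipient(passer, touches_per_match, rng):
    """Pick who receives a completed pass, weighted by touches per match."""
    # Every team-mate except the passer is a candidate.
    candidates = [p for p in touches_per_match if p != passer]
    weights = [touches_per_match[p] for p in candidates]
    # random.choices draws with probability proportional to the weights.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Made-up touch counts, for illustration only.
touches = {"Rooney": 60, "Carrick": 90, "Full-back": 70, "Striker": 40}

rng = random.Random(0)
picks = Counter(pick_recipient("Rooney", touches, rng) for _ in range(5_000))
for player, count in picks.most_common():
    print(player, count)
```

Notice that nothing in the draw knows who the opposition is: the striker gets the same share of passes against Man City as against anyone else, which is exactly the weakness described above.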
In the latest version of the model, players now pass forwards, backwards and sideways, using Opta data on their true behaviour, with everything adjusted for the quality of the opposition. We should see effects like teams not being able to move the ball through their midfield and towards their strikers against hard-pressing opponents. Overall it gives the model a much more realistic pattern of passes and possession.
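One way to picture the directional idea is below. This is a guess at the general shape, not the model's real adjustment: the `opposition_strength` scale (1.0 for an average opponent, higher for a stronger one), the rule of dividing the forward share by it, and the base direction mix are all assumptions invented for this sketch.

```python
import random


def adjusted_direction_mix(base_mix, opposition_strength):
    """Shrink the forward share against stronger opponents, then renormalise.

    base_mix maps 'forward', 'sideways' and 'backward' to probabilities.
    opposition_strength is a hypothetical scale where 1.0 means average.
    """
    scaled = dict(base_mix)
    # Assumed rule: forward passes get harder in proportion to opposition strength.
    scaled["forward"] = base_mix["forward"] / opposition_strength
    total = sum(scaled.values())
    # Renormalise so the three shares add up to 1 again.
    return {direction: share / total for direction, share in scaled.items()}


def pick_direction(mix, rng):
    """Draw a pass direction according to the (adjusted) mix."""
    directions = list(mix)
    return rng.choices(directions, weights=[mix[d] for d in directions], k=1)[0]


# Made-up base mix for a midfielder, for illustration only.
base = {"forward": 0.5, "sideways": 0.3, "backward": 0.2}
vs_strong = adjusted_direction_mix(base, 2.0)
print(vs_strong)
print(pick_direction(vs_strong, random.Random(0)))
```

Once a direction is chosen, the recipient could then be drawn from the team-mates in that direction, so a side facing a strong opponent ends up recycling the ball sideways and backwards more often and its strikers see less of it.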
Enough about the development work. Here are this week’s predictions, realistic passing and all.
The model continues its (accurate so far) predictions that Man City will beat just about anybody 5-1 at home and has gone against the bookies’ odds for the games at Villa and Cardiff.
And here are the deeper possession and shooting stats, so you can decide for yourself if the model’s being realistic.
One thing that this table does draw attention to is that scoring frequencies in the model are too high across the board. This is one that I’ve had my eye on for a while and is the reason we don’t use this model for correct score or ‘both teams to score’ predictions. It’s high on the list to investigate and hopefully solve.
If you’re betting…
Hull City v Manchester United – Away win
Aston Villa v Crystal Palace – Away win
Cardiff City v Southampton – Home win
Chelsea v Swansea City – Home win
Everton v Sunderland – Home win
Newcastle United v Stoke City – Home win
Norwich City v Fulham – Home win
Tottenham Hotspur v West Bromwich Albion – Home win
West Ham United v Arsenal – Away win
Manchester City v Liverpool – Home win
The weekend fixtures will be up after a very hasty run of the simulator on Friday.