The model really showed what it can do over the Christmas period, putting in a solid set of predictions for rounds 18 to 20 (Boxing Day to New Year) and bringing back a very nice return if you were using it to bet. Overall, it picked 18 results from the 30 games and finished with an average profit of 34% per bet. If you’d staked £10 per game – £300 in total for the three rounds – you’d have been £102 in profit by the evening of New Year’s Day.
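As a quick check on that arithmetic, here is a minimal sketch in Python. It assumes one flat £10 bet on each of the 30 games and treats the 34% figure as profit divided by total stake. The function name is just for illustration.

```python
def flat_stake_summary(stake, n_bets, n_winners, roi):
    """Summarise a run of equal-stake bets given the overall return on stake."""
    total_staked = stake * n_bets
    profit = total_staked * roi
    # Everything paid back comes from the winning bets, so this is the
    # average decimal odds those winners must have had.
    avg_winning_odds = (total_staked + profit) / (stake * n_winners)
    return total_staked, profit, avg_winning_odds


total_staked, profit, avg_odds = flat_stake_summary(
    stake=10, n_bets=30, n_winners=18, roi=0.34
)
print(f"staked {total_staked}, profit {profit:.0f}, average winning odds {avg_odds:.2f}")
```

On those assumptions, the 18 winning picks paid out at average decimal odds of roughly 2.23, so the profit came from a spread of results rather than one long shot.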
The model made three very big calls on New Year’s Day: Swansea to beat Man City, Southampton to beat Chelsea and Spurs to beat Man United at Old Trafford. I said in the post that three out of three was very unlikely, but one or more would be a good result and that’s what happened. Swansea ran Man City close (to my eye anyway) before struggling towards the end of the game, which finished 3-2. Chelsea’s win over Southampton looked very comfortable at 3-0, but shooting charts show that the story wasn’t so simple. 11tegen11 produces these lovely charts of shots over 90 minutes and calculates expected goals from those shots.
This one for Southampton v Chelsea shows that the game was closer than the final score suggests and Southampton were arguably unlucky not to score at least once.
We did get Tottenham’s win at Manchester United correct and also very nearly got Stoke to win over Everton too, before an injury time penalty took the game to a draw.
Across the whole season so far, the model is just about turning a profit. It’s called 49% of results correctly and is up just over 3% if you’ve bet the same stake on every call. This is an OK performance, though slightly below expectations, and we should do much better in the second half of the season because the Opta data now has enough weeks of history to be stable. I started these posts in week ten and at that point some of the data, shooting frequency and accuracy in particular, was still quite volatile. Back then, the stats still said things like Southampton’s defence was virtually invincible, but with more data we get much better predictions. Any betting model that isn’t losing is a good base to start from though, and the EPL model has turned a small profit so far.
I’ve also been making improvements to the model’s algorithm that should see more accurate percentages in 2014. I described a few weeks ago how the passing model had changed, to be able to predict how players will pass forward, backward and sideways and give a better picture of whether a team is able to move the ball up the pitch to its strikers. Over the FA Cup weekend, I took the chance to have a look at shooting…
A few people had spotted that the version of the EPL Model I’ve been using over-predicts the number of goals. I’ve known for a long time that goals were scored too often in the model, but hadn’t been able to fix it without damaging the accuracy of the results predictions. When a player receives the ball in a simulated game, the computer rolls the dice to see what he will do next. He could pass and, going by his Opta stats, that’s what happens most of the time. He could also shoot, or create a chance, or dribble with the ball, again based on how often his Opta stats say this happens in real life.
If he shoots, then we check his shooting accuracy to find the probability that he’s scored. Add up all of these shots and you get a sensible number of goals.
The player could also create a chance and if he does, then somebody else will have to shoot. This is the problem, because now the model’s shot frequency is too high. You’ve got all of the shots described by Opta, plus all of the chances created – which also lead to shots – and that double counts a lot of attempts. The model will score too many goals.
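To make the double counting concrete, here is a toy version of that dice roll in Python. The action frequencies and conversion rate are invented for illustration (they are not Opta numbers), and the function names are hypothetical rather than taken from the model. It assumes every created chance leads to exactly one shot by a team-mate.

```python
import random

# Hypothetical per-touch action frequencies for one player (not real Opta data).
ACTION_FREQ = {"pass": 0.80, "dribble": 0.10, "shoot": 0.06, "create_chance": 0.04}
SHOT_CONVERSION = 0.10  # hypothetical probability that a shot becomes a goal


def choose_action(rng):
    """Roll the dice: pick the next action in proportion to its frequency."""
    actions = list(ACTION_FREQ)
    weights = list(ACTION_FREQ.values())
    return rng.choices(actions, weights=weights, k=1)[0]


def simulate_touches(n_touches, seed=0):
    """Count direct shots, shots from created chances, and goals."""
    rng = random.Random(seed)
    direct_shots = assisted_shots = goals = 0
    for _ in range(n_touches):
        action = choose_action(rng)
        if action == "shoot":
            direct_shots += 1
            goals += rng.random() < SHOT_CONVERSION
        elif action == "create_chance":
            # Assumption: a team-mate always shoots from the created chance.
            assisted_shots += 1
            goals += rng.random() < SHOT_CONVERSION
    return direct_shots, assisted_shots, goals


direct, assisted, goals = simulate_touches(10_000, seed=1)
print(f"direct shots: {direct}, shots from chances: {assisted}, goals: {goals}")
```

If the recorded shot frequency already includes shots taken from team-mates’ chances, then the second counter is made up of shots that have been counted once already, and that surplus is where the extra goals come from.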
Unfortunately, you can’t fix this at an individual player level, because we don’t have stats for each player on how many chances they create for themselves versus how many of their shots came from a chance created by somebody else. You can make a fairly good estimate, team by team, of the overall size of the double counting though, and remove it, which is what I’ve done. In terms of the predictions, two things happen that we will need to keep an eye on.
1. The number of goals scored drops by around 30%. Good, that’s what we were hoping would happen.
2. Predicted draw percentages will increase, because you get more 0-0 and 1-1 results.
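Why fewer goals means more draws can be shown with a rough stand-in for the simulation: treat each side’s goals as independent Poisson counts. This is not how the EPL model works (it simulates passes and shots), and the goal expectations of 2.0 and 1.5 are made up. The 0.7 factor simply mirrors the drop of around 30% in point 1.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the expected number is `mean`."""
    return exp(-mean) * mean**k / factorial(k)


def draw_probability(home_mean, away_mean, max_goals=15):
    """Chance both sides score the same number of goals (0-0, 1-1, 2-2, ...)."""
    return sum(
        poisson_pmf(k, home_mean) * poisson_pmf(k, away_mean)
        for k in range(max_goals + 1)
    )


GOAL_SCALE = 0.7  # hypothetical team-level correction: about 30% fewer goals

before = draw_probability(2.0, 1.5)
after = draw_probability(2.0 * GOAL_SCALE, 1.5 * GOAL_SCALE)
print(f"draw chance before: {before:.1%}, after: {after:.1%}")
```

With lower goal expectations, more of the probability sits on low scores such as 0-0 and 1-1, so the draw chance in this toy example rises after the correction.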
I’ll come back to the draw percentages when we pick the bets for this week. Here are the new and improved model’s percentages for round 21:
And the passing and (reduced frequency) shooting numbers.
One big upset and a couple of smaller ones this week…
Manchester City to lose away at Newcastle? The model is still fairly negative about Man City’s away performances, due to their early season form and the fact that Aguero is missing. It uses stats based on the whole season so far, so whether you believe this prediction depends on whether you think that early away form was a blip that has since been remedied, or the symptom of a genuinely lower away performance that will cause further losses this season. I’ll be trusting the model 100% as usual and putting my money on Newcastle.
The model also thinks that West Ham and West Brom have much better away chances than the bookies’ odds give them. As always, we’ll have to wait and see.
If you’re betting:
Hull City v Chelsea – Away win
Cardiff City v West Ham United – Away win
Everton v Norwich City – Home win
Fulham v Sunderland – Too close to call. Away win
Southampton v West Bromwich Albion – Away win
Tottenham Hotspur v Crystal Palace – Home win
Manchester United v Swansea City – Home win
Newcastle United v Manchester City – Home win
Stoke City v Liverpool – Away win
Aston Villa v Arsenal – Away win
No draw picks this week. The usual 25% chance rule no longer applies, because the new shooting methodology turns it into a 27% rule. Fewer goals are scored in the model, so the modelled chance of a draw has gone up slightly in every game. It’s now optimal to bet on the draw if the model says its probability is above 27%.
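As a sketch, the draw rule might look like this in Python. Only the 27% threshold comes from the post. Falling back to the more likely of home or away win is an assumption made for this illustration, and the percentages in the examples are invented.

```python
DRAW_THRESHOLD = 0.27  # the rule used 0.25 before the shooting change


def pick_result(home, draw, away, draw_threshold=DRAW_THRESHOLD):
    """Return a pick from modelled home/draw/away probabilities."""
    if draw > draw_threshold:
        return "Draw"
    # Assumed fallback: back whichever side the model rates as more likely.
    return "Home win" if home >= away else "Away win"


print(pick_result(0.45, 0.26, 0.29))  # draw chance is below the threshold
print(pick_result(0.36, 0.28, 0.36))  # draw chance is above the threshold
```

Under the old 25% threshold the first example would have been a draw pick, which is why the bar had to move once every game’s draw percentage crept up.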
One last note… Even though the model’s goals scored accuracy is better than it was, I still wouldn’t recommend using it to bet on correct scores. It will only get about one in ten exact scores correct as they show a lot of randomness and are extremely difficult to predict. I know some of you will do it anyway, but you’ve been warned! Good luck and see you next week.