Last week was the first outing on EPL Index for my results prediction model. If you missed it at the time, today’s post is a look at the model’s performance on round 10’s games (link).
Overall, we did ok! Although I’ll confess to feeling a little twitchy on Saturday night with only three correct predictions from the day’s games. The target for the season is to get more than half of the predictions correct, because that’s roughly the line where the model will start to beat the bookies’ odds.
Coming out of Saturday, it had correctly picked West Brom to beat Palace, Man City to easily beat Norwich and Stoke’s draw with Southampton (thanks Asmir!). Barring the Stoke result though, that’s hardly groundbreaking stuff. You don’t get many points for saying Man City will win at home when they go on to manage a 7-0 victory.
On Sunday, the model picked both results correctly – Everton to draw with Spurs and Cardiff to win the South Wales Derby. Back to 50% correct predictions, an overall profit for the weekend of £28 with Bet365 on £10 staked per game and a sigh of relief that I hadn’t made a complete fool of myself in the first week.
As usual in modelling, the forecasts that went well aren’t the interesting ones – we can learn most from the ones that went badly, and in particular Fulham v Man U. What was that prediction all about? I don’t ‘tweak’ the outputs from my model – you get them raw – and it seemed to be pretty much the only place where you could find the opinion that Fulham would win. Personally, I didn’t believe a word of it; hope you didn’t either. So what happened?
There were a few things driving this result.
First of all, in terms of the key stats the model uses, Fulham didn’t actually look that bad. They hadn’t conceded loads of goals (12, the same as Man U), they pass ok and they shoot relatively frequently. With the limited stats that the model takes from the EPL Index database, they looked like an average mid-table team and so did Man U (don’t shoot the messenger yet – more on this in a minute). In that context, a 45% home win percentage isn’t out of order at all.
I mentioned that Fulham’s shooting frequency is ok, but do those shots go in? This is a massive issue for the model in the early stages of the season because we don’t yet have enough goals to judge. When the model doesn’t know whether a player has good shooting accuracy, it gives him the average performance of all EPL players in his position. It knows strikers are better at shooting than defenders, but it won’t be able to give Berbatov an accurate conversion rate, for example, for a few weeks yet. This has the effect of equalising teams to an extent, because the majority of players will be modelled with the same shooting ‘quality’ (though not frequency) until we have more data.
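As a rough sketch of that fallback logic – the function, thresholds and averages below are my own illustration, not the model’s actual code or data – the idea is simply: if a player has too few shots recorded, substitute the league-average conversion rate for his position.

```python
# Hypothetical sketch of the early-season fallback. All numbers here are
# illustrative, not taken from the EPL Index database.

# League-average conversion rates by position (made-up values).
POSITION_AVG_CONVERSION = {
    "striker": 0.15,
    "midfielder": 0.09,
    "defender": 0.04,
}

MIN_SHOTS = 20  # below this, a player's own rate is too noisy to trust

def conversion_rate(goals, shots, position):
    """Return the player's own conversion rate once we have enough shots,
    otherwise fall back to the positional average."""
    if shots < MIN_SHOTS:
        return POSITION_AVG_CONVERSION[position]
    return goals / shots

# Early season: only 8 shots on record, so a striker gets the striker average.
print(conversion_rate(goals=2, shots=8, position="striker"))   # 0.15
# Later in the season, his own rate takes over.
print(conversion_rate(goals=6, shots=30, position="striker"))  # 0.2
```

The side effect described above falls straight out of this: until players cross the shot threshold, most of a squad is running on the same positional averages, which pulls teams’ modelled finishing quality together.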
Then we come to defending… Defending in an agent based model is hard because it’s not something that shows up clearly in individual player stats – it’s a team performance. Stats like number of tackles aren’t useful because one, there aren’t actually that many tackles per game and two, tackling a lot could be a good thing or a bad thing. More tackles are likely to mean poor possession and desperation defending, indicating more goals against, not fewer.
As part of my defensive modelling, I use the number of goals a team has conceded as a percentage of all the shots they have faced. The model’s quite good at predicting how often teams will manage to shoot, but it needs some help with whether those shots will go in. Here’s what that stat looked like ahead of last week’s games:
Prior to last weekend, 13.5% of the shots Man U faced away from home were goals. For Fulham at home it was only 8.7%. This is a huge part of why the model said Fulham were likely to win, and why I said earlier that Man U had mid-table stats – on this measure at least, they did. To the model it seemed significantly more likely that Fulham would score.
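For concreteness, the stat is just goals conceded divided by shots faced. The shot and goal counts in this sketch are hypothetical – picked only so the rates come out at the percentages quoted above, not the teams’ actual numbers:

```python
def concession_rate(goals_conceded, shots_faced):
    """Goals conceded as a fraction of all shots faced."""
    return goals_conceded / shots_faced

# Illustrative counts only -- chosen to reproduce the quoted rates,
# not the real 2013/14 shot data.
print(f"{concession_rate(10, 74):.1%}")  # ~13.5% (Man U away level)
print(f"{concession_rate(4, 46):.1%}")   # ~8.7% (Fulham home level)
```

The gap between those two numbers is the whole story of the Fulham pick: fed these inputs, the model has to conclude that shots against Man U were going in far more often than shots against Fulham.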
As an aside, Man U rank 15th in that table for conceding goals as a percent of shots faced, but since 08/09 across whole seasons, they’ve ranked 1st, 2nd, 4th, 1st, 1st. The first nine games this season were a massive under-performance vs. previous averages.
Goal concession rates take some time to stabilise – usually around ten games – and I split them home and away so this key stat won’t settle down completely until the second half of the season. Bet with extreme caution! I’ll flag up in the predictions where I think this may be an issue.
When the goal concession numbers do stabilise, they do so at between 6-7% (best teams) and 13-14% (worst teams). This makes Southampton’s performance so far especially interesting as they’re currently running at a level of 3.4%. It won’t last, and neither will Tottenham’s away percentage of 2.6%.
As for the other games, I’m reasonably happy. Liverpool to beat Arsenal was an unrealistic percentage but a realistic call given what we knew ahead of the game – I’m certainly not the only one who thought a Liverpool win was a strong possibility. The high win percentage was driven by the fantastic start that Suarez and Sturridge have had.
Hull v Sunderland was called as a draw for betting purposes, but the model actually had Hull to win, so the 1-0 home result is reasonable enough. Villa could easily have won (to my MOTD-trained eye, anyway) and Chelsea were shocked by Newcastle.
Not too bad for a first week and we’re up at the bookies, which is always a bonus. This weekend’s games will be posted as soon as I can run the simulations and hopefully we can watch the prediction percentages settle down as some of these key statistics start to stabilise!