Results Prediction Model: Premier League forecasts | Arsenal, Liverpool, Chelsea

Last January, I started work on a model that would use EPL Index’s Opta stats to predict the results of football games. It ran during the latter part of the 2012-13 season on my own blog, calling 55% of results correctly from March to May and turning a tidy overall profit at the bookies of £166.50, playing £10 stakes on every game.

Premier League Predictor

This season it’s going to run again, but with a new home for the predictions on EPL Index! Barring any mishaps, I plan to have a new set of predictions up each week on Thursday or Friday, ahead of that weekend’s round of games. Running the model was a lot of fun last season and I learned a lot, so hopefully we can do even better this year…

The model itself takes an agent-based approach, which means it simulates individual players rather than using the overall average stats for different teams. For any Football Manager fans out there, you’ve already seen this sort of model – it’s what powers that game. We set up twenty-two artificial players, using their real stats to make them behave as realistically as possible, and then simulate a game of football. Within the sim, the players kick off, pass to each other, shoot, lose the ball and score goals in the way that their real-life stats say they should; at the end we get a result for that game.

One predicted scoreline is ok, but not all that useful. There’s a lot of randomness in the model – as there is in a real game of football – so we need to run the sim quite a few times and see how often each team wins. As an example, Michael Carrick has a passing accuracy at home of 86%, so for every pass he plays in the simulation there’s an 86% chance of completing it and a 14% chance he gives the ball to the opposition. In one simulation he might pass, create a chance and Man U score; in another he gives the ball away and it stays at 0-0. There are around 800 events per game, so we need a lot of runs to get a good average read from the model.
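To make the Carrick example concrete, here is a minimal Python sketch of a single pass treated as a weighted coin flip. It is an illustration only, not the model's actual code: the function name `attempt_pass` and the fixed seed are assumptions, and only the 86% figure comes from the text.

```python
import random

def attempt_pass(accuracy, rng):
    """Return True if the pass is completed, False if the ball goes to the opposition."""
    return rng.random() < accuracy

rng = random.Random(42)        # fixed seed (an assumption) so the sketch is repeatable
carrick_home_accuracy = 0.86   # home passing accuracy quoted in the text

passes = 10_000
completed = sum(attempt_pass(carrick_home_accuracy, rng) for _ in range(passes))
completion_rate = completed / passes

print(f"Completed {completion_rate:.1%} of {passes} simulated passes")
```

Any single pass is either completed or lost, but over thousands of passes the completion rate settles close to 86%, which is why one simulated game on its own tells you very little.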

A thousand runs for each single fixture gives us a percentage prediction for the likelihood of a home win, away win or draw.
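The step from many simulated games to percentages can be sketched roughly as below. The `simulate_match` function is a deliberately crude, hypothetical stand-in (every shot scores with the same fixed probability) and is not the agent-based sim described above; the shot counts and conversion rate are made-up illustration values. Only the idea of tallying a thousand runs comes from the text.

```python
import random
from collections import Counter

def simulate_match(rng, home_shots=14, away_shots=10, conversion=0.1):
    """Toy stand-in for one simulated game: each shot scores with a fixed probability."""
    home_goals = sum(rng.random() < conversion for _ in range(home_shots))
    away_goals = sum(rng.random() < conversion for _ in range(away_shots))
    return home_goals, away_goals

def predict(runs=1000, seed=1):
    """Run the toy sim `runs` times and return outcome percentages."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(runs):
        home, away = simulate_match(rng)
        if home > away:
            tally["home"] += 1
        elif home < away:
            tally["away"] += 1
        else:
            tally["draw"] += 1
    return {outcome: 100 * tally[outcome] / runs for outcome in ("home", "draw", "away")}

prediction = predict()
print(prediction)
```

Swapping the toy `simulate_match` for a proper player-level simulation would leave the tallying loop unchanged.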

If you’re interested in more detail on how the model works, I wrote quite a lot about it on my own blog last season (link) and also posted a bit about what the software looks like (link).

Let’s see it in action with last weekend’s games.

[Figure: model predictions for the games of 26 October]

The predictions are bound to be a bit wobbly at this early stage of the season because we’ve only got nine games’ worth of data, but this looks pretty encouraging. The Swansea percentages are a bit wild but other than that, the model has turned out a fairly sensible set of odds for each game. So what can we do with them?

In a football club, a model like this could be used to test different potential starting line-ups ahead of the game: what happens to your odds if you swap one player for another? Or to test strategies: is it worth giving the ball away more often in order to create more chances? This is one of the huge advantages of an agent-based simulation: it deals with individual players and lets you swap them or change their performances individually.

But what good is it to us?

Well, we can bet with it.

I followed a simple rule last season – bet on the most likely team to win, unless the chances of a draw are over 25%, in which case back a draw. This is the strategy that turned a profit between March and May.
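Written as code, the rule might look something like this. The function name `choose_bet` and the example percentages are hypothetical, and the text doesn't say what happens if home and away are exactly level, so this sketch assumes the home side gets the nod in that case.

```python
def choose_bet(home_pct, draw_pct, away_pct, draw_threshold=25.0):
    """Pick a bet from the model's percentages for one fixture.

    Back the draw if its chance is over the threshold; otherwise back
    whichever team is more likely to win. Ties between home and away
    go to the home side (an assumption, not stated in the text).
    """
    if draw_pct > draw_threshold:
        return "draw"
    return "home win" if home_pct >= away_pct else "away win"

# Made-up percentages, purely to show the three possible outcomes of the rule.
print(choose_bet(60, 22, 18))  # home win
print(choose_bet(40, 30, 30))  # draw
print(choose_bet(20, 20, 60))  # away win
```

Note that a draw at exactly 25% does not trigger the draw bet here, because the rule says "over 25%".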

Following that strategy with £10 bets and using Bet365’s odds (from football-data.co.uk), last week we would have had:

Liverpool v West Brom (home win) £4.40
Norwich v Cardiff (draw) £24
Man U v Stoke (home win) £3.30
Crystal Palace v Arsenal (away win) £3.60
Southampton v Fulham (lose) (-£10)
Aston Villa v Everton (away win) £13
Swansea v West Ham (lose) (-£10)
Sunderland v Newcastle (lose) (-£10)
Chelsea v Man City (home win) £13.80
Tottenham v Hull (home win) £4.40

Starting with £100 staked (£10 per game) we’d now be sat on £136.50. Great result.
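For anyone who wants to check the arithmetic, the sum can be reproduced with a few lines of Python. The figures are copied straight from the list above; the variable names are just for illustration.

```python
# Profit or loss on each £10 bet, in pounds, as listed above.
results = [
    ("Liverpool v West Brom", 4.40),
    ("Norwich v Cardiff", 24.00),
    ("Man U v Stoke", 3.30),
    ("Crystal Palace v Arsenal", 3.60),
    ("Southampton v Fulham", -10.00),
    ("Aston Villa v Everton", 13.00),
    ("Swansea v West Ham", -10.00),
    ("Sunderland v Newcastle", -10.00),
    ("Chelsea v Man City", 13.80),
    ("Tottenham v Hull", 4.40),
]

stake_per_game = 10
total_staked = stake_per_game * len(results)

# Round to pence to avoid floating-point noise in the total.
profit = round(sum(amount for _, amount in results), 2)
bankroll = round(total_staked + profit, 2)

print(f"Staked £{total_staked}, profit £{profit:.2f}, now holding £{bankroll:.2f}")
```

Seven winning bets out of ten and a profit of £36.50 on £100 staked.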

Rest assured it doesn’t always go quite that well, or I wouldn’t be doing this in public! I’d be doing it quietly from my yacht.

A couple of words of warning here based on last season… I’m not a big gambler and built this model for fun, but when simulations suggested it would win, I backed it. It’s early in the season and I don’t have good indicators for the model’s expected performance with the small amounts of data we have so far. Last weekend looked great but the next might be horrible. The model’s also never had a 100% correct weekend, so accumulators are quite likely to go wrong. Finally, don’t back correct scores based on this model, because you’ll lose. I’ll explain the issues with doing this in a future post.

That said, I’ll be placing small bets and tracking how we do across the season.

Neck on the line time. Here are the percentages for this weekend’s games…

[Figure: model predictions for the games of 2 November]

You need a predicted starting line-up to know which players to include in the simulations, and I take mine from Fantasy Football Scout.

And if you’re betting:

Arsenal v Liverpool – Away win
West Ham v Aston Villa – Away win
Cardiff v Swansea – Home win
Newcastle v Chelsea – Away win
West Brom v Crystal Palace – Home win
Everton v Tottenham – Draw
Fulham v Man U – Home win
Hull v Sunderland – Draw
Man City v Norwich – Home win
Stoke v Southampton – Draw

Couple of controversial ones there, but this wouldn’t be any fun if it was too easy. See you next week!

datamonkey
http://www.wallpaperingfog.co.uk
Statistical analyst and econometrician, working for a large marketing agency. Football stats are much more interesting.