I’ll begin this throwaway article with a more prose-ular recap of a few of my tweets in the last 24 hours.
When I was writing my Statsbomb article on Burnley, I used the percentage of occasions that they had more than 4 defenders between a shot and the goal rather than using the average number. I also used the % of times they had fewer than 2 defenders between shot and goal.
This was mainly because the mean averages were very similar for every team in the league, which seemed misleading. Teams defend differently – we know this. We also know that some teams pack a lot of men behind the ball, and some are vulnerable on the counter. Hence the split.
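To make that concrete, here is a minimal Python sketch of the split. The two teams and their shot-by-shot defender counts are invented purely for illustration (they are not real Burnley or league data), and the function name `defender_profile` is my own. The thresholds follow the ones described above: more than 4 defenders, and fewer than 2.

```python
from statistics import mean

def defender_profile(defenders_per_shot, high=4, low=2):
    """Summarise how many defenders sat between each shot and the goal.

    Returns the plain mean alongside the two threshold shares:
    % of shots with more than `high` defenders, and % with fewer than `low`.
    """
    n = len(defenders_per_shot)
    return {
        "mean": mean(defenders_per_shot),
        "pct_more_than_high": 100 * sum(d > high for d in defenders_per_shot) / n,
        "pct_fewer_than_low": 100 * sum(d < low for d in defenders_per_shot) / n,
    }

# Hypothetical data: one number per shot conceded.
team_a = [3, 3, 3, 3, 3, 3, 3, 3]   # always a middling number of bodies back
team_b = [5, 5, 5, 5, 1, 1, 1, 1]   # either packed or caught on the counter

profile_a = defender_profile(team_a)
profile_b = defender_profile(team_b)
print(profile_a)
print(profile_b)
```

Both made-up teams average exactly 3 defenders per shot, but only the second ever shows up in the "more than 4" and "fewer than 2" buckets, which is the kind of difference the mean flattens out.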
It’s easy – though perhaps at times overly distracting – to compare football to other sports when discussing analytics, so forgive me if I’m cliched for a moment. In baseball and basketball there are a hell of a lot more games and a hell of a lot more ‘scoring events’ that can be counted. In American football, despite the small number of games, each match is divided into numerous plays, each one either a running or a passing play.
Football is a low scoring game with relatively few matches (in the league, which are the only games used in the stats generally, for reasons best discussed in another article entirely). A lot of the game doesn’t actually matter very much in relation to scoring and isn’t easily delineated into attempts to score – apart from shots, which don’t happen that much, and even then it’s difficult to work out from the stats how they happened.
All this is to say that I think splitting the game up into sequences might be a helpful thing to do. Part of this is to better analyse the game, better separate this fluid mass of an object into something that can stand on its own structured feet. Part of it is because when we average the few stats we have easily available together, we end up smushing things and losing some valuable meaning.
A graph in Bobby Gardiner’s article on Leicester shows that Liverpool of 2015/16 had quite a lengthy average possession – over 9 seconds long and the 7th longest in the league. Yet we also know that Jurgen Klopp’s Liverpool are a team who like to play in the transitions, to attack when the opposition is most vulnerable.
But they also regularly came up against, and still do, sides who sit back in a low block to avoid being picked apart. Liverpool were still very effective on the counter-attack and very much liked to play this way, but some of this information is lost in the average. (To clarify, this is a criticism of averages, not Bobby).
All of that is a rambling build up to dumping some numbers. Earlier in the season (before dissertation deadlines and exams truly loomed) I was collecting a bunch of stats on centre-backs, part of which was splitting up actions made by where on the pitch they were made – deep (furthest back 20 yards or so), a middle area (around 20-maybe 45 yards), and high up the pitch.
I thought it’d be interesting today to go back and look at them, splitting behaviours up in the two main (deep and middle) zones.
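For anyone wanting to try the same split, a rough sketch of the zone bucketing might look like the following. The exact cut-offs are assumptions on my part (I only ever described them as "20 yards or so" and "maybe 45 yards"), and the input format, a list of `(action_type, yards_from_own_goal)` pairs, is hypothetical rather than any particular data provider's.

```python
from collections import Counter

# Assumed cut-offs, in yards from the defender's own goal line.
DEEP_MAX = 20.0
MIDDLE_MAX = 45.0

def zone(yards_from_own_goal):
    """Label an action as 'deep', 'middle' or 'high' by where it happened."""
    if yards_from_own_goal < DEEP_MAX:
        return "deep"
    if yards_from_own_goal < MIDDLE_MAX:
        return "middle"
    return "high"

def count_by_zone(actions):
    """Count actions per (zone, action_type) pair.

    `actions` is an iterable of (action_type, yards_from_own_goal) tuples.
    """
    counts = Counter()
    for action_type, yards in actions:
        counts[(zone(yards), action_type)] += 1
    return counts

# Hypothetical actions for one centre-back.
example_actions = [("tackles", 10), ("tackles", 12), ("blocks", 30), ("fouls", 50)]
zone_counts = count_by_zone(example_actions)
print(zone_counts)
```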
The ‘behaviours’ (I can’t think of a better word) are ‘Activity’ and a ‘Front foot’ number. Activity is all the stuff I was collecting (tackles, missed tackles, interceptions, ball recoveries, blocks, and fouls); ‘front foot’ weighs the stuff you actively go and do (tackles, missed tackles, interceptions, and fouls) against the stuff where you’re slightly more passive and standoffish (ball recoveries and blocks).*
*[Side note: I’m conflicted about putting in the actual numbers below. On the one hand I feel I should, because they’re numbers; on the other, without detailing the context they’re a little meaningless. I converted them to percentile numbers in the ol’ spreadsheet, but just sticking numbers out of 100 seems a bit arbitrary too. I eventually decided to stick the /100 percentile numbers in in brackets, so I hope you read this paragraph. Low numbers are inactive/back foot]
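A sketch of how the two numbers and the percentile conversion could be put together, in Python rather than the ol’ spreadsheet. Several things here are assumptions for illustration only: Activity is scaled per 90 minutes, ‘front foot’ is taken as the proactive share of all actions, and the percentile is simply the percentage of players at or below a given value (a spreadsheet’s built-in percentile function may be defined slightly differently). All function and field names are made up.

```python
PROACTIVE = ("tackles", "missed_tackles", "interceptions", "fouls")
PASSIVE = ("ball_recoveries", "blocks")

def activity(counts, minutes):
    """All collected defensive actions, per 90 minutes (per-90 is an assumption)."""
    total = sum(counts[k] for k in PROACTIVE + PASSIVE)
    return 90 * total / minutes

def front_foot(counts):
    """Share of actions that are proactive (assumed formula): 1.0 is all front foot."""
    proactive = sum(counts[k] for k in PROACTIVE)
    passive = sum(counts[k] for k in PASSIVE)
    total = proactive + passive
    return proactive / total if total else 0.0

def percentile_rank(values):
    """Percentage of players whose value is at or below each player's value."""
    n = len(values)
    return [round(100 * sum(v <= x for v in values) / n) for x in values]

# Hypothetical counts for one centre-back in one zone.
example = {
    "tackles": 6, "missed_tackles": 2, "interceptions": 8, "fouls": 4,
    "ball_recoveries": 15, "blocks": 5,
}
example_activity = activity(example, minutes=900)   # 40 actions in 900 minutes
example_front_foot = front_foot(example)            # 20 proactive out of 40
example_ranks = percentile_rank([1.0, 2.0, 3.0, 4.0])
print(example_activity, example_front_foot, example_ranks)
```

Run separately for the deep and middle zones, this would give each player a pair of /100 numbers per zone, with low numbers meaning inactive or back foot.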
Splitting them up between the two zones is fascinating. Toby Alderweireld and Jan Vertonghen have similar profiles (in Spurs’ system that isn’t surprising); both were a lot less active in the ‘middle’ sector than the rest of the league (18 and 28), but in the defensive sector they bumped up to just above average for busyness (57 and 58). Their more back foot defending style was consistent through both, though (Aldi: 18 in the middle, 26 in defensive; Jan: 14, 8).
Nicolas Otamendi is active and front foot in both zones (middle sector 95 activity and 72 front foot; defensive 70 and 91), with Stones the complete opposite (middle: 34 active and 9 front foot; defensive: 16 and 15). Kolarov, in his minutes as centre-back, appeared to be active but restrained in the middle sector (62 and 26), and quieter but more front foot in the defensive sector (31 and 85).
Last one (of centre-backs whose games I feel comfortable enough with to comment on). Wes Morgan and Robert Huth are both pretty inactive compared to other centre-backs in the league in both sectors (middle sector: 27 and 27; defensive: 22 and 30). Huth stays as a comfortably back foot defender in both (11 and 23), but when Morgan gets into the middle sector he goes all front foot on you (defensive sector: 35, middle sector: 84).
What’s the application for all this? Well, it opens up stats to be both general – in that they’re grouping together a bunch of stuff – and situational – in that they’re differentiating between phases of football which could justifiably be separated.
Perhaps this matters more for defenders than for attackers, for whom passing is a much more important part of the game, and who actually set up and take shots.
But in truth I’m probably more interested in how this can be used in more of a media-focussed context, and this (data accessibility permitting, etc.) would be a nice start in looking at defenders in a way that actually means something.
Can’t think of a pithy way to close out this ramble. Stats can be hard, they can be easy, but, at the end of the day, I think that football’s the real problem.