Defensive Correlations: The Return


A couple of years ago I made some posts on correlations between defensive actions and how many shots were conceded while that central defender was playing. A helpful commenter aided me, but my stats know-how was very poor and I recently took the posts down to try and avoid someone stumbling on bad work in their search for solid stats.

However, I’ve since gone back to the data I collected (by hand, from the very excellent StatsZone app, in the days before I’d even heard of a data scraper, not that I know what to do with them now I’ve heard of them).

I went back to the data from the 2014/15 season, across the Premier League, Bundesliga, La Liga, and Ligue 1. I took the guys who’d played more than 900 minutes (141 of them) and did some correlations for defensive actions per 90 against shots conceded per 90.

Tackles   Attempted Tackles   Interceptions   Clearances   Ball Recoveries
-0.201    -0.160              -0.040          0.440        -0.359


  • None of the correlations are very strong, so defensive actions don’t seem to ‘explain’ the number of shots conceded by a lot.
  • Attempted tackles are correlated less strongly than tackles won. (I’m not sure of the exact definitions: WhoScored currently splits tackles into Tackles Made, Dribbled Past, and Attempted Tackles, the last being basically the previous two added together; Squawka uses Tackles Won and Tackles Lost, which I believe are different.) This makes intuitive sense, I think: a centre-back might attempt a bunch of tackles but keep getting dribbled past, and it’s the tackles they actually make that count.
  • Interceptions has a very low correlation. My guess is that interceptions depend more on the style of a particular centre-back, and that this interferes more than it does for tackles.
  • Clearances are positively correlated with conceding more shots. This suggests to me that if you make a lot of clearances you’re on a team that’s usually under a lot of pressure and therefore generally face more shots.
  • Ball Recoveries has a higher correlation, but this stat is dependent on the team completing passes after the ball is recovered, so part of this is likely to be because teams who keep the ball better are generally better teams, and therefore being on a better team naturally lends itself to a higher ball recovery count.
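For anyone wanting to repeat this kind of calculation, here is a minimal sketch of a Pearson correlation between a per-90 defensive stat and shots conceded per 90. I'm assuming Pearson is the kind of correlation meant here; the function name and the numbers are made up purely for illustration and are not the data behind the table above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values (one entry per centre-back), NOT real data.
clearances_p90 = [4.1, 6.3, 5.2, 8.0, 7.4]
shots_conceded_p90 = [9.5, 12.1, 10.0, 14.2, 11.8]

r = pearson(clearances_p90, shots_conceded_p90)
print(f"clearances vs shots conceded: r = {r:.3f}")
```

A value near +1 or -1 would mean the two stats move closely together (or in opposite directions); the values in the table above are all a fair way from either.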

If you possession adjust the stats (so your defensive stats get boosted if your team sees a lot of the ball, and therefore has fewer chances to defend, and get dampened if your team doesn’t have a lot of the ball), you get this:

Tackles   Attempted Tackles   Interceptions   Clearances   Ball Recoveries
-0.395    -0.362              -0.290          0.221        -0.537


  • I’ll get the ball recoveries thing out of the way first. Same thing as last time: better teams are likely to have more of these because of the way the stat is collected.
  • Tackles and interceptions, particularly interceptions, shoot up. The correlations still aren’t strong, but they’re a lot stronger than before.
  • The amount that Interceptions shoots up by interests me. I don’t have a working theory about it at the moment though.
  • The Tackles-Attempted Tackles disparity is still there.
  • Reservations on clearances still the same as in the first lot of takeaways.


Finally, if you look at correlations between tackles+interceptions versus shots conceded:

  • Raw (non possession adjusted) = -0.145
  • Possession adjusted = -0.413

So, hopefully this will show why it’s difficult to use defensive statistics to assess a centre-back’s quality. If you REALLY want something, then possession adjust tackles+interceptions (it’s not hard; you take the team’s average possession as a decimal, multiply it by 2, and then multiply *that* number by the tackles+interceptions number).
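The adjustment in the paragraph above can be sketched in a few lines. The function name is my own, and the worked example uses the Chelsea 2014/15 figures for Matic quoted in the comments below (54.1% possession, 3.7 tackles, 2.1 interceptions), so treat it as an illustration, not a definitive implementation.

```python
def possession_adjust(stat_p90, team_possession):
    """Scale a per-90 defensive stat by twice the team's possession share.

    team_possession is a decimal between 0 and 1 (e.g. 0.541 for 54.1%).
    A team with exactly 50% possession is left unchanged (factor of 1.0).
    """
    if not 0 < team_possession < 1:
        raise ValueError("team_possession must be a decimal between 0 and 1")
    return stat_p90 * team_possession * 2


# Figures taken from the comment thread below this post.
chelsea_possession = 0.541
tackles_p90 = 3.7
interceptions_p90 = 2.1

adjusted = possession_adjust(tackles_p90 + interceptions_p90, chelsea_possession)
print(f"raw tackles+interceptions: {tackles_p90 + interceptions_p90:.2f}")
print(f"possession adjusted:       {adjusted:.2f}")
```

Because Chelsea had more than half of the ball, the factor is above 1 (2 × 0.541 = 1.082) and the adjusted number comes out higher than the raw one.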

However, this isn’t really an article to give advice on assessing CB quality with stats, it’s basically an information dump.

It’s also a call out to anyone with data. I’ve only used data from 1 season, with only 141 data points. I haven’t been able to look at whether these trends continue over much larger samples, or been able to look at to what extent defensive action numbers continue season on season.

Really key (for me), if you are looking to do this kind of thing, is that these shots conceded per90 stats are for the minutes that a player was on the pitch. Not for the games they started or featured in, and not team averages over the season. I’m not sure of easy ways to find this, which is why I mention it explicitly.
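To make the on-pitch point concrete, here is a rough sketch of the per-90 calculation. The data layout (a list of minutes played and shots conceded during those minutes, one pair per appearance) and the function names are assumptions of mine for illustration, and the numbers are invented.

```python
def per_90(count, minutes):
    """Convert a raw count into a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count / minutes * 90


def on_pitch_shots_conceded_p90(appearances):
    """Shots conceded per 90, counting only the minutes the player was on the pitch.

    appearances: iterable of (minutes_played, shots_conceded_while_on_pitch).
    """
    appearances = list(appearances)
    total_minutes = sum(minutes for minutes, _ in appearances)
    total_shots = sum(shots for _, shots in appearances)
    return per_90(total_shots, total_minutes)


# Invented example: two full games and one 45-minute substitute appearance.
example_appearances = [(90, 10), (45, 7), (90, 12)]
example_rate = on_pitch_shots_conceded_p90(example_appearances)
print(f"shots conceded per 90 while on the pitch: {example_rate:.1f}")
```

Note that the substitute appearance only contributes the shots conceded in those 45 minutes, not the whole match, which is the difference from using team or per-game averages.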

Anyway, thanks for reading. Lemme know your thoughts.


12 thoughts on “Defensive Correlations: The Return”

  1. nathp89

    I take it this is just for individual players – are there any significant correlations between these stats and shots conceded for a 3/4/5 man defence? Or a whole team?

  2. Pingback: Squawka’s Eric Bailly comparison matrix, and what’s wrong with it | Every Team Needs A Ron

  3. Pingback: Team Defensive Correlations: Opposition shots vs defensive actions | Every Team Needs A Ron

  4. Pingback: Our lexicon for defending in football isn’t good enough | Every Team Needs A Ron

  5. Nikita Vasyukhin

    Excuse me, do you maybe have a mistake in your formula?
    Your text: “it’s not hard; you take the team’s average possession as a decimal, subtract it from 1 to get the opponent’s, multiply the opponent’s decimal by 2, and then multiply that number by the tackles+interceptions number”.
    Owing to that we should get adjusted stats, but I have real issue:
    Matic played for Chelsea-2014/15: 54,1% possesion, 3,7 tackles, 5,9 attempted tackles, 2,1 interceptions. If we adjust his stats. using your algorithm, we will get: (1-0,541)*2*3,7/5,9/2,1 = 3,4/5,4/1,9. But Chelsea possessed more than 50% that means that Matic would had better stats, if he played for team which possessed less than 50%.
    Maybe we should divide instead of multiplication? For example: Chelsea’s % = 0,541 => opposition’s % = 0,459. Multiply by 2 = 0,918, and after divide stats by 0,918. Therefore we get 4 tackles, 6,4 attempted tackles and 2,3 interceptions. In looks more logical: adjusted Matic’s stats are “better”, because his team possessed more than 50%.

    I hope, that my English is good enough for understanding my point. Thank you.

    1. Mark Thompson Post author

      You’re absolutely right, this is my mistake. You should take the team’s possession and multiply it (which is essentially the same as your suggestion of dividing by the opponent’s possession). Thanks very much for pointing this out

  6. Pingback: Football Analytics – Part Seven: How (Not) To Use Stats | One Short Corner

  7. Pingback: Top Defensive Action Premier League Football Data Analysis - Wednesday 26th April 2017

  8. Pingback: Top Defensive Action Football Data Analysis - Wednesday 26th April 2017

  9. Pingback: Football Data Analysis: Back to the Future - Tuesday 1st August

  10. Pingback: ‘Things I learned’ – blogging about the bad things I did while doing public analytics so that you don’t have to do them | Every Team Needs A Ron
