
StatAttack: Who will win the Egyptian Premier League?

By
Posted on February 22, 2014

 Egyptian National Team - EFA

KingFut dives into shot statistics to look at who has been the best team through 8 games in Egypt using advanced stats.

Which four teams do advanced stats have as the best in Egypt? Which teams have underperformed/over-performed? Who will win the league?!

Full disclosure: I’ve watched a grand total of zero domestic games this season. I live abroad and I haven’t been able to find a live stream of the Egyptian League. As a result, all I’ve had to go on are our excellent match reports here at KingFut. After 8 weeks of getting restless, however, I decided to dig into some data to try and make sense of the Egyptian League table. Were Smouha really the best side in the country? What about Petrojet? Are both Ahly and Zamalek on the decline and perhaps going to miss the playoffs? All numbers are as of Friday, February 21, 2014.

What statistic shows the best teams?

The problem I was facing was that there was no single statistic that could accurately sum up performance. How could I tell which teams were performing better?

There are two parts to finding a useful statistic:

1) Is it useful? (how closely is it linked to winning?)

2) Is it repeatable? (can teams consistently do it or is it luck)

There are a lot of statistics that are useful, but not repeatable. For example, if I told you that Team X was awarded 3 penalties in a match, the probability of them winning that match would probably be close to 100%. That is a useful statistic, as it led directly to a win. However, what are the chances of Team X getting 3 penalties in another match? Is winning penalties a skill that can be repeated every match? Winning a penalty is far more luck than skill, therefore this statistic is not repeatable. The fact that Team X won 3 penalties tells us nothing about their skill level or about their next matches! Since we want to predict the future, repeatability is key.

Repeatability

My first thought was to look at goals. Goals are definitely useful, as they lead to winning, and they seem pretty skill-based. The better a team’s goal difference, the better the team, right? Well, not really. It turns out that although Goal Difference is the most useful statistic in football (it is the best predictor of points), it is not the most repeatable, as we might have expected. It is actually tied for 2nd most repeatable! Goals don’t happen often enough in football matches, and therefore they are controlled a little bit by luck. A silly own-goal or a defensive error will lead to wins, but neither is a repeatable skill! How do I know all this? Below is a chart by the excellent James Grayson (whose blog can be found here) showing how repeatable each statistic is, using data from the English Premier League (EPL) over more than 1 season.

Metric | % Skill | % Luck
Total shots ratio | 86 | 14
Total shots differential | 86 | 14
Goal ratio | 83 | 17
Goal difference | 83 | 17
Total shots against | 82 | 18
Total shots for | 80 | 20
Goals for | 75 | 25
Goals against | 66 | 34
% of total shots that are on target (%TSOT) for | 53 | 47
%TSOT for + %TSOT against | 52 | 48
PDO (penalties excluded) (1) | 46 | 54
% of total shots that are on target (%TSOT) against | 44 | 56
PDO | 44 | 56
sh% | 43 | 57
sv% | 38 | 62
sh% on shots from inside the box (2) | 37 | 63
sh% (penalties excluded) (1) | 36 | 64
sv% (penalties excluded) (1) | 32 | 68
sv% on shots from inside the box (2) | 24 | 76
sv% on shots from outside the box (2) | 23 | 77
Penalties awarded differential (penalties awarded for minus penalties awarded against) (1) | 9 | 91
Having penalties awarded against (1) | 9 | 91
Penalty differential (penalty goals for minus penalty goals against) (1) | 8 | 92
sh% on shots from outside the box (2) | 8 | 92
Being awarded penalties (1) | 4 | 96
Penalty goals conceded (1) | 3 | 97
Penalty goals scored (1) | <1 | >99

So Goal Difference & Goal Ratio are 83% skill, or 83% repeatable! The only statistic that improves upon that is TSR (Total Shots Ratio). Penalty goals, despite being useful, are less than 1% skill and more than 99% luck!

So we have two statistics that are heavily repeatable to look at. This will give us a statistic that is good at predicting future performances.

Usefulness

So are these statistics useful?

Any football fan can tell you that goal difference is obviously related to points. A quick look by Grayson at Goal Difference vs Points over the last ten years leads to this chart:

[Chart: Points vs Goal Difference, EPL]

For those of you familiar with statistics, the R-squared value shows how closely two factors are related to each other. The R-squared here is .9281, which means that GD explains 92.81% of the variation in points, a very strong number!
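For anyone who wants to see what is behind that number, here is a minimal sketch of how an R-squared for a straight-line fit can be computed. The goal difference and points values in it are made-up toy numbers for illustration only; they are not Grayson’s data, and the function name is my own.

```python
from math import fsum


def r_squared(xs, ys):
    """R-squared of a straight-line fit of ys on xs.

    For a simple one-variable linear fit this equals the squared
    Pearson correlation between xs and ys.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined when one list is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical end-of-season numbers, NOT real league data.
toy_goal_difference = [-30, -10, 0, 15, 40]
toy_points = [30, 42, 50, 60, 80]

print(f"R-squared = {r_squared(toy_goal_difference, toy_points):.4f}")
```

A value near 1 means one number almost completely tracks the other; a value near 0 means knowing one tells you very little about the other.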

What about TSR?

Although it is more repeatable, TSR is actually less useful than goal difference as a predictor of points, according to Grayson’s research.

[Chart: TSR vs Points, by Grayson]

So TSR is actually 66% correlated with points, which sounds like a low number but is actually quite significant. Both statistics have been proven to be useful & repeatable.

TSR vs. Goal Difference; which will we use to predict?

TSR is more repeatable, goals are more useful. So which should we use to predict? Although goals look like the better choice, being highly useful and nearly as repeatable, there is one factor that sways the argument towards TSR. At the time we collected data, most clubs in the Egyptian Premier League had only played 8 games. So we need to figure out which statistic is more accurate after 8 games.

Again, the brilliant Mr. Grayson comes to the rescue with his heavy lifting:

[Chart: Correlation vs Games]

After around 8 games, TSR is a lot more accurate than Goals Ratio (Goal Difference). TSR is at 80% accuracy after 10 games, which is really remarkable. As a result, it is more useful to use as a predictive tool than GD, especially so early in the season!

What is TSR?

I realize it’s taken me a while to get here, so thanks for the patience! I’m trying my best to keep the math to a minimum and the football to a maximum! So let’s get down to business. What is TSR?

TSR stands for Total Shots Ratio. I gave a brief introduction in my piece on the new advanced statistics in football.

Total Shots Ratio = (Total Shots For) / (Total Shots For + Total Shots Against).

For example, if a team had 9 shots in a match and its opponent had 1 shot, then the team’s TSR would be 9 / (9 + 1) = 0.9, or 90%.
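The same calculation as a small sketch in code. The function name is my own, and raising an error when no shots were taken at all is just one reasonable way to handle that edge case.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots (for and against) that were taken by the team."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("TSR is undefined when neither side has taken a shot")
    return shots_for / total


# The example from the text: 9 shots for, 1 shot against.
example = total_shots_ratio(9, 1)
print(f"TSR = {example:.2f}")
```

The inputs can be the shot counts from a single match or the totals over several matches, which is how a season-to-date TSR is built.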

We’ve already covered the good things about TSR. It’s useful, and related to points. It’s very repeatable, with only 14% being luck. It also stabilizes fast, going above 80% after only 10 games!

So what are some of the bad things?

Well, it’s obviously not bullet-proof! Using TSR to predict, Tottenham would’ve won the league this year, which really doesn’t look like happening. The biggest drawback of TSR is shot quality. A team that takes 20 shots from the halfway line will have a higher TSR than a team that takes 1 shot from inside the six-yard box! However, since players are rational (and don’t want to get dropped!) the differences are rarely so drastic. Still, a closer look at Tottenham this year shows that a ton of their shots come from outside the box, which inflates their TSR without giving them goals! TSR also does not take into account the skill of the opposition goalkeeper.

So if a team has a higher point total than its TSR would suggest, then there are two options:

1) The team has been lucky and will start dropping more points soon!

2) The team is taking higher quality shots than other teams.

A look at the Egyptian Premier League 

Using data from Koora.com, I’ve aggregated all the teams’ TSRs over the first 8 games. Gouna were omitted due to there being a lack of data for their matches!

Here are the results:

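Team | Total Shots Ratio | TSR Rank | Points Per Game | PPG Rank
ElMinya | 0.35 | 21 | 0.5 | 21
GhazlMahala | 0.41 | 19 | 0.63 | 20
ElGeish | 0.45 | 14 | 0.63 | 19
Entag | 0.40 | 20 | 0.78 | 18
Telephonat | 0.43 | 18 | 0.86 | 17
Raja | 0.47 | 12 | 0.89 | 16
Dakhleyah | 0.48 | 11 | 1 | 15
Qanah | 0.44 | 17 | 1.11 | 14
ENPI | 0.47 | 13 | 1.11 | 13
Haras-ElHdood | 0.44 | 16 | 1.13 | 12
Elshorta | 0.55 | 7 | 1.25 | 11
Makasa | 0.52 | 8 | 1.34 | 10
AlMasry | 0.45 | 15 | 1.38 | 9
Itihad | 0.61 | 3 | 1.67 | 8
Wadi Degla | 0.49 | 10 | 1.75 | 7
Ahly | 0.62 | 2 | 1.78 | 6
Zamalek | 0.68 | 1 | 1.86 | 5
Ismaily | 0.60 | 4 | 1.88 | 4
Mokawleen | 0.56 | 6 | 1.89 | 3
Petrojet | 0.58 | 5 | 2.22 | 2
Smouha | 0.49 | 9 | 2.38 | 1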

This table takes a closer look at the results, team by team:

[Chart: TSR vs Points Per Game, Egyptian Premier League]

Points Per Game (PPG) are used because some teams have played only 7 games. Again, Gouna are not on the table due to a lack of shots data!

Observations:

The Top Six:

The most obvious observation from the table is that there is a clear top 6 in Egypt by TSR: Al Zamalek, Al Ahly, Ittihad of Alexandria, Ismaily, Petrojet and Arab Contractors.

Smouha:

The most surprising data point would be Smouha, who lead the league in PPG but are not part of these dominant six in TSR. This could mean 1 of 2 things:

1) Smouha have been lucky and will probably not keep up this league-leading form

2) Smouha take better quality shots than everyone else

Even if Smouha were to take better quality shots, the disparity seems too large and Smouha seems to be an ideal candidate for drop-off according to TSR!

The Mido Effect:

Another important observation is the dominance of Al Zamalek, the only club with a TSR above 0.65. It seems as though Zamalek have been performing well and have been a little unlucky, or perhaps they need to improve their shot quality!

Teams under the line have been relatively unlucky: Zamalek, Tala’a El-Gaish

Teams above the line have been relatively lucky: Smouha, Wadi Degla
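One way to reproduce this kind of lucky/unlucky split is sketched below. It fits an ordinary least-squares line of PPG against TSR using the 21 rows of the table in this article, then looks at how far each team sits above or below that line. This is my own assumption about how such a line can be drawn; the line in the chart may have been fitted differently, so treat the exact gaps as illustrative.

```python
from math import fsum

# (TSR, points per game) for each team, as listed in the table in this article.
TEAMS = {
    "ElMinya": (0.35, 0.50), "GhazlMahala": (0.41, 0.63),
    "ElGeish": (0.45, 0.63), "Entag": (0.40, 0.78),
    "Telephonat": (0.43, 0.86), "Raja": (0.47, 0.89),
    "Dakhleyah": (0.48, 1.00), "Qanah": (0.44, 1.11),
    "ENPI": (0.47, 1.11), "Haras-ElHdood": (0.44, 1.13),
    "Elshorta": (0.55, 1.25), "Makasa": (0.52, 1.34),
    "AlMasry": (0.45, 1.38), "Itihad": (0.61, 1.67),
    "Wadi Degla": (0.49, 1.75), "Ahly": (0.62, 1.78),
    "Zamalek": (0.68, 1.86), "Ismaily": (0.60, 1.88),
    "Mokawleen": (0.56, 1.89), "Petrojet": (0.58, 2.22),
    "Smouha": (0.49, 2.38),
}


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(list(TEAMS.values()))

# Positive gap: more points per game than the fitted line expects for that TSR.
residuals = {
    team: ppg - (intercept + slope * tsr)
    for team, (tsr, ppg) in TEAMS.items()
}

for team, gap in sorted(residuals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team:<14}{gap:+.2f}")
```

Teams printed at the top are above the line (more points than their shots suggest) and teams at the bottom are below it.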

Predicting the Final Table:

Using the formula Grayson has for TSR correlation with points (instead of our formula, due to the fact that he has substantially more data), I built a model of what the final Egyptian Premier League table would look like, taking into account how many games each team has left:

Team | Points Now | Expected Points | Total Points
Zamalek | 13 | 28.68 | 41.7
Petrojet | 20 | 18.61 | 38.6
Ahly | 16 | 21.09 | 37.1
Ismaily | 15 | 21.69 | 36.7
Itihad | 15 | 20.52 | 35.5
Mokawleen | 17 | 17.87 | 34.9
Smouha | 19 | 15.43 | 34.4
Wadi Degla | 14 | 15.38 | 29.4
ElShorta | 10 | 19.03 | 29.0
Makasa | 12 | 15.85 | 27.9
AlMasry | 11 | 13.12 | 24.1
ENPI | 10 | 13.04 | 23
Dakhleyah | 9 | 13.45 | 22.5
Haras El Hedood | 9 | 12.70 | 21.7
Qanah | 10 | 11.48 | 21.5
Raja | 8 | 13.35 | 21.4
Telephonat | 6 | 13.11 | 19.1
ElGeish | 5 | 13.14 | 18.1
Entag | 7 | 9.43 | 16.4
GhazlMahala | 5 | 10.64 | 15.6
ElMinya | 4 | 7.35 | 11.4
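The last column is simply points already won plus the points the model expects from the remaining games. Here is a short sketch of that final step for the top seven rows, using the numbers from the table. The step that turns TSR into expected points relies on Grayson’s formula, which is not reproduced in this article, so it is not attempted here.

```python
# (points now, expected points from remaining games), top seven rows of the table.
PROJECTION = {
    "Zamalek": (13, 28.68),
    "Petrojet": (20, 18.61),
    "Ahly": (16, 21.09),
    "Ismaily": (15, 21.69),
    "Itihad": (15, 20.52),
    "Mokawleen": (17, 17.87),
    "Smouha": (19, 15.43),
}

# Projected final total = points already won + expected future points.
totals = {team: now + expected for team, (now, expected) in PROJECTION.items()}

# Highest projected total first.
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

for position, (team, total) in enumerate(ranked, start=1):
    print(f"{position}. {team:<10}{total:.1f}")
```

Note how Smouha start with the second-highest current total of these seven but drop to the bottom of the group once the expected points are added.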

As you can see, this table does not take into account the fact that the league is actually split into two groups, nor does the model account for strength of schedule. Furthermore, the margins between success and failure are extremely small.

The model has Zamalek and Petrojet qualifying from Group 2, and Itihad rebounding and qualifying with Al Ahly in Group 1. The model has Smouha collecting around 15 points from its last 14 games, and missing the playoffs due to their statistical weakness!  Zamalek & Ismaily capitalize on their game-in-hand and TSR dominance and collect 28.7 & 21.7 points respectively in their last 13 matches!

The playoffs would be:

Zamalek vs Itihad

Al Ahly vs Petrojet

TSR is actually terrible at predicting single games, and so we won’t use it to predict who the final winner is!

Ismaily get the short end of the stick here. Despite finishing with more points than Itihad, they are in an extremely competitive Group 2 and miss out by around 2 points. As for Group 1, Arab Contractors have a real shot, as the model has them missing out by only a point!

As you can tell, the margins are tiny so I shouldn’t be held accountable for any results not falling the way I predicted them to! TSR is still a limited tool, but we’ll check back at the end of the season to see how accurate our model was!

Leave your predictions for the end of season table in the comments & join the discussion! I apologize if there was too much math for your taste! 

5 Comments

  1. Houssam Gooner Elokda

    February 22, 2014 at 1:09 PM

    Yes, Zamalek are going to beat you, Ahly fans

  2. Mohamed

    February 22, 2014 at 5:02 PM

    Great article, this is better than 50 year olds predicting scores on TV just because they once played football, at least it’s supported by data.

  3. NK

    February 22, 2014 at 8:09 PM

    The breakdown in this article is great! Was able to keep up, disregarding my total lack of mathematical ability

  4. Omar El Far

    February 22, 2014 at 9:34 PM

    If you think that’s an indication for what’s gonna happen next then I can see no difference between you and people you are making fun of who predict data on TV or at least you’ve never played organized football…

  5. lolz

    February 27, 2014 at 10:33 AM

    great job keep it up
