If Super Rugby is to become and remain the best club rugby competition in the world, SANZAAR’s referee quality assessment must lead the world.

Evaluating referee performance is complex. Rugby union’s laws at the breakdown, set piece, and even in general play are replete with split-second adjudications.
As in all of sport, a ‘home field advantage’ persists which may never be mitigated, even if we designed a referee robot immune to social pressures.
Notably, rugby does not have as much of a home field win advantage as most other sporting codes. Prior studies of European football leagues, American college basketball, and rugby (league and union) suffered from the conflation of disproportionate home wins (to be expected, even if referees were infallibly fair) with ‘referee bias’, a loaded term.
This article, by The Roar’s Carlos the Argie and yours truly, seeks a different path. We agreed that the awarding of penalties is a relatively unfettered referee power. We wondered if we could learn anything about the Super Rugby’s referees’ performance by studying one season’s penalty count in detail.
Any statistical analysis of rugby referees is just a start to a conversation. This article is not intended to prove ultimate conclusions about referees. Referees can only judge what is played. Fights happen. Scrums can crumble. A team can decide to play negatively, for good reason.
Nevertheless, a study of 125 games, adjudicated by a relatively even and constant number of referees who know and are well known by the players of three countries well-versed in each other’s styles, may begin a useful dialogue to improve Super Rugby.
Harry: I compiled a table of all 125 games of Super Rugby in 2015, including home team, away team, respective penalty counts, and the name and origin of the referee. We chose 2015 because the national origins of the teams and referees were more evenly distributed. I gathered my numbers from official statistics, compared them to game day reports, and read newspaper articles from the winning and losing teams’ hometowns. After triple-checking the table game by game, I sent it to Carlos.
Carlos: I compared the penalty rate for home teams to that of away teams. I used the t-statistic, with a two-sided distribution and assumed unequal variances; a conservative model. This may be a bit pedantic for the Roarers but it is important to disclose the methods we used.
The average penalty rate against home teams for the entire tournament was 9.136 penalties per game, while that against away teams was 10.192. The difference is highly significant, with a probability that the difference is due to chance of only 0.28 percent.
Using percentage of penalties, the home team only suffered from 47.54 percent of all penalties. The confidence interval does not cross the 50 percent mark, which again sustains the significance of the result.
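For readers who want to see the mechanics, here is a minimal sketch of the unequal-variance (Welch) t-statistic Carlos describes. It is not the authors' actual calculation: the full 125-game dataset is not reproduced in this article, so the sketch uses only the eight Berry matches listed further down, and the function name `welch_t` is made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Penalties against the home and away sides in the eight Berry matches
# listed later in this article (an illustrative subset, NOT the full 125 games).
home = [11, 11, 4, 8, 5, 6, 11, 11]
away = [12, 8, 14, 10, 17, 14, 4, 5]

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Hypothetical helper: the two samples are NOT assumed to share a variance.
    """
    va = variance(a) / len(a)  # sample variance of a, divided by its size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, dof = welch_t(home, away)
print(f"mean difference (home - away) = {mean(home) - mean(away):.3f}")
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The two-sided probability then comes from a t-distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(home, away, equal_var=False)` runs the same test and returns that probability directly.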
So, Harry, home teams get on average one more penalty per game. Do you think this is fair or bias? Is this the result of bias or game strategy?
Harry: I was unimpressed. But I’m just a rugby poet.
When Carlos told me the home team gets an average of a one-penalty edge on the visitors over the course of a season, I thought about the Stormers in 2015.
In 2015, the Stormers had a scrum that simply held up better under examination than any other. Penalties at scrum time are rife. Many referees feel the way to encourage a stable scrum platform is to ping the weaker scrum.
Carlos: I had to educate Harry on a few statistical definitions, but I was inclined to agree with him (that many factors lead to lopsided penalty counts) and we did not rush to any sort of hypothesis or conclusion. Instead, we mined more numbers.
On average, there were 19.3 penalties per game in 2015 Super Rugby. We decided to look for ‘outliers’. We agreed outlier matches two standard deviations from the mean were (and this is a technical term) ‘quirky’.
If penalty counts are roughly normally distributed, only about five percent of games will be quirky by this definition. Thus, matches with more than 26 or fewer than 13 penalties were games of interest.
Another way of looking at this was by observing the outliers in the percentage distribution of penalties. In this scenario, any match with 70 percent of penalties favouring the home team or less than 25 percent favouring the home team was seen as quirky.
A third way to look at outliers was at the difference in the number of penalties, which would give results similar to the percentages.
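The cut-offs above can be applied mechanically, as in this rough sketch. The thresholds are the ones stated in the text. The function name is hypothetical, and the share is taken to mean penalties against the home team divided by all penalties, which is an assumption based on how the Penalty % columns in the tables below work out. The four sample matches are ones mentioned elsewhere in this article.

```python
# Cut-offs as stated in the text.
MAX_TOTAL, MIN_TOTAL = 26, 13
MAX_HOME_SHARE, MIN_HOME_SHARE = 0.70, 0.25

def quirky_reasons(home_pen, away_pen):
    """Return the reasons a match counts as 'quirky' (hypothetical helper).

    home_pen / away_pen are penalties AGAINST each side; assumes at least
    one penalty was awarded in the match.
    """
    total = home_pen + away_pen
    share = home_pen / total  # assumed definition of the home share
    reasons = []
    if total > MAX_TOTAL or total < MIN_TOTAL:
        reasons.append("total")
    if share > MAX_HOME_SHARE or share < MIN_HOME_SHARE:
        reasons.append("share")
    return reasons

# Matches quoted in this article. The Force are assumed to be the home side
# in the Force-Stormers game, based on Harry's remark about it.
matches = [
    ("Bulls", 10, "Force", 22),
    ("Sharks", 4, "Crusaders", 14),
    ("Force", 5, "Stormers", 7),
    ("Lions", 8, "Sharks", 10),
]

for home, hp, away, ap in matches:
    print(f"{home} v {away}: {quirky_reasons(hp, ap) or 'not quirky'}")
```

A match can be quirky on one measure and ordinary on the other, which is why the article looks at both the total count and the split.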
But Harry, is it acceptable that the range of penalties from the average could be as high as nine per match? I would have expected a narrower range for this.
Harry: I was fine with the approach of first finding the quirky games and then finding the refs in the quirky games, because I expected quirky games to create quirky results.
Some rugby matches are just strange. Think of one Rugby World Cup semi-final in England last year: on as neutral a field as possible, Kieran Read was penalised by a French referee only one less time (five penalties for Read) than the entire Springbok team (six penalties). It was a strange game: 87 kicks from hand, heavy rain, and very little space.
The All Blacks simply refused to let the Boks develop any flow; they pulled a reverse trick by smothering South Africa. Give them three, but never seven. It worked. But the winning team were penalised seven more times than the loser.
So, I was just going to look at each game Carlos identified as quirky, and find the real life reason for the anomaly.
Carlos: I kept my opinions on a shelf. I was just trying to frame the issue for our study. And Harry is hard to manage! He kept sending exotic stats and rugby musings.
Back to our study: outliers did not equate to bias, and non-outliers were not necessarily non-biased. I was only trying to define a quirky game and find consensus with my co-author.
The following table shows the penalty count by team when playing at home or away. You will notice that most teams are penalised less when playing at home.
Because of the sample sizes, it is unlikely that any of these results are significant and they are most likely related to regression to the mean. However, you can observe that some teams are penalised less. Harry will tell you why.
Table 1 – Average penalty counts at home or away (from most penalised to least)
Team         Home    Away
Chiefs       11.125  12.556
Brumbies      9.750  10.700
Highlanders   9.556   9.800
Hurricanes    9.800   9.375
Crusaders    11.125   9.750
Bulls        10.250  10.375
Cheetahs      8.625  11.500
Reds          9.625  10.250
Waratahs      7.667  10.750
Force         8.500  10.875
Blues         9.375   9.875
Sharks        8.375  10.500
Lions         8.625   9.125
Rebels        7.250   9.250
Stormers      7.556   7.875
 
Harry: Stormer prop Steven Kitshoff was only penalised six times in the 2015 season. Playing fewer minutes, Ben Franks (22), Reg Goodes (21), and Benn Robinson (19) were penalty-producing machines for opposition teams. I assumed that Kitshoff scrummed better at Newlands, on turf he knows better, after he slept in his own bed the night before, and ate his mother’s porridge.
Also, the 2015 Stormers held the ball in the scrum to milk the penalty, scored more penalties than any other team, and conceded the fewest.
So, if a team plays for penalties, I assumed they executed that strategy better at home; and if a team had problems with infringement, I assumed they would infringe more on the road, in front of a hostile crowd cheering for their failure.
On the other hand, a high-penalty team like the Chiefs, who led the penalties conceded list in 2015 and are only behind the Sunwolves this year, treats penalties like a power forward in the NBA treats his six fouls: use them all, get your money’s worth.
In 2015, four of the top ten penalty-conceding players were Chiefs, and three of those were loosies (in a category normally dominated by props).
The Chiefs make a mess of the breakdown, while the fetcher-free Stormers stand off from the rucks. This is a philosophical choice, and both good and weak refs will react, but it is unlikely the Stormers in 2015 were ever going to be highly penalised week in and week out, and the Chiefs were fine with it, as long as their opponents had messy rucks and gave up two or three turnovers a game for the Chiefs’ elusive backs to exploit.
There is more than one reason a team concedes a high rate of penalties, of course. An overpowered team like the Sunwolves is conceding the most in 2016 Super Rugby, but it’s definitely not their game plan. As Tolstoy wrote in Anna Karenina, “All happy families are alike; each unhappy family is unhappy in its own way.”
Carlos: Whatever. Basically, Harry ignores statistics and rationalises everything. That’s why he feels at home on The Roar!
Harry: Actually, I saw the points made, and I knew Carlos was on to something here.
Carlos: We then looked at the individual referees. Given the average counts by team, it would be easy to assume that the distribution by referee would be similar. There were only 15 referees (Steve Walsh retired during the season and the other referees had similar workloads).
But there are important differences.
Table 2 – Average Penalty Counts by referee against home or away teams
Referee             Home    Away    Total pen count  % Pen  Pen difference
Berry (SA)           8.375  10.500  18.875           0.459  -2.125
Joubert (SA)         9.111  10.778  19.889           0.452  -1.667
Briant (NZ)          9.250  12.167  21.417           0.436  -2.917
Fraser (NZ)          9.222   9.444  18.667           0.495  -0.222
Gardner (OZ)         7.909   9.545  17.455           0.462  -1.636
Hoffmann (OZ)        8.300  10.400  18.700           0.442  -2.100
Jackson (NZ)         9.818  10.818  20.636           0.480  -1.000
Lees (OZ)            9.400   9.200  18.600           0.510   0.200
O’Brien (OZ)        11.500  10.250  21.750           0.528   1.250
O’Keeffe (NZ)       11.000   9.400  20.400           0.542   1.600
Peyper (SA)          8.231  10.615  18.846           0.436  -2.385
Pollock (NZ)         9.444   9.444  18.889           0.506   0.000
van Heerden (SA)    10.000   9.833  19.833           0.506   0.167
vd Westhuizen (SA)   8.800   9.000  17.800           0.493  -0.200
Walsh (OZ)          10.000   9.000  19.000           0.525   1.000

 
There are a couple of figures, even within the averages, that should be of interest to everyone.
One is the penalty difference per match by referee. The average for the entire cohort was a difference of 1.1 penalties per match.
Nick Briant (New Zealand), Jaco Peyper (South Africa) and Stuart Berry (South Africa) had the highest differential, well above two penalties per game. In 2015, a home team refereed by any of them could have reasonably expected a two-penalty advantage; a significant plus, in most games.
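The ranking can be reproduced from the home and away averages in Table 2. The snippet below is only a sketch of that arithmetic, with the averages copied from the table. Because those averages are rounded to three decimals, a recomputed difference can be off by 0.001 from the printed column.

```python
# (home average, away average) per referee, copied from Table 2.
table2 = {
    "Berry": (8.375, 10.500),
    "Joubert": (9.111, 10.778),
    "Briant": (9.250, 12.167),
    "Fraser": (9.222, 9.444),
    "Gardner": (7.909, 9.545),
    "Hoffmann": (8.300, 10.400),
    "Jackson": (9.818, 10.818),
    "Lees": (9.400, 9.200),
    "O'Brien": (11.500, 10.250),
    "O'Keeffe": (11.000, 9.400),
    "Peyper": (8.231, 10.615),
    "Pollock": (9.444, 9.444),
    "van Heerden": (10.000, 9.833),
    "vd Westhuizen": (8.800, 9.000),
    "Walsh": (10.000, 9.000),
}

# Negative difference: the home side conceded fewer penalties than the visitors.
diffs = {ref: round(h - a, 3) for ref, (h, a) in table2.items()}

# Sort from the most home-friendly differential to the most away-friendly.
ranked = sorted(diffs.items(), key=lambda item: item[1])

for ref, diff in ranked:
    print(f"{ref:<14} {diff:+.3f}")
```

Sorted this way, Briant, Peyper and Berry head the list, with Hoffmann close behind Berry, and O’Keeffe sits at the other end with the largest differential against home teams.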
Harry, a good referee should have very few outliers per season. Statistically, it is safe to assume that most referees would have outliers but that the number of outliers per ref should be small too.
Harry: I agree. But by nature I am skeptical. Also, even though I accept the premise that a referee should not be on the aforementioned list (this game is not all about referees, a fact some of them forget), I thought Berry was mostly just out of his depth.
I remembered a Stormers-Rebels game in Cape Town in 2015 where the Rebels were only penalised by Berry four times (and the Stormers – the least-penalised team in the competition – were penalised 11 times).
Berry loses control of matches at times, is tentative in his rulings at first, and then is too stubborn when things go pear-shaped. This is a poor combination for a referee (they should run a tight ship at first, and then allow play to flow later).
So, I was open to the facts.
Carlos: Following our philosophy of finding the quirky games and referees prone to quirkiness, I found that the highest penalty counts per game were:
1. Nick Briant with 32 in a Bulls-Force game (ten against the Bulls, 22 against the Force)
2. Kiwi Glen Jackson with 29 (12 against the Reds, 17 against the Waratahs)
3. Rohan Hoffmann with 26 (13 against the Force and 14 against the Waratahs).
The lowest count in a match was 12 penalties for Fraser (five against the Force and seven against the Stormers).
Harry: The Force (at home) and the Stormers (anywhere) in 2015 were trying not to play much rugby, not concede penalties, and generally bore fans to death. But then what about that Bulls-Force game? I couldn’t explain that one away. It’s super quirky!
Carlos: Certain referees’ names started cropping up a lot as we drilled down on the outliers. For instance, in unusual penalty distribution, we see the names of Messrs Berry, Craig Joubert and Andrew Lees multiple times.
Table 3 – Matches with unusual penalty distribution
Home        Pen  Away         Pen  Referee        Penalty %
Sharks        4  Crusaders     14  Berry (SA)     0.222
Lions         3  Highlanders   10  Joubert (SA)   0.231
Stormers      5  Brumbies      17  Berry (SA)     0.227
Stormers      5  Cheetahs      16  Joubert (SA)   0.238
Stormers      3  Brumbies      13  Peyper (SA)    0.188
Bulls        12  Hurricanes     6  Lees (OZ)      0.667
Reds         12  Brumbies       6  Lees (OZ)      0.667
Chiefs       17  Cheetahs       8  Joubert (SA)   0.680
Hurricanes   14  Crusaders      6  Fraser (NZ)    0.700
Cheetahs     12  Stormers       5  Joubert (SA)   0.706
Stormers     11  Rebels         4  Berry (SA)     0.733
Crusaders    13  Hurricanes     6  O’Keeffe (NZ)  0.684
Stormers     11  Lions          5  Berry (SA)     0.688
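A quick tally of the referee column in Table 3 shows how often each name recurs. This is just a counting sketch over the thirteen rows as printed, not an extra analysis by the authors.

```python
from collections import Counter

# Referee for each of the thirteen matches in Table 3, in table order.
table3_referees = [
    "Berry", "Joubert", "Berry", "Joubert", "Peyper",
    "Lees", "Lees", "Joubert", "Fraser", "Joubert",
    "Berry", "O'Keeffe", "Berry",
]

counts = Counter(table3_referees)

# Print referees from most to least frequent in the table.
for referee, n in counts.most_common():
    print(f"{referee:<10} {n}")
```

Berry and Joubert each appear four times and Lees twice, while Peyper, Fraser and O’Keeffe appear once each.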

 
A slightly different way of looking at this information is by the outliers in penalty differential. You will see that it overlaps significantly with the prior table.
Table 4 – Largest Penalty Differential between teams
Home        Pen  Away       Pen  Referee       Difference  Penalty %
Chiefs       17  Cheetahs     8  Joubert (SA)    9         0.680
Hurricanes   14  Crusaders    6  Fraser (NZ)     8         0.700
Brumbies      8  Reds        17  Gardner (OZ)   -9         0.320
Stormers      8  Blues       17  Joubert (SA)   -9         0.320
Cheetahs      6  Sharks      15  Pollock (NZ)   -9         0.286
Bulls        10  Force       22  Briant (NZ)   -12         0.313
Sharks        5  Chiefs      15  Gardner (OZ)  -10         0.250
Sharks        4  Crusaders   14  Berry (SA)    -10         0.222
Stormers      5  Brumbies    17  Berry (SA)    -12         0.227
Stormers      5  Cheetahs    16  Joubert (SA)  -11         0.238
Reds          8  Chiefs      17  Fraser (NZ)    -9         0.320
Stormers      3  Brumbies    13  Peyper (SA)   -10         0.188

The first match in this table is interesting. The local team, the Chiefs, got penalised nine more times than the visiting Cheetahs, and the referee was Joubert.
Harry: Well, that game in Hamilton was an ill-tempered affair, played on a greasy field. Quirky games can result in quirky penalty counts. So, in this game, the Chiefs scored in two minutes through some sublime play by Sonny Bill Williams on his return, ending in a ten-phase try by Michael Leitch, and before 20:00 Liam Messam scored again right up the middle.
The Chiefs, up by two tries, started to niggle and chatter, and the Cheetahs were all too ready to lose focus (they are the attention-deficit champs of South Africa).
The Cheetahs mauled and scrummed their way back in, with Ben Tameifuna suffering a leg injury and the Chiefs having no answer to 2015-style mauls by the visitors. Tameifuna and Messam were yellow carded in a first half with 12 Chiefs penalties.
The game was a tale of two halves – penalty-wise – because the only yellow card in the second half went to Heinrich Brussow, for serial ruck infringements, and the Cheetahs were pinged five times at scrum time. With Brussow out, the Chiefs scored two quick tries to take the cake.
I think if you watch the match, the decisions were sound, although Joubert can get a little fussy with the maul.
In fact, I think all our quirky games are explicable when reviewed, except for Berry’s, and even some of his are examples of strange play by the players. But I do think some of his quirky games are truly outside normal.
Carlos: I mentioned before that any particular referee should have only a few outliers. I would suspect that a referee with many outliers has match control problems. I am sure you have reasons why each case is unique and should be discarded.
But as I tell my kids, when one teacher hates you in school, the teacher is a problem. When all teachers hate you, you are the problem. So, to have one outlier is not a problem, to have many outliers, you are the problem!
Look at Berry’s matches:
Home        Pen  Away         Pen  Difference  Ref neutral or non-neutral  Match penalty count  Penalty %
Cheetahs     11  Bulls         12   -1         N                           23                   0.478
Hurricanes   11  Rebels         8    3         N                           19                   0.579
Sharks        4  Crusaders     14  -10         NN                          18                   0.222
Lions         8  Sharks        10   -2         N                           18                   0.444
Stormers      5  Brumbies      17  -12         NN                          22                   0.227
Cheetahs      6  Highlanders   14   -8         NN                          20                   0.300
Stormers     11  Rebels         4    7         NN                          15                   0.733
Stormers     11  Lions          5    6         N                           16                   0.688
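As a cross-check, Berry's row in Table 2 can be recomputed from these eight matches. The sketch below assumes only the numbers printed in the table above. It also shows that the home share in Table 2 matches the mean of the per-match shares rather than the pooled share of all penalties.

```python
from statistics import mean

# (penalties against home, penalties against away) for Berry's eight matches.
berry = [(11, 12), (11, 8), (4, 14), (8, 10), (5, 17), (6, 14), (11, 4), (11, 5)]

home_avg = mean(h for h, _ in berry)
away_avg = mean(a for _, a in berry)
total_avg = home_avg + away_avg
difference = home_avg - away_avg

# Two ways to express the home share of penalties:
mean_of_shares = mean(h / (h + a) for h, a in berry)  # average the per-match shares
pooled_share = sum(h for h, _ in berry) / sum(h + a for h, a in berry)  # pool all penalties

print(f"home {home_avg:.3f}, away {away_avg:.3f}, total {total_avg:.3f}")
print(f"difference {difference:+.3f}")
print(f"mean of per-match shares {mean_of_shares:.3f}, pooled share {pooled_share:.3f}")
```

The averages, the difference and the mean of per-match shares reproduce Berry's Table 2 row (8.375, 10.500, 18.875, -2.125 and 0.459). The pooled share comes out lower, at about 0.444, so the two ways of computing a share should not be mixed when comparing referees.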

Harry: Yes. Berry finds trouble. That being said, Berry drew a few tough games. The most dramatic was the one where the Crusaders absolutely hammered the Sharks 52-10 (eight tries to one). The original referee scheduled, Marius van der Westhuizen, had to go to hospital with a burst appendix, and Berry filled in at short notice.
In games like that, the embarrassed side (particularly a physically abrasive team like the Sharks) being booed by their home fans, can become irate. Also, the ascendant team can lose control, too (settling old scores and coming up with the cutest chirps). Berry handed out four cards, including a straight red to Jean Deysel. They had to get more chairs for the sin bin.
The thing in that game was that the Saders went up 28-3 by 35:00 and then all hell broke loose. During the last five minutes of the half, there were at least three obvious shoulder charges by the visitors and a very dangerous (and stupid) knee to the head by Deysel, just when the Sharks might have clawed their way back into the game (15 versus 12).
I think all his card decisions were fair (the only one that was 50-50 was a card against Nemani Nadolo for kicking a ball away after a Sharks’ penalty). Of the penalties, almost all were non-controversial, except I think the Crusaders were a bit hard done by at the ruck (eight penalties) and the offside line (three times).
As to the other Berry games, the trend I see is that he tends to lose control of bad-tempered games. For instance, the Brumbies came to Cape Town with a clear plan to step up physically to the rugged Stormers and there were at least ten episodes of handbags, a tip tackle by Jordan Smiler on Schalk Burger, and many late hits. Still, if Christian Lealiifano had kicked the simplest of conversions over at the end, it would have been a 26-25 win for the visitors.
Also, Berry penalised the Stormers (at home) seven more times than the Rebels in Game 99 and I cannot see that his penalty decisions ever definitely gave a loss to a team that should have won in 2015.
But these numbers, allied with my own observation of the quirky games, cause me to believe that Berry has a big question mark on him and both Lees and Peyper might want to groove their practices more in line with their cohorts, or risk being seen as outliers.
Carlos: What can we conclude? We can only state unequivocally that home teams had a 1.1 penalty advantage over the visiting team in 2015.
Is this fair? Is this the result of different tactics when playing at home or away? Or is this home bias?
We cannot make those pronouncements. This analysis does not address cause, only relationships.
The other thing that we can conclude is there are a few referees that appear to have unusual penalty counts and/or distribution more often than most.
If I were a coach, I would not want one of those referees in my match.
Harry: But, Carlos, there’ll always be outliers, no? We’re just talking about degree, now? We want to get rid of outlandish referee decisions, but there’s no way to homogenise all these games and emotions and tactics and strange days at the office, no?
Carlos: What can or should we expect?
Again, can we have 70 per cent of matches finish with a penalty count differential of plus or minus 9? I personally think that at the Super Rugby level this range is way too large.
Most matches should be expected to have a narrower range. I also think that every referee will have an outlier match, and as you rationalised, they may have more than one. But if a referee has too many outliers, then something is not right.
Finally, do we accept that home teams do get one more penalty favouring them per game? What do you do with referees that consistently have a three-penalty difference?
Harry: Great work, Carlos. My mind is totally open to the facts; I just make sure things run through a gauntlet of skepticism first. What do all you Roarers think? How would you answer Carlos’ questions?