Earlier this year, the NFL set a record for most accepted penalties through week 3. There has been speculation that this increase in penalties is due to fewer full-contact practices as mandated in the CBA, and further speculation that the refs are idiots that can’t manage a game clock and have no idea what a catch is.
I know that I’ve done my fair share of bitching about officiating this season, but I also recall complaining about calls every other season. This led me to wonder, is this year really any different? I know, I can use the power of STATISTICS!!
First, a round of applause (or applesauce if you prefer) and acknowledgment to the fine folk at NFL Savant for compiling play by play data and providing it free of charge to anyone with the skills to use a mouse.
This post will compare penalties from weeks 1-8 of the 2014 season to penalties from weeks 1-8 of this season. Rulings as to what constitutes a catch, and the refs’ attempts to travel through time using the game clock as a Tardis, are not covered. This is ALL ABOUT THE FLAGS BABY!
Now, ONWARDS! TO THE DATA!
|2014 (weeks 1-8)||2015 (weeks 1-8)|
|Games : 121||Games : 119|
|Plays : 21736||Plays : 21678|
|Penalties : 1925||Penalties : 1971|
|Plays per Game : 179.6364||Plays per Game : 182.1681|
|Penalties per Play : 0.08856275||Penalties per Play : 0.09092167|
At first things don’t look so hot. With fewer games and fewer plays than last year, there are already 46 more penalties. People frequently look at the number of penalties per game, but imo that is a dumb thing to do. Penalties are assessed on a per-play basis. During (or after) any given play, a player either does or does not get called for a penalty. Therefore it makes more sense to look at penalties per play, or the percentage chance of a penalty being called on (or after) any particular play. Since games do not all have the same number of plays, looking at penalties per game is not an apples-to-apples comparison.
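To make the per-game versus per-play distinction concrete, here is a minimal sketch that recomputes the rates from the season totals in the table above. The dictionary layout and the `rates` helper are just illustrative names, not anything from the NFL Savant data.

```python
# Season-to-date totals copied from the summary table (weeks 1-8).
totals = {
    2014: {"games": 121, "plays": 21736, "penalties": 1925},
    2015: {"games": 119, "plays": 21678, "penalties": 1971},
}

def rates(season):
    """Return (plays per game, penalties per game, penalties per play)."""
    t = totals[season]
    return (
        t["plays"] / t["games"],
        t["penalties"] / t["games"],
        t["penalties"] / t["plays"],
    )

for season in sorted(totals):
    plays_pg, pens_pg, pens_pp = rates(season)
    print(f"{season}: {plays_pg:.2f} plays/game, "
          f"{pens_pg:.2f} penalties/game, {pens_pp:.4f} penalties/play")
```

The per-game number folds in how many plays each game happened to have, while the per-play number does not, which is why the rest of the post sticks with penalties per play.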
Looking at penalties per play evens things out a bit. There are more plays per game this year. This could be due to more teams using a hurry-up offense, but it could also be due to more teams repeating downs due to penalties. Whatever the cause of the increase, more plays gives more opportunities for penalties to be called. (I should note that a play is listed twice in the data if two penalties are called on the same play. This could bias the data, but given the infrequent nature of multi-penalty plays I’m ignoring it for now.) The penalties per play numbers for the two years look pretty close. Given the large number of plays, is there a significant difference between the two numbers? For the craps-playing degenerates here, the chance of a penalty being called on a play is about the same as making a 6 or 8 the hard way.
The standard tool for comparing summary statistics is called a t-test. Technically, it’s called the Student’s t-test, named after a guy who wrote under the pseudonym “Student.” It’s a way to test simple hypotheses about the data. Here the hypothesis that we are testing is that the difference between the 2014 value of penalties per play and the 2015 value is 0.
Expand for more nerdy nerdness, you nerd
I mentioned earlier that the penalties per play number can be considered the probability of a penalty being called on any particular play. Thinking about it this way means the penalty data will be best fit using a Bernoulli distribution, with the sample value of penalties per play as the event probability. If you are remembering your Stats 101 class, you may think that we can’t use a t-test here, since the data is not normally distributed. We don’t actually need the data to be normal to apply a t-test, rather we need the value of the statistic we are testing to follow a normal distribution. Since we have a very large sample size here, the penalty probability value we are looking at will follow an approximate normal distribution.
Running a Welch Two Sample t-test on the penalty data for each year gives a p-value of 0.3899, which basically means we can’t reject the idea that the probability of a penalty is the same in both years. That there is weaselly statistics talk for “Same old shit, this year and last.” In general, people look for a 95% confidence level when making comparisons like this. The p-value works backwards from what you might think: it’s the chance of seeing a gap at least this big if the two years really had the same penalty rate, so we would want to see a p-value of <0.05 before considering the difference to be statistically significant.
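For anyone who wants to poke at that number, here is a rough sketch of a Welch-style test computed straight from the season totals, treating every play as a 0/1 outcome (flag or no flag). It is not necessarily how the original analysis was run: the function name is made up, and the p-value uses a normal approximation in place of the t distribution, which is an assumption that only holds up because there are tens of thousands of plays.

```python
import math

def welch_t_from_counts(k1, n1, k2, n2):
    """Welch t statistic and approximate two-sided p-value for two 0/1 samples.

    k = plays with a flag, n = total plays. For 0/1 data the sample mean is
    k/n and the unbiased sample variance is p * (1 - p) * n / (n - 1).
    """
    p1, p2 = k1 / n1, k2 / n2
    var1 = p1 * (1 - p1) * n1 / (n1 - 1)
    var2 = p2 * (1 - p2) * n2 / (n2 - 1)
    se = math.sqrt(var1 / n1 + var2 / n2)  # unpooled (Welch) standard error
    t = (p2 - p1) / se
    # Assumption: with tens of thousands of degrees of freedom the t
    # distribution is effectively standard normal, so the normal tail
    # stands in for it. erfc(|t| / sqrt(2)) is the two-sided normal p-value.
    p_value = math.erfc(abs(t) / math.sqrt(2))
    return t, p_value

# 2014: 1925 penalties on 21736 plays; 2015: 1971 penalties on 21678 plays.
t_stat, p_value = welch_t_from_counts(1925, 21736, 1971, 21678)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

Under these assumptions the p-value lands at about 0.39, in line with the figure quoted above and nowhere near the 0.05 cutoff.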
“But Zymm! It’s not just about the overall number of penalties, what about penalty yardage? What about the types of penalties?”
Excellent points Other Zymm! Let’s take a deeper look, shall we?
Previously, I was only looking at whether a penalty was called, without considering whether or not it was accepted. When looking at yardage, we limit ourselves to looking only at accepted penalties, as no penalty yards are assessed if the penalty is declined. It turns out this doesn’t really matter, as penalties were declined at basically the same rate both years. At this point last year, 13522 penalty yards were assessed for an average of 8.034462 yards per penalty. So far this year there have been 14254 penalty yards assessed for an average of 8.229792 yards per penalty.
Super-Secret Made-up Bonus Statistic!
If we assume that the refs are awarded a touchdown for every 100 yards they assess in penalties, the 2015 officials are leading the 2014 officials 994-945. The 2014 officials are in field goal range, but they’ll need to make some halftime adjustments if they want to win this!
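The scoreboard math can be checked with a tiny sketch. The 7 points per touchdown is an assumption (every extra point is good), chosen because it is what makes the totals come out to 994 and 945, and `ref_score` is a made-up name.

```python
def ref_score(penalty_yards, yards_per_td=100, points_per_td=7):
    """Points for the officials: one touchdown per full 100 yards assessed."""
    touchdowns = penalty_yards // yards_per_td  # leftover yards don't score
    return touchdowns * points_per_td

print("2014 officials:", ref_score(13522))
print("2015 officials:", ref_score(14254))
```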
Due to the way penalty yards are assigned, it’s a little more difficult to compare yards/penalty year over year. Instead, I looked at the portion of penalties over 5 yards, over 10 yards, and over 15 yards (so basically pass interference calls only). There was no significant difference between any of these groups. This kinda makes me think that there won’t be a huge difference in the types of calls either. Have I been going “WTF IS UP WITH ALL THESE OFFENSIVE PI CALLS!?” unnecessarily all year?
|Penalty||2014||2015|
|BLOCKED INTO PUNTER||1||NA|
|DEFENSIVE 12 ON-FIELD||25||28|
|DEFENSIVE DELAY OF GAME||0||2|
|DEFENSIVE PASS INTERFERENCE||124||117|
|DELAY OF GAME||79||75|
|FACE MASK (15 YARDS)||40||46|
|FAIR CATCH INTERFERENCE||1||4|
|HORSE COLLAR TACKLE||9||6|
|ILLEGAL BLINDSIDE BLOCK||4||5|
|ILLEGAL BLOCK ABOVE THE WAIST||44||66|
|ILLEGAL FORWARD PASS||3||2|
|ILLEGAL TOUCH KICK||1||NA|
|ILLEGAL TOUCH PASS||3||5|
|ILLEGAL USE OF HANDS||119||86|
|INELIGIBLE DOWNFIELD KICK||2||2|
|INELIGIBLE DOWNFIELD PASS||10||15|
|INTERFERENCE WITH OPPORTUNITY TO CATCH||1||NA|
|INVALID FAIR CATCH SIGNAL||0||1|
|NEUTRAL ZONE INFRACTION||56||67|
|OFFENSIVE 12 ON-FIELD||6||4|
|OFFENSIVE PASS INTERFERENCE||59||65|
|OFFSIDE ON FREE KICK||12||8|
|PLAYER OUT OF BOUNDS ON PUNT||3||11|
|ROUGHING THE KICKER||2||1|
|ROUGHING THE PASSER||47||54|
|RUNNING INTO THE KICKER||5||9|
There are some data issues here. The main one is the “Personal Foul” category in 2014. We can probably assume these are all unnecessary roughness calls. There are also a fair number of penalties that are only called a handful of times, which we can’t really do much with. So while it’s anecdotally interesting that there have been almost 4x as many “Player out of bounds on the punt” calls this year, there’s not really much we can say about that.
Methodology and Sample Size notes. Exciting!
When comparing the types of penalties called, we’re getting much more specific, so our sample size is decreasing. There are two factors to consider when deciding if your sample size is sufficient to confidently use a t-test: the overall number of observations and the frequency of the event. For the more common penalties, we can continue to use a t-test, though for the less common penalties we can’t assume the distribution is close enough to normal to use the t-test. In this case, the events will follow a Poisson distribution. We can still compare the ratio of two events, and test the hypothesis that the ratio is 1 (i.e. that the events occur at the same rate), but now we’ll be using an exact test comparing our test statistic with the binomial distribution.
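Here is a minimal sketch of that kind of exact test, assuming the usual conditional setup: if both years have the same per-play rate, then given the combined count, the 2014 count is binomial with success probability equal to 2014’s share of the plays. The function names are made up, and the two-sided p-value uses the “double the smaller tail” convention, which is only one of several in use and may differ slightly from what a given stats package reports.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def rate_ratio_exact_test(k1, n1, k2, n2):
    """Two-sided exact p-value for H0: both years share one per-play rate.

    k1, k2 are penalty counts; n1, n2 are the plays (exposure) in each year.
    """
    total = k1 + k2
    p0 = n1 / (n1 + n2)  # year 1's share of all plays
    lower = sum(binom_pmf(i, total, p0) for i in range(0, k1 + 1))
    upper = sum(binom_pmf(i, total, p0) for i in range(k1, total + 1))
    # Convention (an assumption): double the smaller tail, capped at 1.
    return min(1.0, 2 * min(lower, upper))

# Horse collar tackles: 9 in 2014 (21736 plays) vs 6 in 2015 (21678 plays).
p_value = rate_ratio_exact_test(9, 21736, 6, 21678)
print(f"horse collar p-value: {p_value:.3f}")
```

With counts this small the p-value comes out far above 0.05, which is the formal version of “we can’t really do much with” the rare penalties.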
Let’s look at some of the more common calls. It’ll probably surprise no one that the most common call is offensive holding. Just eyeballing it, it appears there are quite a few more offensive holding calls this year, oddly enough, counterbalanced by fewer defensive holding calls. Surprisingly enough, the difference in offensive holding isn’t significant, but the defensive holding difference is! So, our first significant result is that the refs are calling defensive holding less frequently than they did last year. Why? Who the hell knows, I don’t have a theory on that one.
There’s no significant difference in offensive or defensive pass interference calls, which is probably good news, since these particular calls usually have a pretty large impact on a drive.
There are really only two other penalties that show a significant difference from last year. Defensive offside calls have significantly increased this year, so maybe there’s something to all that “Aaron Rodgers is a genius with his hard count, blah blah blah” stuff, though I’m too lazy to actually go through and look at offside by team. The other one is the decrease in Illegal Use of Hands calls. Feel free to speculate on the reason for that one.
There’s no significant change in Roughing the Passer penalties, so despite all those “OMG, he touched the QB’s helmet, throw ALL THE FLAGS” calls, they were doing that last year too.
Last but not least, a quick break down of when penalties are called, by quarter and down.
|2014||2015|
|Q1 : 408||Q1 : 412|
|Q2 : 564||Q2 : 586|
|Q3 : 463||Q3 : 460|
|Q4 : 483||Q4 : 503|
|OT : 7||OT : 10|
|No Down (Kickoffs, Extra Points) : 86||No Down (Kickoffs, Extra Points) : 78|
|1st Down : 625||1st Down : 662|
|2nd Down : 494||2nd Down : 498|
|3rd Down : 478||3rd Down : 469|
|4th Down : 242||4th Down : 264|
The main thing I find interesting here is the data for the second quarter. There are significantly more penalties called in the second quarter than in any other quarter. My interpretation: the second quarter is frequently the most competitive part of the game. It’s rare that a team is totally out of it by the half, but the urge to keep the score close going into halftime might lead to more bending of the rules. The same pattern doesn’t emerge in the 4th quarter due to garbage time.
I’m not really going to go into the down data. The large number for 1st down is a bit misleading, as there are more 1st-down plays than 2nd-, 3rd- or 4th-down plays.
So this seems long enough already. To conclude, officiating is pretty much as annoying this year as last. If you want to feel smarter than your friends, complain loudly next time there’s an offside call.