The Stat Trap: Football, Analytics and the Legacy of Perspective
Advanced Stats, Metrics, Analytics, Next Gen Stats… by any other name, it’s all the same and it’s everywhere you look. We see it flashed on the screen and posted on Twitter. We hear this number thrown about and that stat thrown out. We talk 4th down calls on Monday morning and 4th down calls on Tuesday morning. They should’ve or they shouldn’t have. And it all goes to show how much we’re now seeing it in the game itself on a weekly basis.
More than a decade ago, Bill Belichick was questioned for going for it on 4th and 2. Laughed at and dismissed. Now it’s a standard question where you can draw ire either way, but going for it is more often encouraged than discouraged. Meanwhile, Belichick has gone from a loose-cannon radical to an old fuddy-duddy who won’t go for it on 4th down inside his own territory. Back then, Sean McDermott would’ve been skewered on call-in radio shows and maybe even fired for refusing to kick a field goal at the 4-yard line with 22 seconds left in the game. Completely crazy!
Now? Well, it was probably the right call. What can you do? The guy slipped…but the probability was in their favor.
Insidiously, “analytics” has not just worked its way into the game in the modern age, it’s taken over the psyche of the fans, announcers, commentators and pundits who debate these topics under the watchful eye of win probabilities and points expectation as they sit in judgment as the watchdogs of the way the game is to be played. Twenty years ago, these were not even questions. So, how did we get here?
Well, it really starts with one man in a completely different sport: Bill James.
That may be a little too simplistic, but not by that much. James’s self-published annuals, The Bill James Baseball Abstracts, began asking fundamental questions about how to analyze statistics within the game, starting with the key question: what is really important to success? What are the numbers we should be looking at? What are the true identifiers of individual production and how does that translate to the team?
His groundwork eventually sparked a statistical revolution that slowly grew through the sport and, largely aided by a number of very sharp-minded individuals (such as a slew of writers at Baseball Prospectus led by Nate Silver), the “statistical” analytics trend exploded throughout all levels of the game. A concept once roundly laughed at and dismissed (sound familiar?) became the centerpiece ideology of the sport. So widely accepted is it now that almost no one argues vehemently against the “pencil pushers” and “stat-geeks” anymore, something that was very much a reality not much more than 10 years ago, never mind 15-20 years ago. Sabermetrics have, in fact, become as commonly used and accepted as any traditional baseball statistic such as batting average and saves.
In the words of The Maestro: Everybody knows the war is over, everybody knows the good guys lost.
And how did that happen? Well, simply, over time the evidence… the mathematical evidence… became too overwhelming, too widely accepted. The concepts were proven both on paper and on the field. It was simply undeniable, to the point that you were looked at as a “Flat-Earther” if you argued differently. The numbers told a fundamental truth that could be constructed and deconstructed.
Suddenly a fringe concept was commonly accepted fact. And after it spread top to bottom through baseball, these analytical concepts jumped to applications in other sports where, because of their infallibility in baseball, they have become very widely accepted at face value.
The problem? They are not even close to being as “proven”, or even as conceptually sound, as they are in baseball. Despite all of the numbers, systems, stats and averages we see in the “advanced systems” for other sports (especially football), the key ingredient that many of them lack is a demonstrated link of causality. The fundamental questions: How do we know this translates to success? How do we prove that A is better than B? How do we know this stat, philosophy or measure leads to yards or improves the chances for wins? That fundamental link, which had to be and was established in baseball, has given way to stuff that pretty much ‘seems about right’ and ‘looks cool’, so there must be something to it. And the validity that was earned over 20, 30, 40 years in baseball has basically been passed along to any type of system or stat that is put forward in any other sport, without those rigorous standards being applied. No one wants to really argue against it. Better to be thought a fool than open your mouth and remove all doubt. Or, worse yet, there really isn’t anything tangible to argue against.
Recently, ESPN did a survey of 22 people working in football analytics for teams around the NFL and the results are very interesting. While the entire survey has great insights into modern analytics in football, there are two Q & A’s that really stand out and make this point succinctly. The first is:
When it comes to analytics, the average NFL team is ____ years behind the average MLB team
Average response: 9.8 years
Range of responses: 5-15 years
I would tend to agree with both the 5 and 15 year timelines, for different analytics.
When talking “game” and “tools” analytics, a 5 year difference seems very reasonable. By “game analytics” I am referring to how historical information is applied to the management of the game (such as EPA/WPA). By “tools analytics” I am talking about measuring physical traits that apply to game situations and tend to lead to player production (such as closing speed in football or spin rate in baseball).
In both instances, the NFL seems to be using, applying and accessing information in a similar way to what has been done in baseball. Both EPA (Expected Points Added) and WPA (Win Probability Added) are becoming commonplace in coaching decisions such as the aforementioned 4th down calls. NFL teams are going for it on 4th down at a record rate so far this season based on this information.
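As a rough illustration of how this kind of reasoning feeds a 4th down call, here is a minimal sketch in Python. EPA for a single play is simply the expected points after the play minus the expected points before it, and a go-or-kick comparison weights each possible outcome by its probability. Every number below is made up for illustration and the function names are hypothetical; real models estimate these values from historical play-by-play data.

```python
from typing import List, Tuple


def expected_points_added(ep_before: float, ep_after: float) -> float:
    """EPA for one play: expected points after minus expected points before."""
    return ep_after - ep_before


def expected_value(outcomes: List[Tuple[float, float]]) -> float:
    """Probability-weighted expected points over (probability, points) pairs."""
    total_prob = sum(p for p, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * ep for p, ep in outcomes)


# Hypothetical inputs, NOT real NFL estimates.
go_for_it = expected_value([
    (0.6, 4.0),   # convert: drive continues in good position
    (0.4, -1.0),  # fail: opponent takes over
])
kick = expected_value([
    (0.9, 3.0),   # field goal is good
    (0.1, -1.0),  # miss: opponent takes over
])

decision = "go for it" if go_for_it > kick else "kick"
print(f"go: {go_for_it:.2f}  kick: {kick:.2f}  ->  {decision}")
```

The point of the sketch is that the recommendation is only as good as the probabilities and expected-point values fed into it, which is exactly where the question of validation comes in.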
Tools analytics has been in football for decades, starting with a little thing called the scouting combine. We know there are certain physical attributes that translate at the next level. And some teams have placed more importance on some of these measurements than others (such as the Patriots’ emphasis on three-cone times). Now, we are seeing the marriage of video, computer technology and algorithms to read these traits at work on the field and better provide in-game player breakdowns across a multitude of positions using real-time information that is impervious to ‘the trick of the eye’.
However, as far as statistical analytics that use actual game results to build useful player performance evaluation tools, football is at least 15 years behind baseball. And that is even accounting for the fact that little actual statistical work of significant relevance has been done in baseball in nearly 10 years. The statistical concepts were already refined to the point that there simply wasn’t much left to do.
Therefore, as I posited before, we are much more in the Bill James days. Throwing out ideas and tinkering with theories. Somewhere closer to a beginning than we are anywhere near an end. Yet, instead of a lone gunman in a guard shack self-publishing his manifestos, we are now in the Infinite Internet Expert Theorem where a thousand monkeys at a thousand keyboards may just come up with something that is perhaps useful, if not Jamesian-- never mind Shakespearean.
There is no Bill James Godfather. There is no Baseball Prospectus type team out there. It has, so far, been a fairly disjointed and unorganized effort scattered across corners of the internet, with little real ingenuity applied and no clear vision or uniformly established baseline.
And the experts inside the system seem to know it too. The second question that caught my attention was:
Which player-level metric in the public sphere is most useful for player evaluation?
EPA-based metrics/Total QBR (6)
Pro Football Focus grades/WAR (3)
Pressure statistics (2)
Approximate Value (2)
Target rate (1)
Yards per attempt (1)
Seven voters abstained.
The key phrase here is: public sphere. And this seems to convey two messages very distinctly: 1) NFL teams are not using the same metric-based information that is readily available and being spoon-fed to the general public, and 2) they don’t think very much of any of the information that is out there. The largest selection here was essentially (7) for “None of the Above”.
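The arithmetic behind that reading can be checked with a quick tally. This is only a sketch using the vote counts listed above; treating the seven abstentions as a “None of the Above” bloc is this article’s interpretation, not a category ESPN reported.

```python
# Vote counts as listed in the survey results above.
votes = {
    "EPA-based metrics/Total QBR": 6,
    "Pro Football Focus grades/WAR": 3,
    "Pressure statistics": 2,
    "Approximate Value": 2,
    "Target rate": 1,
    "Yards per attempt": 1,
}
abstained = 7  # read here as "None of the Above" (an interpretation)

total = sum(votes.values()) + abstained
top_metric, top_votes = max(votes.items(), key=lambda kv: kv[1])

print(f"Respondents: {total}")
print(f"Abstained: {abstained} ({abstained / total:.0%})")
print(f"Top metric: {top_metric} with {top_votes} ({top_votes / total:.0%})")
```

The counts add up to the 22 people surveyed, and the abstentions outnumber the votes for any single metric.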
What this tends to suggest is that most teams are primarily focusing on proprietary analytics that they want to control within the organization. Which is entirely understandable, as gaining an advantage by ‘building a better mousetrap’ is a leg up on the other teams. However, this is the direct inverse of how the systems were built in baseball, which came from the outside in and then were tinkered with individually by organizations. It means that much of the work that currently exists is unknown, unshared and unchecked.
The MLB model was entirely ‘open source’. It benefited from giving access to everyone who wanted it. That allowed it to become a ‘mathematical problem’ for baseball fans and non-fans alike, and the research was shared as it was developed. And, no, it was no coincidence that it really burgeoned at the dawn of the internet age, when it was a Wild West with most sites having no idea how they were ever going to make money and not generally concerned about that. It was when the world began to become connected and these ideas could become ‘crowd sourced’ instead of ‘crowd focused’.
This, combined with the easy to use, impossible to verify, highly (and perhaps deceptively) marketed, network/league-backed ‘pseudo-Analytics’ that have begun to corner the market, has fairly well stifled open invention and actual work in the field. Blazing a trail into a new frontier not widely employed inside the sport at the time and educating people along the way is one thing. Convincing a community that is already being trained on money-backed and marketed “new concepts” that your concepts are more sound, while each team itself is already investing heavily in developing and applying its own concepts, is a substantially futile endeavor.
Meanwhile, the world of advanced stats in the “public sphere” has been sanitized, dumbed down and corporatized. It’s no longer about finding validity, it’s about selling your system with the assurance that it is already valid. And why is it valid? Because anything that comes branded ‘analytics’ is immediately validated in every doubter’s mind simply because it is analytical. It is not validated by decades of work, as it took in baseball to prove the doubters wrong; that same rigorous and strenuous validation process has never even remotely been applied. And because these subscription services choose to keep their processes and formulas ‘proprietary’, they cannot be applied by anyone else, nor checked by anyone else for validity. What was designed around, and flourished through, the sharing of ideas is now something akin to the Coca-Cola formula or the KFC recipe. Which has really only served to make it a commodity. Want to feel like you have access to the “inside” or “real” information? Pay a fee and you will… or, well, maybe you will, kind of, sort of… just trust us.
So the choice has now become: take this certain ‘advanced’ information, some of which must be blindly accepted, or simply rely largely on traditional statistics. Most of those are culled from direct involvement numbers and therefore lack a certain amount of periphery, a gap these systems purport to fill. And, while many of the commonly available statistics carry significant and very useful information that paints a wide and accurate story, there is little real work being done to define and refine any truly advanced systems that go beyond quarterbacks, where most of the readily available analysis and most of the work done seems to focus.
Why? Because the QB is the one position that provides by far the most data. Just like MLB hitters who log hundreds of at-bats a year, quarterbacks throw hundreds of passes: Yards, Touchdowns, Interceptions, Completion Percentage, Yards Per Target, a cornucopia of easily accessible information. The rest of the positions? Not so much.
Meanwhile, the guys inside the system (as evidenced above) are virtually telling us flat out that the information out there beyond that position is only marginally important (EPA, which is mainly applicable to QBs, and QBR received the most votes of the few that were even cast). What NFL teams are looking at and working on is not what is being provided by any current outlet.
So, where may we go from here? What are possible changes and ways that the football analytics field could easily expand into more relevant data across the range of players using fairly accessible statistics across all positions in the game and perhaps provide a better model for teams and fans alike? Is there a “Jamesian Way” of refocusing the discussion? Is there a better mousetrap?
We’ll delve into some ideas in Part II...