AWS for M&E Blog
Tackling Next Gen Stats: How AWS is using AI to advance sports analytics with the NFL
Football, a sport of strategy, athleticism, and heart-pounding moments, has always been a game of numbers. Beyond the touchdowns, sacks, and interceptions, there’s a world of hidden metrics that shape the dynamics of each play. We sat down with the AWS Generative AI Innovation Center (GAIIC) to unveil the intricacies behind ‘Pressure Probability,’ the latest AI-powered stat developed in collaboration with the NFL’s Next Gen Stats team.
Introduction
In the dynamic world of professional sports, the quest to gain a competitive edge has shifted from the weight room to the data room. The National Football League (NFL), in its ongoing pursuit of excellence, has embraced cutting-edge artificial intelligence (AI) and data science techniques to extract meaningful insights from its troves of raw data. As we’ve seen through the Amazon Web Services (AWS) and Next Gen Stats (NGS) partnership, deciphering the intricate patterns of player movement, actions, and interactions on the field is essential for both strategic analysis and enhancing the fan experience.
In a glimpse behind the scenes, we delve into a conversation with the engineers and data scientists who help spearhead the NFL’s AI revolution. With an intricate blend of technical expertise, domain knowledge, and innovative problem-solving, they are transforming raw data into actionable insights to reshape the game of football.
Pressure probability and beyond: A deep dive into Next Gen Stats
The NFL’s Next Gen Stats initiative encompasses several ambitious projects, each with its own challenges. Among them, the effort to define what pressure is during a passing play, and how it affects the game, has stood out as a fascinating confluence of AI and football knowledge.
Inspired by submissions from the 2023 Big Data Bowl, an annual open-source competition organized by the NFL, the AWS and NGS teams set out to create a new suite of analytics, all pertaining to the analysis of pressure. Derived from the fusion of player-tracking data, advanced AI techniques, and subject-matter expertise, these analytics include, among others: identifying and quantifying the pressure created by an individual defender; measuring the magnitude of pressure created over the course of the play; assigning blocking responsibilities and credit; and identifying double teams and unblocked rushers.
Figure 1: The pressure score estimation system estimates team-level and individual player-level pressure scores. It uses in-play information of defensive rushers and offensive blockers to estimate the scores.
To build this stat, the AWS Generative AI Innovation Center team leveraged more than 90,000 passing plays from the past five seasons, provided by Next Gen Stats. This player-tracking data, captured 10 times per second, includes the coordinates, velocities, and orientations of every player during every play. The challenge lies in distilling this data into a model that accurately predicts the probability of pressure while accounting for dynamic variables like blockers, defenders, and play formations.
This seemingly straightforward task is complicated by the nuanced interactions of players, positions, and movements on the field, as well as by technical challenges common to many data science and AI initiatives.
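To make the shape of this data concrete, here is a minimal sketch of how one derived quantity, the rate at which a rusher closes on the quarterback, might be computed from 10-frames-per-second tracking samples. The `TrackingSample` fields, the yard-based units, and the `closing_speed` helper are illustrative assumptions, not the Next Gen Stats schema or the features AWS actually engineered.

```python
import math
from dataclasses import dataclass

FRAME_RATE_HZ = 10  # tracking samples per second, as stated in the article


@dataclass
class TrackingSample:
    """One player's state in one frame (field names and units are hypothetical)."""
    x: float                # position along the field, assumed to be in yards
    y: float                # position across the field, assumed to be in yards
    speed: float            # assumed to be in yards per second
    orientation_deg: float  # direction the player is facing, in degrees


def distance(a: TrackingSample, b: TrackingSample) -> float:
    """Straight-line distance between two players in the same frame."""
    return math.hypot(a.x - b.x, a.y - b.y)


def closing_speed(rusher_frames, qb_frames) -> float:
    """Average rate at which the rusher-to-QB gap shrinks over the frames given.

    A positive value means the rusher is getting closer to the quarterback.
    """
    if len(rusher_frames) != len(qb_frames) or len(rusher_frames) < 2:
        raise ValueError("need at least two aligned frames per player")
    first_gap = distance(rusher_frames[0], qb_frames[0])
    last_gap = distance(rusher_frames[-1], qb_frames[-1])
    elapsed_seconds = (len(rusher_frames) - 1) / FRAME_RATE_HZ
    return (first_gap - last_gap) / elapsed_seconds
```

A feature like this is only one of many that could be derived from the raw coordinates; the point is that each play becomes a sequence of per-player frames from which such quantities can be computed.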
Automating blocker and rusher identification
Meanwhile, on a parallel track, AWS engineers worked on automating the identification of blockers and rushers. The goal was to develop models that could autonomously identify players’ roles on the field. This process traditionally relies on manual charting, which is prone to labeling errors and often takes hours to complete.
The approach blends several machine learning techniques, starting from baseline models that use simple positional features to predict roles. The models gradually evolved to incorporate dynamic variables like player speed, distance, and orientation throughout the play. The pinnacle of this work is a graph neural network (GNN) that treats each player as a node in a network, capturing their spatial relationships and interactions to identify blockers and rushers on a given play.
Figure 2: For rusher and blocker identification, AWS built an AutoGluon model that achieves greater than 99% accuracy by using positional and spatiotemporal features such as x and y coordinates, speed, and orientation throughout the first 3 seconds of the play. In total, 183 features are used to build the model.
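As a rough illustration of what "features throughout the first 3 seconds" can mean in practice, the sketch below flattens one player's first 30 frames (3 seconds at 10 frames per second) into a single named feature row of the kind a tabular model could consume. The field names and the `window_features` helper are assumptions made for this example. It produces 120 features, not the 183 mentioned above, because the article does not describe how the real feature set is composed.

```python
FRAME_RATE_HZ = 10   # tracking samples per second, as stated in the article
WINDOW_SECONDS = 3   # the caption's "first 3 seconds of the play"
PER_FRAME_FIELDS = ("x", "y", "speed", "orientation")  # hypothetical field names


def window_features(frames):
    """Flatten one player's opening frames into a flat feature dictionary.

    `frames` is a list of dicts keyed by PER_FRAME_FIELDS, ordered from the
    start of the play. Only the first WINDOW_SECONDS worth of frames is used.
    """
    n_frames = FRAME_RATE_HZ * WINDOW_SECONDS
    if len(frames) < n_frames:
        raise ValueError(f"need at least {n_frames} frames, got {len(frames)}")
    features = {}
    for i, frame in enumerate(frames[:n_frames]):
        for field in PER_FRAME_FIELDS:
            # One column per field per frame, e.g. "speed_t07".
            features[f"{field}_t{i:02d}"] = float(frame[field])
    return features
```

Each player on each play would yield one such row, paired with a rusher or blocker label from the manual charting, which is the general shape of a tabular classification problem.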
The art of sophistication and simplification
AWS initially explored neural network models to estimate pressure score and situational average player movement. However, the complexity of the underlying data and the inconsistencies in player positioning during a play necessitated both sophistication and simplification. For pressure score estimation, the approach evolved into a more focused one-player-at-a-time prediction combined with advanced feature engineering. This pivot significantly improved results, addressing the model’s instability and improving the probability estimates for pressure events.
For the estimation of situational average player movement, on the other hand, AWS developed a state-of-the-art transformer-based system. In AI, matching the model design to the task matters: some tasks are best served by a simpler model combined with carefully engineered features, while others call for a state-of-the-art method. In the case of pressure, the simpler model yields better results.
Figure 3: The system estimates a situational average player scenario: “How would an average player have done in that situation instead of Player X?” The situational average player estimation system estimates the movement of a hypothetical average player based on the other players’ behaviors on the field. It was trained on ~200K play snippets.
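One way to picture the one-player-at-a-time framing for pressure is to score each rusher separately and then combine those scores into a play-level number. The article does not say how individual and team-level scores relate, so the sketch below makes a loud simplifying assumption: each rusher generates pressure independently of the others. Real plays almost certainly violate that assumption, and the `team_pressure` helper is purely illustrative.

```python
def team_pressure(rusher_probs):
    """Combine per-rusher pressure probabilities into one play-level probability.

    ASSUMPTION (not from the article): rushers generate pressure independently,
    so the chance that at least one of them does is 1 - prod(1 - p_i).
    `rusher_probs` maps a rusher identifier to a probability in [0, 1].
    """
    no_pressure = 1.0
    for rusher_id, p in rusher_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {rusher_id} is out of range: {p}")
        no_pressure *= 1.0 - p
    return 1.0 - no_pressure
```

Scoring players individually keeps each prediction's inputs fixed in size regardless of how many rushers and blockers are on the field, which is one plausible reason a per-player model is easier to stabilize than a single model over the whole formation.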
The road ahead: Data-driven insights
The technical achievements unveiled by AWS scientists are just the tip of the iceberg. The journey of predicting pressure probabilities and identifying blockers and rushers showcases the power of blending technical prowess with deep subject-matter expertise. The lessons learned, the models refined, and the insights derived through this data-driven journey open doors to a new era of football analysis that combines technological innovation and passion for the game.