AWS for M&E Blog
Advancing Next Gen Stats: 2024 Big Data Bowl analyzes tackling performance across the NFL
If you’re a professional or aspiring amateur data scientist, the Big Data Bowl should be on your radar. This annual competition, now in its sixth year, provides an open platform for engineers, data scientists, students, and analytics enthusiasts worldwide to get involved in football analytics—whether or not they have any sports experience.
The Big Data Bowl challenge is this: Use the NFL’s Next Gen Stats (NGS) powered by Amazon Web Services (AWS) to analyze and rethink trends and player performance, while also advancing the way football is played and coached. The reward: winners share a prize pool of $100,000—and the opportunity to present results to the NFL at the league’s Scouting Combine in Indianapolis.
2024 theme: Tackling
Past Big Data Bowls have analyzed running backs, defensive backs, special teams, and pass rush plays, and have generated metrics that are regularly used on television and by NFL teams. The focus of the 2024 Big Data Bowl is to examine tackling performance across the NFL. Participants have access to tracking data from weeks 1-9 of the 2022 NFL season that captures the location, speed, and acceleration of all 22 players on the field, along with the location of the football during a given play. Pro Football Focus (PFF) scouting data and advanced statistics such as expected points and win probability are also included.
“We have a lot of metrics focused on players that have the ball, or defensive players tasked with something specific, but we haven’t explored the actual act of bringing the ball carrier down,” said Mike Lopez, Sr. Director of Data and Analytics at the NFL and co-creator of the Big Data Bowl. “We want a metric to quantify which players are good tacklers and what makes for a good tackle.”
The NGS team is looking to go beyond data that’s available today and shed light on areas like tackle probability, which defenders are likely to make tackles in which locations, which angles players use to approach a tackle, or their acceleration and closing speed on a tackle. They’d also like to better understand events like group tackles or total tackles at the end of a game, and who gets credit.
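To make two of those ideas concrete, here is a minimal sketch of how closing speed and approach angle could be derived from a single frame of tracking data. It is an illustration only, not the NGS team's method. The `PlayerFrame` fields, the function names, and the heading convention (degrees measured counterclockwise from the positive x axis) are assumptions made for this example and may not match the competition data set.

```python
import math
from dataclasses import dataclass


@dataclass
class PlayerFrame:
    """One tracking sample for one player (hypothetical schema)."""
    x: float            # position along the field, in yards
    y: float            # position across the field, in yards
    speed: float        # yards per second
    heading_deg: float  # direction of motion; assumed 0 = +x axis, counterclockwise


def velocity(p: PlayerFrame) -> tuple[float, float]:
    """Split speed and heading into x and y velocity components."""
    rad = math.radians(p.heading_deg)
    return p.speed * math.cos(rad), p.speed * math.sin(rad)


def closing_speed(defender: PlayerFrame, carrier: PlayerFrame) -> float:
    """Rate at which the defender-to-carrier gap shrinks, in yards per second.

    Positive means the gap is closing; negative means it is opening.
    """
    dx, dy = carrier.x - defender.x, carrier.y - defender.y
    dist = math.hypot(dx, dy)
    if dist == 0:
        return 0.0  # players occupy the same point; no line of sight to project on
    dvx, dvy = velocity(defender)
    cvx, cvy = velocity(carrier)
    # Project the relative velocity onto the line from defender to carrier.
    return ((dvx - cvx) * dx + (dvy - cvy) * dy) / dist


def approach_angle(defender: PlayerFrame, carrier: PlayerFrame) -> float:
    """Angle in degrees (0 to 180) between the defender's motion and the line to the carrier.

    0 means the defender is running straight at the ball carrier.
    """
    bearing = math.degrees(math.atan2(carrier.y - defender.y, carrier.x - defender.x))
    diff = (defender.heading_deg - bearing + 180) % 360 - 180
    return abs(diff)


# Usage: a defender and a ball carrier running directly at each other, 10 yards apart.
defender = PlayerFrame(x=0.0, y=0.0, speed=5.0, heading_deg=0.0)
carrier = PlayerFrame(x=10.0, y=0.0, speed=3.0, heading_deg=180.0)
print(closing_speed(defender, carrier), approach_angle(defender, carrier))
```

Features like these, computed for every defender on every frame, are the kind of inputs a tackle probability model might consume. Which features actually matter is the question the competition leaves open.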
“A lot of being a good defender is being in someone’s way, or in football terms, ‘setting the edge,’” said Lopez. “When you set the edge, you usually don’t make the tackle because the running back goes the other way. That’s the kind of thing that goes into the whole process of making a tackle that you can’t get from public data, but using Next Gen Stats, we hope to get there.”
The coaching-centric track returns for the 2024 season’s Big Data Bowl, encouraging coaches to partner with data scientists on a submission. Additional tracks include the undergraduate track, open to individuals or groups made up entirely of undergraduate students, and the metric track, which requires contestants to create a metric that assesses performance or strategy. For each track, contestants may focus on offensive or defensive players and develop insights at either the individual or the team level.
Mentorship and a foot in the door
A key component of the Big Data Bowl is a mentorship program in which data scientist mentees are paired with NFL analytics experts. Mentees are invited to join a four-month program, which includes analytics support and networking opportunities, to create a submission. “A number of these mentees have been hired in sports, and several of them work at the NFL League office now,” said Lopez.
He added, “We’re incredibly appreciative of the folks who enter our competition each year. And their participation can be the most important line in their resume if they want to be a sports data scientist—more important than a GPA or a lot of other things that would typically get them a job. They’re judged on the quality of their work. When teams go to hire they have a sense of what that person can do and how they can help the team.”
Evolving the Big Data Bowl
The main goal of the competition is to provide analysts with more data every season, building on what exists and extending it to reveal new dimensions of the game, teams, and individual players. The theme of last year’s competition, for example, was Pressure Probability, which deals with pass plays from the moment of the snap until the offense gets rid of the ball. In building that model, all of the data recorded after the pass or run was completed was cut out, but that is exactly the data that will be useful for this year’s tackling challenge.
“We have some really complex models that wouldn’t have gotten there without the building blocks from previous big tables. We just keep building and extending with each project,” said Lopez.
Rapid advances in machine learning and computing power have also helped the Big Data Bowl evolve by making it easier for analysts to work with the data. “Tracking data is hard. You don’t just open a spreadsheet and make a pivot table. It takes time to do all of that. Folks starting out now, with the sophistication of the models we already have and faster computing speeds, are starting off at a much higher level than the teams five or six years ago.”
Lopez explained that AWS plays a significant role in the expansion of NGS and the Big Data Bowl. “Our football analytics team relies on AWS every day for our cloud computing structure. In the competition, they help us to implement the technical aspects of our work. How do we get these stats in real time? How do we share them?”
“This competition wouldn’t exist without AWS. We exist because we want to generate new ideas in football. They extend to our network, to our Next Gen Stats team, and on the air. AWS is the big pivot that helps us get that all done.”
You can learn more and follow the competition here.