nflfastpy is a Python package I maintain for loading play-by-play data into your pandas code. The package is really simple: all it does is pull from @mrcaseb and @benbbaldwin's nflfastR-data repo, but the images below show why I took the time to create it. Before/after
Ok, we’ve decided - Huddlevision beta will be released in May. It will be a paid API which’ll give easy access to our models for analyzing game footage. No knowledge of AI or deep learning required to use :)
I'm still building out the dataset for this and I need some help
Give me your most complex, hard-to-answer NFL stat questions you'd want to ask this chatbot and I'll add it to the training set (within reason)
eg. What was Russell Wilson's completion percentage on passes of 5 or
We’re planning on shipping an LLM app that’s connected to football databases to help writers, analysts, and stats nerds (it will be a pretty web app that won’t look as scary as this). Ask any football question and you’ll get back stats, tables, and charts with no code.
There will of
A quick demo of huddlechat, showcasing how it could be used for querying NFL stats and writing articles. Beta coming soon! (with ability to generate plots)
Huddlechat is a natural language interface to NFL stats powered by AI. We are slowly rolling out the beta to users
Here's a thread showcasing what it can do so far for 10 offense player stats (I'll make another thread for defensive players) ⬇️
Very excited to announce our first partner for @HuddlevisionAI
Lots of hard work leading up to this point and very excited to be working with the Tracking Football team on this!
We are excited to announce that Huddlevision is partnering with @TrckFootball to develop computer vision technology to enhance their HS and NCAA football data offering! An official announcement will be made at the Recruiting and Personnel Symposium by the Tracking Football team.
Another showcase of the field registration model. The model was trained to learn *where* on the field we are, given a video of a play. The field map overlay is just a nice visual
Very important task and turns out to be very difficult for an AI system to learn to do precisely
Will be posting more details about this in the coming weeks, but:
Fantasy Data Pros is hosting a Best Ball Data Bowl with the help of @peteroverzet! Goal is to find the best insight from Underdog data for BBM4
Early registration is available here:
Giveaway! - RT & like to enter for lifetime access to my Learn Python with Fantasy Football Course🐍🏈 16 modules and 16 hours of video teaching you the basics of Python all the way to building models for your draft. Available here:
Under the hood, the 3-point model fits a clustering algorithm called a Gaussian Mixture Model to each player's shot location data.
It finds the optimal shape and # of clusters through cross-validation.
Even if you're not a data scientist, it's still a pretty picture:
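A minimal sketch of that idea, assuming scikit-learn and synthetic shot locations (the real model's code and data aren't shown in these posts, so everything below is illustrative). It cross-validates over the number of clusters and the covariance "shape" and keeps the combination with the best held-out log-likelihood:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

# Synthetic (x, y) shot locations in feet; made-up data, not a real shot chart.
shots = np.vstack([
    rng.normal(loc=[-22.0, 5.0], scale=1.0, size=(150, 2)),  # left corner
    rng.normal(loc=[0.0, 25.0], scale=1.5, size=(150, 2)),   # top of the arc
    rng.normal(loc=[22.0, 5.0], scale=1.0, size=(150, 2)),   # right corner
])
shots = rng.permutation(shots)  # shuffle rows so every CV fold sees all regions

# Candidate cluster counts and covariance "shapes" to compare.
param_grid = {
    "n_components": [1, 2, 3, 4, 5],
    "covariance_type": ["spherical", "diag", "full"],
}

# With no explicit scorer, GridSearchCV falls back to GaussianMixture.score,
# the mean held-out log-likelihood per sample, so higher is better.
search = GridSearchCV(GaussianMixture(random_state=0), param_grid, cv=5)
search.fit(shots)

best_gmm = search.best_estimator_
print(search.best_params_)
```

Held-out log-likelihood is only one reasonable selection criterion; BIC or AIC on the full data is a common alternative.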
A new Python tutorial has been posted to the FFDP blog! In this one, we show you how to use matplotlib to plot out @jamalagnew’s path on punt returns. Link to the post is in the next tweet below #NextGenStats 👇👇
If you’re interested in working on huddlevision shoot me a DM!
Looking to hire a junior data scientist / intern in June. Position would be part-time, probably 10-20 hr commitment per week.
The Win at Sports Betting with Python series is up!
Learn how to use Python to analyze player props with the new free blog post below. In this post, we use Python to analyze Jalen Hurts O/U 31.5 pass attempts (+100 on DraftKings)
@tejfbanalytics Not an expert on this, but I’m pretty sure - Packers are a corp, which is a separate legal entity from its owners so the entity itself puts up the escrow, whereas other teams are structured as partnerships. Source: I used to work for a firm that did the taxes for the Dolphins.
Decided to take a look underneath the hood of the Python three-point model:
These are 9 of the 100,000 simulations that were run by the model for Jordan Poole vs. LAL tonight.
Under 3.5 3PM is currently -140 on DraftKings. Model gives the under +EV and 15.5% edge👀
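For anyone new to the terminology, here is a small sketch of how an "edge" can be computed from American odds. It assumes edge means model probability minus the book's implied probability, which is one common definition; the post doesn't say exactly how the 15.5% was calculated, and the model probability below is a made-up placeholder.

```python
def implied_probability(american_odds: int) -> float:
    """Break-even win probability implied by American odds (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def edge(model_probability: float, american_odds: int) -> float:
    """Model probability minus the book's implied probability."""
    return model_probability - implied_probability(american_odds)


book_prob = implied_probability(-140)  # 140 / 240
model_prob = 0.74                      # hypothetical model output, not the real one
print(f"implied: {book_prob:.3f}, edge: {edge(model_prob, -140):.3f}")
```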
🚨Happy to announce the winner / winning submission of the Best Ball Data Bowl:
Dan Falkenheim (@thefalkon)’s “Surveying the Avalanche: what happens to the draft board when there’s a run at a position?”
Once again, thank you to everyone who entered the competition, especially
Dan Snyder sells for $6B after purchasing the team in 1999 for $750M
That’s a 9.05% compounded annual growth rate while the S&P 500 returned around 7% during the same time frame.
A thread on franchise valuations:
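The growth-rate math can be checked in a few lines. This sketch assumes a 24-year holding period (1999 to 2023) and uses the purchase and sale prices quoted above:

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


rate = cagr(750e6, 6e9, 24)  # $750M -> $6B over an assumed 24 years
print(f"{rate:.2%}")         # prints 9.05%
```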
Here are the 5 finalists for the Best Ball Data Bowl. Very tough to pick 5 we wanted to invite for Pete’s show, lots of great submissions. Tried to get a mix here of best ball fundamentals focused entries and data science focused entries
Congrats to the finalists and thank you
To kill time during the offseason, figured I'd put together an NBA model and do a betting challenge. I'll be setting aside $1,000 and attempt to get an ROI using just Python-driven analysis. First task was pulling lines for each book to find the best odds for each matchup
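A rough sketch of the line-shopping step, assuming the lines have already been pulled into a dictionary. The matchups, book names, and odds below are all made up for illustration; the scraping itself is not shown.

```python
def to_decimal(american_odds: int) -> float:
    """Convert American odds to decimal odds (total return per 1 unit staked)."""
    if american_odds > 0:
        return 1 + american_odds / 100
    return 1 + 100 / abs(american_odds)


# Hypothetical moneylines for one side of each matchup, keyed by sportsbook.
lines = {
    "Team A @ Team B (Team A ML)": {"Book A": -150, "Book B": -142, "Book C": -155},
    "Team C @ Team D (Team C ML)": {"Book A": 130, "Book B": 125, "Book C": 138},
}


def best_line(book_odds: dict) -> tuple:
    """Return the (book, odds) pair with the highest payout."""
    return max(book_odds.items(), key=lambda item: to_decimal(item[1]))


for matchup, book_odds in lines.items():
    book, odds = best_line(book_odds)
    print(f"{matchup}: {book} at {odds:+d}")
```

Comparing on decimal odds avoids special-casing the sign flip between favorites and underdogs.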
What I envisioned for my life: me having a blast training AI models for football and playing the new NCAA in between
Reality: a million CUDA errors and yelling at my TV after throwing 5 picks against an FCS school on Heisman difficulty
Don't have GPU access so my computer might explode, but going to take a shot at extracting the first CSV of tracking data from huddlevision
Will post link later, person who correctly guesses the play will get lifetime access to huddlevision
Update on huddlevision (good news and bad):
Good news: we will be releasing an AI product soon and on schedule (it will be a good and affordable product, as exciting as stuff I've been posting here)
Bad news: it will not be our computer vision software
For the time being, our
@WeHateRob It’s one part of a computer vision / AI system that extracts tracking data from game footage. This part maps a template of the NFL field onto a play image. It makes more sense with all the parts together, but this is just one piece I’ve been working on for a while
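Field registration is commonly framed as estimating a homography, a 3x3 matrix that maps field coordinates to image pixels. The sketch below assumes that framing (the post doesn't say how the model represents the mapping) and uses a made-up matrix just to show what "mapping a template onto an image" means numerically:

```python
def apply_homography(H, x, y):
    """Map a field point (x, y) to an image pixel (u, v) with a 3x3 matrix H."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w  # divide out the projective scale


# Hypothetical homography: 10 px per yard, an offset, and mild perspective.
H = [
    [10.0, 0.0, 50.0],
    [0.0, 10.0, 20.0],
    [0.0, 0.001, 1.0],
]

# Corners of a 120 x 53.3 yard field template (end zones included).
corners = [(0.0, 0.0), (120.0, 0.0), (120.0, 53.3), (0.0, 53.3)]
pixels = [apply_homography(H, x, y) for x, y in corners]
print(pixels)
```

In a real pipeline the matrix would be predicted per frame; here it is fixed by hand.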
Here's another way to look at the clustering algorithm (GMM) that is core to the threes model:
GMM is great cause it gives us a generative model that describes the distribution of our data (This is important because we can use it to simulate shot locations around the court).
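To make "generative" concrete, here is a minimal sketch of sampling shot locations from a mixture. The component weights, means, and spreads are hypothetical, and the components use independent x/y spreads for simplicity (a fitted GMM may use full covariance matrices):

```python
import random

# Hypothetical mixture components: (weight, mean_x, mean_y, std_x, std_y) in feet.
COMPONENTS = [
    (0.5, 0.0, 25.0, 2.0, 1.0),   # top of the arc
    (0.3, -22.0, 5.0, 0.8, 2.0),  # left corner
    (0.2, 22.0, 5.0, 0.8, 2.0),   # right corner
]


def sample_shot(rng: random.Random) -> tuple:
    """Pick a component by weight, then draw a location from its Gaussian."""
    weights = [component[0] for component in COMPONENTS]
    _, mean_x, mean_y, std_x, std_y = rng.choices(COMPONENTS, weights=weights, k=1)[0]
    return rng.gauss(mean_x, std_x), rng.gauss(mean_y, std_y)


rng = random.Random(42)
shots = [sample_shot(rng) for _ in range(10_000)]
print(shots[:3])
```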
Damian Lillard's legendary performance last night:
13 three-pointers made, one shy of Klay Thompson's single-game record of 14 in 2018.
Here's how the three-point model simulated last night's game:
Huddlechat is already teaching me new things. 🤣
Apparently, Mac Jones and Russell Wilson were #2 and #3 in terms of EPA per dropback when trailing by one possession in the 4th quarter (min. 25 dropbacks)
Using @UnderdogFantasy data we started looking at how different stacking configs affected outcomes.
We found that, in general, stacking did little to improve mean roster output or right-tail outcomes. If you drafted a top-5 QB by ADP, though, stacking improved your results.
BYO game footage. You’ll be able to build your own projects by leveraging our fine tuned models - analyze routes run, formations, speed/acceleration/separation metrics, biometric stuff if you’re fancy. All the stuff you can do with tracking data basically, maybe a little more.
Wisdom of the crowd does not actually mean the collective is smarter
It means that when you aggregate everyone's predictions together, *you cancel out everyone's specific biases and errors*, which leads to a more balanced prediction than any one individual's prediction
1. ADP is literally a player's value
2. Wisdom of the crowd >>>>>>> your own personal takes
3. Google "Galton 1906 ox" to learn about how roughly 800 people accurately predicted the weight of an ox at a livestock fair in England when you averaged their guesses together
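A quick simulation of the averaging effect. It assumes each person's error is independent and zero-mean, which is the idealized case; the true weight and the spread of the guesses are made-up numbers, not Galton's data.

```python
import random
import statistics

rng = random.Random(1906)

TRUE_WEIGHT = 1200.0  # hypothetical true weight in pounds
N_GUESSERS = 800

# Each guess = truth + that person's own error (independent, zero-mean noise).
guesses = [TRUE_WEIGHT + rng.gauss(0, 75) for _ in range(N_GUESSERS)]

crowd_error = abs(statistics.mean(guesses) - TRUE_WEIGHT)
individual_errors = [abs(guess - TRUE_WEIGHT) for guess in guesses]
typical_individual_error = statistics.mean(individual_errors)

print(f"crowd error: {crowd_error:.1f} lb")
print(f"typical individual error: {typical_individual_error:.1f} lb")
```

If everyone shares the same bias, averaging does not remove it, so the independence assumption matters.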
First entry to the Best Ball Data Bowl has been submitted by @phtousi:
"Predicting Best Ball Playoff Teams with the Best Ball Value Curve and a Stacked XGBoost Model"
Linked Philippe's work in replies. Haven't had time to review in full yet but seems like a strong submission!
Btw if you want to learn more about this cool AI stuff I post, or are just interested in learning data science / how to code, I teach a course on applying Python to fantasy football over at Fantasy Data Pros
We just crossed 5000 course members this week since I started it in
Amazon's field registration dataset for football is 835 images for train and test (largest on record I could find, although no open source datasets exist).
We're at 147 images, on track for 1000 by end of week, 5000 by end of month
The next course on Fantasy Data Pros will be on building an NFL betting model with Python.
It will be an in-depth course on supervised machine learning, specifically classification.
I feel like you can read through all of the Best Ball Data Bowl submissions and come out a reasonably sharp player in BBM. Personally learning a ton reading everyone's work. Very cool to see
Designing the front page of huddlechat rn. What do you think?
We have all the backend models built out and ready for the beta, just working on the user interface now
We are going to be deploying huddlechat soon, slowly (over the next couple weeks) to people who have requested access
However, if you'd like to be a *paid* beta tester (short term position), let me know through DMs that you're interested and I'll send you a google form to fill
From @loudogvideo's Best Ball Data Bowl submission (learned Python with our course btw), playoff advance rate by # of WRs with the same bye week:
You can read Lou's full entry, "Do Bye Weeks Matter?" in the link below.
Used the new 3-point model to simulate Jayson Tatum 3PM against every defense. Did 10,000 simulations per team.
Tatum 3FGM on Thursday (vs. IND) will be one of the first props we take with this model. Waiting patiently for lines to open 😬
Here are the results:
This Best Ball Data Bowl submission (Surveying the Avalanche: What happens to the draft board when there's a run at the position?) by @thefalkon is very, very cool
Posted link to the entry below, definitely check it out:
One of the coolest aspects of the Best Ball Data Bowl was seeing people with fewer followers but great ideas being given the floor to show off their analysis and methodology. Throughout the summer, we invited several people onto @peteroverzet's show to discuss their submissions,
Lines started to show up yesterday for NBA player props (finally). Here are the initial predictions of the Python model for Thursday's games.
Couple +EV bets that stand out including Jordan Poole under 3.5 3PM on FanDuel (-140)👀
Will be sharing this sheet w/ full list of props!
✅ Jordan Poole finishes 2/7
Happy with the results of the Python model for day 1. We missed Kyrie over 3.5 by a single 3PM, but other than that finished with decent accuracy on +EV bets.
Only one day of data but a decent start so far.
🚨 A new tutorial is up on Fantasy Data Pros, a guest post from Josh Cordell (@fantasycalc1) showing how to use their publicly available trade value API.
Super cool and unique data to work with, def check out their site if you haven't already!
Post:
Updated Python model simulations results for Thursday's games. Added a column for EV. Highest quality plays I see:
Kyrie Irving over 3.5 (+140 on DK): 0.45 EV
Jordan Poole under 3.5 (-145 on DK): 0.26 EV
Will focus on these two props while I spend time debugging and tweaking!
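For reference, a sketch of one standard way to compute EV per unit staked from a win probability and American odds. The sheet's exact EV formula isn't given in the post, and the probabilities below are placeholders rather than the model's actual outputs.

```python
def profit_per_unit(american_odds: int) -> float:
    """Profit on a winning 1-unit bet at the given American odds."""
    if american_odds > 0:
        return american_odds / 100
    return 100 / abs(american_odds)


def expected_value(win_probability: float, american_odds: int) -> float:
    """EV per unit staked: p * profit - (1 - p) * stake."""
    return win_probability * profit_per_unit(american_odds) - (1 - win_probability)


# Hypothetical win probabilities for two props.
props = [("Over 3.5 at +140", 140, 0.60), ("Under 3.5 at -145", -145, 0.75)]
for label, odds, probability in props:
    print(f"{label}: EV = {expected_value(probability, odds):+.2f}")
```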
@EricHallman1 @theloaner11 @GuyDealership @charliebilello
Idk maybe overlay a chart of the 10 year and you’ll find your answer? When interest rates go up, Wall Street demands companies show operating leverage. That is - free money party is over and companies need to show they can turn a profit. Carvana can’t and won’t - hence the
🚨 We have another submission to the Best Ball Data Bowl by @dsloan__!
Dylan looked at the significance of backup RBs in best ball by creating RB archetypes and looking at their draft positions in relation to overall, playoff and top 1% teams.
Links below to the notebook:
Working on a fork of @nfl_data_py that implements threading to pull play-by-play data.
It looks like this approach is about 50% faster on average than requesting the data synchronously.
Here, it reduced the time to pull PBP data from 1999 to 2022 from 71 seconds to 38 seconds
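The general pattern looks something like the sketch below. It is not the fork's actual code: `fetch_season` is a hypothetical stand-in that sleeps instead of downloading, so the example runs offline. The point is that I/O-bound requests overlap when run in a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_season(year: int) -> tuple:
    """Hypothetical stand-in for downloading one season of play-by-play data."""
    time.sleep(0.02)  # simulate network latency
    return year, f"pbp rows for {year}"


seasons = range(1999, 2023)  # 1999 through 2022

start = time.perf_counter()
sequential = [fetch_season(year) for year in seasons]
sequential_seconds = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    # pool.map preserves input order, so results line up with seasons.
    threaded = list(pool.map(fetch_season, seasons))
threaded_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

Real downloads won't speed up this cleanly, since parsing the files is CPU work that threads don't parallelize in CPython.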
🚨 We have another entry to the Best Ball Data Bowl!
Sackreligious and Hackr6849 from examine the limitations of advance rate metrics in their notebook and propose a new, improved metric they call Roster Agnostic Advance Rate (RAAR)
Links below:
Next part of the Learn Python with Fantasy Football tutorial series is up! Written by @run_the_sims
In this post, we cover conditionals and use this programming concept to classify WR skill level for Tyreek Hill, George Pickens, and Kenny Golladay
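A tiny sketch of the kind of conditional logic involved. The thresholds, tier names, and numbers are invented for illustration and are not taken from the tutorial or from real player stats:

```python
def classify_wr(yards_per_game: float) -> str:
    """Bucket a receiver into a tier using hypothetical yardage cutoffs."""
    if yards_per_game >= 80:
        return "elite"
    elif yards_per_game >= 50:
        return "starter"
    else:
        return "depth"


# Placeholder receivers with made-up yards per game.
receivers = {"WR A": 95.0, "WR B": 55.0, "WR C": 20.0}
for name, yards_per_game in receivers.items():
    print(f"{name}: {classify_wr(yards_per_game)}")
```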