The Data Scout Method: A 3-Step Framework
Transform Data into Actionable Insights and Decisions
Over the past few years, I’ve been on a journey to learn as much as possible about football analytics. Inspired by Moneyball and the success stories of clubs like Brighton and Brentford, I wanted to understand how they achieve such effective recruitment.
But there’s a problem — football, especially when it comes to scouting and recruitment, is notoriously secretive.
Apparently, not even Brighton’s own scouts and staff have access to the club’s player database or the details of its identification process — a service provided by a data company owned by Brighton’s owner, Tony Bloom (FourFourTwo, 2023).
Method
There’s no shortage of articles praising how certain clubs use data, but very few explain how they do it. That’s because, in today’s data-rich environment, success doesn’t come from simply having access to numbers — it comes from knowing how to apply them. And in an industry as competitive as football, clubs are understandably reluctant to reveal their methods.
With limited publicly available insights, the best way to understand these processes is to experiment. By analyzing historical transfers, testing different scouting approaches, and working directly with data, I’ve developed my own structured method — one that mirrors the strategic thinking used by top clubs.
While I don’t have the same resources or proprietary algorithms as Premier League sides, the fundamental principle remains the same: modern scouting starts with data and a clear strategy.
So, let me introduce The Data Scout Method: A 3-Step Framework — a structured approach to identifying talent using data.
I will demonstrate the method with a worked example that uses data extracted from FBref to generate insights.
Step 1: Profile
The essence of data scouting is to narrow down a large pool of players into a select few who fit your needs.
Think of it as a funnel — players enter at the top, and only the most suitable ones make it through to the other side. But to achieve this, you first need a clear idea of what you’re looking for.
In other words, you must define a specific problem or challenge — the player profile you want to identify through data.
For this example, our target profile is a ball-playing defender from the top five leagues. While we could refine the criteria further, FBref only allows us to filter by general positions — attackers, midfielders, or defenders. As a result, our search will include both full-backs and center-backs. To focus more on center-backs, we’ll prioritize defenders who excel in aerial duels and win the majority of their challenges.
Additionally, we’re looking for players aged 25 or younger, as the goal is to recruit for the future.
With these parameters in mind, our initial profile criteria look like this:
Position = DF
Age ≤ 25 years
Minutes Played ≥ 900 (10 Games)
Aerial Duels Won, % ≥ 60 %
Ball-playing DF: Must Excel in Progressive Metrics
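One way to keep the profile explicit is to write it down as a small configuration object before touching any data. The sketch below is only an illustration in plain Python: the class and field names are hypothetical, and the two progressive thresholds are the cut-offs applied later in Step 2, not part of the initial list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayerProfile:
    """Hypothetical container for the search criteria listed above."""
    position: str = "DF"
    max_age: int = 25
    min_minutes: int = 900              # roughly 10 full matches
    min_aerial_win_pct: float = 60.0
    # "Must excel in progressive metrics", made concrete with the Step 2 cut-offs
    min_prog_passes_per90: float = 5.0
    min_prog_carries_per90: float = 2.0


profile = PlayerProfile()
print(profile)
```

Keeping every threshold in one place makes it easy to rerun the same funnel for a different profile by changing a handful of values.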
Step 2: Filter
The second step in the method is filtering the data, which requires some coding. Our initial position filter gives us a list of 789 players — a number far too large to analyze effectively.
Next, we narrow it down by age, selecting only players who are 25 years old or younger. This reduces the pool to 361 players. The third filter ensures that we only consider players who have played at least 900 minutes (equivalent to 10 full matches), as sample size and player availability are crucial factors in scouting. After applying this filter, we’re left with 146 players.
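The three basic filters can be sketched as a small funnel that records how many players survive each stage. This is a minimal illustration in plain Python, not the exact code behind the numbers above: the `Player` class, the `basic_filters` function, and the sample rows are all made up, and real FBref data would first need to be loaded into this shape.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: str
    age: int
    minutes: int


def basic_filters(players, position="DF", max_age=25, min_minutes=900):
    """Apply the three basic filters in order and record the pool size after each."""
    stages = [("start", len(players))]

    pool = [p for p in players if p.position == position]
    stages.append(("position", len(pool)))

    pool = [p for p in pool if p.age <= max_age]
    stages.append(("age", len(pool)))

    pool = [p for p in pool if p.minutes >= min_minutes]
    stages.append(("minutes", len(pool)))

    return pool, stages


# Made-up rows for illustration only, not real FBref data.
sample = [
    Player("Defender A", "DF", 22, 1800),
    Player("Defender B", "DF", 29, 2400),    # too old
    Player("Defender C", "DF", 24, 450),     # too few minutes
    Player("Midfielder D", "MF", 21, 2000),  # wrong position
    Player("Defender E", "DF", 25, 900),     # exactly on both limits, kept
]

shortlist, stages = basic_filters(sample)
for label, size in stages:
    print(f"{label:>8}: {size} players")
```

Logging the pool size after every stage mirrors the 789, 361, 146 sequence above and makes it obvious which filter removes the most players.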
These were the basic filters — now it’s time to refine the search further. Since we’re looking for defenders who dominate aerial duels and excel in their own box, we apply a filter for aerial duel success rate. Limiting our list to players who win more than 60% of their aerial duels trims it down to just 55 players.
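The aerial duel success rate is simply duels won divided by duels contested. As a hedged sketch with made-up numbers, the filter could look like the following; the function name and the rows are hypothetical, and the comparison uses `>=` to match the "≥ 60 %" in the profile (switch to `>` for a strict "more than 60%" reading).

```python
def aerial_win_pct(won: int, lost: int) -> float:
    """Share of aerial duels won, in percent; 0.0 when a player contested none."""
    total = won + lost
    return 100.0 * won / total if total else 0.0


# Made-up (name, duels won, duels lost) rows for illustration only.
rows = [
    ("Defender A", 45, 20),  # about 69 %
    ("Defender E", 30, 30),  # 50 %
    ("Defender F", 12, 8),   # 60 %, exactly on the threshold
]

MIN_AERIAL_WIN_PCT = 60.0
kept = [name for name, won, lost in rows
        if aerial_win_pct(won, lost) >= MIN_AERIAL_WIN_PCT]
print(kept)
```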
The final step is identifying ball-playing ability. To do this, I applied separate filters for progressive passes and progressive carries. First, filtering for players who average more than 5 progressive passes per 90 minutes leaves us with a shortlist of just 13 players. Among them, Dortmund’s Nico Schlotterbeck stands out, as does 18-year-old Pau Cubarsí — what a level he’s already playing at!
For our final filter, we focused on players who average more than 2 progressive carries per game. Applying this criterion narrows our shortlist even further, leaving us with just 8 players.
Topping the chart is Lorenz Assignon of Rennes, who averages an impressive 4 progressive carries per game.
Two Atalanta players stand out here as well, which makes sense given the club’s expansive and progressive style of play. Atalanta relies heavily on their wing-backs, who are expected to contribute both offensively and defensively.
In this case, both players not only fit that mold but also excel defensively — remember, every player on this list wins more than 60% of their aerial duels.
One particularly exciting name is Matteo Ruggeri. At just 22 years old, he appears on both our progressive passing and carrying shortlists, making him a standout prospect.
Neco Williams also deserves a shoutout: he plays at left-back for Nottingham Forest despite being right-footed, a fantastic development since Liverpool let him go.
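The two progressive filters can be sketched as follows. This illustration rests on a few assumptions: season totals are converted to per-90 rates, "per game" is treated as "per 90 minutes", and the passing and carrying filters are read as two separate shortlists whose overlap identifies players who appear on both. The function name and the sample rows are made up.

```python
def per90(total: float, minutes: int) -> float:
    """Convert a season total into a per-90-minutes rate."""
    return total * 90 / minutes if minutes else 0.0


# Made-up (name, minutes, progressive passes, progressive carries) rows.
rows = [
    ("Defender A", 1800, 120, 50),  # 6.0 passes, 2.5 carries per 90
    ("Defender F", 900, 60, 10),    # 6.0 passes, 1.0 carries per 90
    ("Defender G", 2700, 90, 120),  # 3.0 passes, 4.0 carries per 90
]

# Thresholds from the text: more than 5 passes and more than 2 carries.
passing = {name for name, mins, pp, pc in rows if per90(pp, mins) > 5}
carrying = {name for name, mins, pp, pc in rows if per90(pc, mins) > 2}
both = passing & carrying  # players on both shortlists

print(sorted(passing), sorted(carrying), sorted(both))
```

Normalizing to per-90 rates keeps players with different amounts of playing time comparable, which is also why the 900-minute floor matters: very small samples produce unstable rates.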
Step 3: Visualize
With the heavy lifting of data filtering complete, it’s time for the final and arguably most important step — making the data presentable.
This is where raw numbers are transformed into meaningful insights that scouts and coaches can easily interpret.
Rather than simply listing the filtered players, we’ll visualize the results using data plots. The first step is to create a scatter plot, which allows us to compare key metrics and spot standout performers at a glance.
If you’re unfamiliar with scatter plots, you can check out my previous article for a deeper dive here.
Looks much better, right?
For the scatter plot, I’ve highlighted the players with the highest combined total of progressive passes and progressive carries — essentially the five most progressive defenders in our dataset.
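The highlighting rule can be sketched as a simple ranking by the sum of the two per-90 metrics. The rows and the function name below are made up for illustration; only the rule itself (top five by progressive passes plus progressive carries) comes from the text.

```python
# Made-up (name, progressive passes per 90, progressive carries per 90) rows.
shortlist = [
    ("Defender A", 6.0, 2.5),
    ("Defender B", 5.2, 1.1),
    ("Defender C", 7.1, 3.0),
    ("Defender D", 5.5, 0.5),
    ("Defender E", 6.4, 2.2),
    ("Defender F", 5.1, 4.0),
    ("Defender G", 5.8, 1.9),
]


def most_progressive(players, n=5):
    """Return the n players with the highest passes + carries per 90."""
    return sorted(players, key=lambda p: p[1] + p[2], reverse=True)[:n]


highlighted = most_progressive(shortlist)
for name, passes, carries in highlighted:
    print(f"{name}: {passes + carries:.1f} progressive actions per 90")
```

The returned names are the ones to label on the scatter plot, with every other player drawn as an unlabeled point.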
A fitting way to categorize these players would be as hybrid defenders. Perhaps the best example is Joško Gvardiol. Originally a center-back, he has been deployed as a left-back and, more recently, as a wing-back — most notably in Manchester City’s win against Chelsea, where he also scored the equalizer.
Nico Schlotterbeck is another standout. Now 25 years old, he might be ready for a bigger move. Given his profile, I wouldn’t be surprised if one of Europe’s top clubs comes calling soon.
To summarize a player’s overall profile, we can use a radar chart. This visualization represents a player’s skill set in percentiles, comparing them to others in the same position.
It provides a clear snapshot of strengths and weaknesses, making it easier to assess their suitability for a specific role.
As illustrated in the radar chart, Schlotterbeck stands out in progressive passes and carries, and he ranks in the 81st percentile for aerial duels won.
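The percentiles on a radar chart can be computed in several ways. A minimal sketch, assuming the definition "share of peers with a value at or below the player's" (the definition behind the chart above may differ), with a made-up peer group:

```python
from typing import Sequence


def percentile_rank(value: float, peers: Sequence[float]) -> float:
    """Percentage of peer values that are less than or equal to `value`."""
    if not peers:
        raise ValueError("peers must not be empty")
    at_or_below = sum(1 for v in peers if v <= value)
    return 100.0 * at_or_below / len(peers)


# Made-up aerial win percentages for a peer group of ten defenders.
peer_aerial_pct = [48.0, 52.5, 55.0, 58.0, 61.0, 63.5, 66.0, 70.0, 72.5, 75.0]

rank = percentile_rank(70.0, peer_aerial_pct)
print(f"Aerial duels won: {rank:.0f}th percentile")
```

Repeating this calculation for each metric on the chart, always against players in the same position, gives the values for the radar axes.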
With this analysis, we have successfully identified a ball-playing defender using The Data Scout Method.
Conclusion
We began by defining the ideal player profile we aimed to uncover through data. From there, we started with a pool of 789 players and narrowed it down to the 5 players featured in the scatter plot. Finally, we visualized our ideal profile in the radar chart.
Learn “The Data Scout Method”
If you’re eager to learn more about The Data Scout Method and how it can elevate your scouting skills, click here.
References
FourFourTwo (2023). Brighton’s Secret Recruitment Algorithm. URL: https://www.fourfourtwo.com/news/brighton-and-hove-albion-chief-executive-paul-barber-on-his-secret-recruitment-algorithm-that-even-club-staff-dont-have-knowledge-of
SkySports (2023). Future of Football: The AI-wielding ‘unicorns’ and neuroscientists changing transfers and recruitment. URL: https://www.skysports.com/football/news/11095/12928151/future-of-football-the-ai-wielding-unicorns-and-neuroscientists-changing-transfers-and-recruitment