Collecting data for football analysis
I used to think that the only data available were the typical stats such as goals, pass percentage, number of shots, etc.
However, in 2020 I heard about the book “Soccermatics” by David Sumpter and the movie “Moneyball”, and I learned the term: Advanced Football Stats.
Since then, I have been involved in football research and analytical practice, first through SomosCantera/DeepPitch and later as an enthusiast. In that second role, I had to search for free resources (data and courses) to practice, and in this post I share the useful information that I have found.
Before sharing the resources, I will talk about the types of data used in football analytics. Performance analysts collect data about:
- Traditional data: Stats such as goals, passes, fouls, shots, etc.
- Physical data: Generally, this data is collected by companies like Catapult.
- Event data: Quantifies what actually happens on the pitch. Most “events” are individual player actions.
- Tracking data: Records the position of each player on the pitch. It is useful for calculating player statistics like distance covered and average speed.
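To make the tracking data idea concrete, here is a minimal sketch of how distance covered and average speed could be computed from a player's positions. The sample format (time in seconds, x and y in metres) and the function names are my own assumptions for illustration; each provider has its own format and sampling rate.

```python
import math

# Hypothetical tracking samples for one player:
# (time in seconds, x in metres, y in metres).
samples = [
    (0.0, 10.0, 20.0),
    (2.0, 13.0, 24.0),
    (4.0, 13.0, 30.0),
    (6.0, 21.0, 36.0),
]


def distance_covered(track):
    """Sum the straight-line distance between consecutive positions (metres)."""
    total = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(track, track[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total


def average_speed(track):
    """Distance covered divided by elapsed time (metres per second)."""
    elapsed = track[-1][0] - track[0][0]
    if elapsed <= 0:
        return 0.0
    return distance_covered(track) / elapsed


total_m = distance_covered(samples)
speed_ms = average_speed(samples)
print(f"distance covered: {total_m:.1f} m")
print(f"average speed: {speed_ms:.2f} m/s")
```

Real tracking data is much denser (many samples per second) and usually noisy, so in practice the positions are often smoothed before speeds are calculated.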
Now, the resources:
Data available:
- StatsBomb: a leading data provider that offers free data for competitions such as the Champions League, Premier League, and World Cup.
- Wyscout: the leading scouting platform.
- Metrica Sports: event data.
- SkillCorner: broadcast tracking data.
- Friends of Tracking: contains tracking data about Liverpool.
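Once you have an event file from one of these sources, the traditional stats can be rebuilt by counting events. The sketch below uses a made-up sample shaped roughly like a StatsBomb open-data event file (a JSON list where each event has nested "type" and "team" objects). The field names and the `count_events` helper are assumptions for illustration, so check them against a real file before relying on them.

```python
import json
from collections import Counter

# Made-up events in the general shape of a StatsBomb-style event file.
# Field names are an assumption; verify them against real data.
raw = """
[
  {"minute": 3,  "type": {"name": "Pass"}, "team": {"name": "Team A"}, "location": [60.0, 40.0]},
  {"minute": 3,  "type": {"name": "Pass"}, "team": {"name": "Team A"}, "location": [75.5, 32.1]},
  {"minute": 4,  "type": {"name": "Shot"}, "team": {"name": "Team A"}, "location": [108.0, 38.0]},
  {"minute": 10, "type": {"name": "Pass"}, "team": {"name": "Team B"}, "location": [30.0, 50.0]},
  {"minute": 11, "type": {"name": "Foul Committed"}, "team": {"name": "Team B"}, "location": [45.0, 20.0]}
]
"""


def count_events(events):
    """Count events per (team name, event type name) pair."""
    counts = Counter()
    for event in events:
        counts[(event["team"]["name"], event["type"]["name"])] += 1
    return counts


events = json.loads(raw)
counts = count_events(events)
for (team, event_type), n in sorted(counts.items()):
    print(team, event_type, n)
```

With a real match file you would replace `raw` with the contents of the downloaded JSON; the counting logic stays the same.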
An important part of getting free data is learning about web scraping. With a library like BeautifulSoup you can extract information from Transfermarkt, WhoScored, and FBref. The information that you can extract covers squads, transfers, match statistics, players, etc.
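As a rough illustration of the scraping step, the sketch below parses a small HTML table with BeautifulSoup. The HTML, the `squad` table id, and the `parse_squad` helper are invented for the example and do not match the markup of any real site; on a real page you would first inspect the HTML to find the right table and columns.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded squad page.
html = """
<table id="squad">
  <thead><tr><th>Player</th><th>Position</th><th>Age</th></tr></thead>
  <tbody>
    <tr><td>Player One</td><td>Goalkeeper</td><td>27</td></tr>
    <tr><td>Player Two</td><td>Midfielder</td><td>22</td></tr>
  </tbody>
</table>
"""


def parse_squad(page_html):
    """Return one dict per table row, keyed by the column headers."""
    soup = BeautifulSoup(page_html, "html.parser")
    table = soup.find("table", id="squad")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find("tbody").find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append(dict(zip(headers, cells)))
    return rows


squad = parse_squad(html)
for player in squad:
    print(player)
```

To work with a live page, the HTML would come from an HTTP request instead of a string. Before scraping any site, it is worth checking its terms of use and robots.txt.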
To finish, I recommend following the tutorials or learning paths from StatsBomb and Friends of Tracking (FoT).