Soccerdonna Scraper

The soccerdonna_scraper package provides tools for extracting player and club data from Soccerdonna.de, one of the largest databases for women’s football.

It can scrape all clubs in a given league, collect detailed player profiles, and export the data into an Excel file for analysis.

Installation

You can install the package directly from GitHub:

pip install git+https://github.com/marclamberts/soccerdonna_scraper.git

The following dependencies are installed automatically:

  • requests (HTTP requests)

  • beautifulsoup4 (HTML parsing)

  • pandas (data handling)

  • openpyxl (Excel export)

Python 3.8 or newer is required.

Usage

You can use the scraper either from Python or through the command-line interface (CLI).

### Using from Python

from soccerdonna_scraper import run_scraper

# Scrape the Women's Super League (ENG1) and save results
run_scraper("womens-super-league", "ENG1", output_file="WSL_ENG1_players.xlsx")

Parameters:

  • league_code: The league slug used in Soccerdonna URLs (e.g., womens-super-league).

  • comp_code: The competition code used in Soccerdonna URLs (e.g., ENG1).

  • output_file (optional): Name of the Excel file to save. Defaults to {league_code}_{comp_code}_players.xlsx.
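
The default file name described above can be reproduced with a small helper. This is only a sketch: default_output_name is a hypothetical name used for illustration and is not a documented part of the package.

```python
def default_output_name(league_code: str, comp_code: str) -> str:
    """Build the documented default file name (hypothetical helper)."""
    return f"{league_code}_{comp_code}_players.xlsx"


if __name__ == "__main__":
    # Same arguments as the run_scraper call above, without output_file.
    print(default_output_name("womens-super-league", "ENG1"))
```

Knowing the pattern up front helps when a later step needs to locate the exported file without passing output_file explicitly.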

### Using from CLI

After installation, a command-line script wsl-scraper becomes available:

wsl-scraper womens-super-league ENG1 -o WSL_ENG1.xlsx

Arguments:

  • league_code: The league slug (e.g., kvindeliga, womens-super-league).

  • comp_code: The competition code (e.g., 3FL, ENG1).

  • -o, --output: Optional Excel file name.
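
For readers who want to see how these arguments fit together, the following argparse sketch mirrors the documented interface. It is an illustration, not the package's actual entry point: build_parser and resolve_output are hypothetical names, and the fallback to {league_code}_{comp_code}_players.xlsx when -o is omitted is an assumption carried over from the Python default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the documented arguments; the real
    # wsl-scraper entry point may be implemented differently.
    parser = argparse.ArgumentParser(prog="wsl-scraper")
    parser.add_argument("league_code", help="League slug, e.g. womens-super-league")
    parser.add_argument("comp_code", help="Competition code, e.g. ENG1")
    parser.add_argument("-o", "--output", default=None, help="Optional Excel file name")
    return parser


def resolve_output(args: argparse.Namespace) -> str:
    # Assumption: fall back to the documented default name when -o is omitted.
    return args.output or f"{args.league_code}_{args.comp_code}_players.xlsx"


if __name__ == "__main__":
    demo_args = build_parser().parse_args(
        ["womens-super-league", "ENG1", "-o", "WSL_ENG1.xlsx"]
    )
    print(resolve_output(demo_args))
```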

Examples

Scraping the Kvindeliga 2024-25 season:

from soccerdonna_scraper import run_scraper
run_scraper("kvindeliga", "3FL", "Kvindeliga_2024_2025.xlsx")

Scraping the Women’s Super League:

wsl-scraper womens-super-league ENG1 -o WSL_ENG1.xlsx

The output Excel file contains one row per player with fields such as:

  • Name

  • Club

  • Date of birth

  • Nationality

  • Position

  • Height

  • Shirt number

  • Profile URL

  • Additional attributes from the Soccerdonna player page

API Reference

Notes

  • Politeness: The scraper pauses briefly between requests (time.sleep(0.5)) to reduce load on the server. Do not remove this delay when running large scrapes.

  • Stability: The scraper relies on Soccerdonna’s HTML structure. If the website changes, the scraper may need updating.

  • Ethics: Use this package responsibly and respect the website’s terms of use.
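
If you write your own loop around the scraper, or extend it, the politeness delay can be kept in one place. The sketch below shows one way to do that under stated assumptions: fetch_politely is a hypothetical helper, the fetch callable stands in for whatever HTTP call you use (for example a wrapper around requests.get), and the 0.5 second default simply matches the delay mentioned above.

```python
import time
from typing import Callable, Dict, Iterable


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each URL in turn, pausing `delay` seconds between requests."""
    pages: Dict[str, str] = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # pause before every request except the first
        pages[url] = fetch(url)
    return pages


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without network access.
    def fake_fetch(url: str) -> str:
        return f"<html>{url}</html>"

    result = fetch_politely(["page-1", "page-2"], fake_fetch)
    print(list(result))
```

Passing the sleep function as a parameter keeps the delay in normal use while letting tests replace it with a no-op.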

Contributing

Contributions are welcome. If you would like to extend the package (e.g., support more leagues, add data fields, or improve speed), please open a pull request on GitHub:

https://github.com/marclamberts/soccerdonna_scraper