A simple Python project for batch extraction and processing of data from the Spotify Web API. It demonstrates API authentication and pagination techniques, and serves as a foundation for future music analytics projects, trend analysis/clustering, or music recommendation algorithms.
- OAuth 2.0 Authentication: token-based authentication with automatic refresh
- Paginated Data Retrieval: two pagination strategies (offset-based and URL-based)
- Rate Limit Handling: built-in rate limiting awareness and best practices
- Create a new app with these settings:
  - App name: spotify-batch-processing
  - App description: Batch data processing from Spotify API
  - Redirect URIs: http://127.0.0.1:3000
  - API: Web API
- Copy the Client ID and Client Secret.
Create a file src/.env with your Spotify credentials:
CLIENT_ID=your_client_id
CLIENT_SECRET=your_client_secret
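
As a rough illustration of how these credentials can be turned into an access token that is refreshed automatically, here is a minimal sketch. It assumes the Client Credentials flow (the README does not say which OAuth flow `src/main.py` uses) and sticks to the standard library so it runs without extra dependencies. `build_token_request`, `TokenManager`, and the `fetch_token` callable are hypothetical names for this example, not part of the project.

```python
import base64
import time
import urllib.parse
import urllib.request

TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the POST request for the Client Credentials flow (assumed flow)."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


class TokenManager:
    """Caches an access token and fetches a new one shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin_s=60):
        # fetch_token: hypothetical callable returning the token response as a dict
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin_s = margin_s
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh when there is no token yet or the cached one is about to expire.
        if self._token is None or self._clock() >= self._expires_at:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            self._expires_at = self._clock() + payload["expires_in"] - self._margin_s
        return self._token
```

In real use, `fetch_token` could send `build_token_request(CLIENT_ID, CLIENT_SECRET)` with `urllib.request.urlopen` and decode the JSON response. Keeping the network call behind a callable makes the refresh logic easy to test offline.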
- Installation
# clone repo
git clone https://github.com/anyantudre/spotify-batch-data-processing.git
cd spotify-batch-data-processing
# deps
pip install requests python-dotenv spotipy
- Run the app
python src/main.py

We implemented two pagination approaches:
- Offset-Based Pagination
  - Manually calculates the next page using `offset` + `limit`
  - More control over pagination logic
  - Useful for custom pagination requirements
- URL-Based Pagination
  - Uses the `next` URL provided by the API
  - Simpler implementation
  - Recommended approach for most use cases
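
The two strategies above can be sketched side by side. This is an illustration rather than the code in `src/`: it assumes the endpoint returns a Spotify paging object with `items`, `total`, and `next` fields, and it takes a hypothetical `get_json(url, params)` callable in place of a real HTTP client so the loop logic stays visible.

```python
from typing import Callable, Iterator, Optional

# Hypothetical transport signature: takes a URL and optional query params,
# returns the decoded JSON paging object as a dict.
GetJson = Callable[[str, Optional[dict]], dict]


def iter_offset_pages(get_json: GetJson, url: str, limit: int = 20) -> Iterator[dict]:
    """Offset-based: compute each page's offset ourselves."""
    offset = 0
    while True:
        page = get_json(url, {"limit": limit, "offset": offset})
        yield from page["items"]
        offset += limit
        # Stop on an empty page or once the next offset passes the reported total.
        if not page["items"] or offset >= page["total"]:
            break


def iter_next_pages(get_json: GetJson, url: str, limit: int = 20) -> Iterator[dict]:
    """URL-based: follow the `next` link until the API returns null."""
    page = get_json(url, {"limit": limit})
    while True:
        yield from page["items"]
        if not page.get("next"):
            break
        # The `next` URL already carries its own query string.
        page = get_json(page["next"], None)
```

Both generators yield individual items, so callers can write `for item in iter_next_pages(get_json, url)` without caring which strategy is in use.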
The Spotify API applies dynamic rate limits calculated over a rolling 30-second window. We include:
- Built-in delays: configurable request intervals
- Best practices: exponential backoff strategies
- Monitoring: request timing analysis
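
One way to combine these ideas is to retry on HTTP 429, waiting for the number of seconds in the `Retry-After` header when Spotify sends one and falling back to exponential backoff otherwise. The sketch below is an assumption about how this could look, not the project's actual implementation. `send`, `backoff_delay`, and `send_with_retry` are hypothetical names, and the base delay, cap, and retry count are illustrative defaults.

```python
import random
import time
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base_s: float = 1.0, cap_s: float = 60.0) -> float:
    """Prefer the server's Retry-After (seconds); otherwise back off exponentially."""
    if retry_after is not None:
        try:
            return float(retry_after)
        except ValueError:
            pass  # unparseable header: fall through to exponential backoff
    return min(cap_s, base_s * 2 ** attempt)


def send_with_retry(send, max_retries: int = 5, sleep=time.sleep, jitter=random.random):
    """Call send() until it stops returning 429 or the retries run out.

    send is a hypothetical callable that performs one HTTP request and
    returns (status_code, response_headers, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Up to one second of random jitter keeps parallel workers from retrying in lockstep.
        sleep(backoff_delay(attempt, headers.get("Retry-After")) + jitter())
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` and `jitter` as parameters is a testing convenience: production code can leave the defaults, while tests can record the delays instead of actually waiting.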
- Data & Analytics
  - Audio features integration (tempo, energy, danceability)
  - Advanced filtering and artist analytics
  - Data visualization dashboards
- Technical
  - Database integration (PostgreSQL/MongoDB)
  - Async processing and caching
  - Docker containerization
  - CI/CD pipeline
- Machine Learning
  - Music recommendation algorithms
  - Genre classification models
  - Trend analysis and clustering