panoskard3070/EDA-Spotify-Project

EDA-Spotify-Project

Exploratory Data Analysis (EDA) on Spotify’s music dataset (175k rows). Includes feature exploration, normalization, outlier detection, and trend analysis across decades to uncover how music characteristics evolved over time.

Overview

This project performs an in-depth exploratory data analysis (EDA) on a large-scale Spotify dataset containing over 175,000 songs.
The goal is to explore the evolution of musical characteristics across decades and uncover key patterns in the data using statistical and visual analysis.

Objectives

  1. Clean and preprocess the raw dataset (handling duplicates, missing values, inconsistent date formats).
  2. Engineer meaningful features such as duration_min, tempo_energy, and vibe_score.
  3. Analyze numerical variables like danceability, energy, loudness, and popularity over time.
  4. Detect outliers and study feature distributions and correlations.
  5. Visualize long-term musical trends across decades.
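The first two objectives can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`release_date`, `duration_ms`, `tempo`, `energy`, `danceability`, `valence`) are assumed from the usual layout of Spotify track datasets, and the formulas for `tempo_energy` and `vibe_score` are hypothetical stand-ins, since the README does not define them.

```python
import pandas as pd

# Columns assumed to exist in the raw data (an assumption, not confirmed by the README).
REQUIRED = ["release_date", "duration_ms", "tempo", "energy", "danceability", "valence"]

def clean_and_engineer(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates and incomplete rows, then add derived features."""
    out = df.drop_duplicates().dropna(subset=REQUIRED).copy()

    # Dates may appear as "1975", "1975-06" or "1975-06-01";
    # the first four characters are the year in all three forms.
    out["year"] = out["release_date"].astype(str).str[:4].astype(int)
    out["decade"] = (out["year"] // 10) * 10

    # Milliseconds to minutes.
    out["duration_min"] = out["duration_ms"] / 60_000

    # Hypothetical definitions, chosen only for illustration.
    out["tempo_energy"] = out["tempo"] * out["energy"]
    out["vibe_score"] = out[["danceability", "energy", "valence"]].mean(axis=1)

    return out.reset_index(drop=True)

# Tiny made-up sample: one exact duplicate and one row with a missing date.
raw = pd.DataFrame({
    "name": ["A", "A", "B", "C"],
    "release_date": ["1975", "1975", "1984-06-01", None],
    "duration_ms": [180000, 180000, 240000, 200000],
    "tempo": [120.0, 120.0, 100.0, 90.0],
    "energy": [0.5, 0.5, 0.8, 0.4],
    "danceability": [0.6, 0.6, 0.7, 0.5],
    "valence": [0.4, 0.4, 0.9, 0.3],
})
tracks = clean_and_engineer(raw)
print(tracks[["name", "decade", "duration_min", "tempo_energy", "vibe_score"]])
```

Slicing the year out of the string sidesteps full date parsing, which is one simple way to cope with mixed date formats when only the year or decade is needed.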

Key Findings

  • Loudness and energy have both increased notably over time, which is consistent with the "Loudness War" phenomenon.
  • Acousticness has decreased, showing a shift toward more electronic production styles.
  • Danceability and valence trends suggest that modern songs tend to be more upbeat and emotionally positive.
  • Popularity remains highly right-skewed — very few songs achieve exceptional fame.
  • Outlier analysis shows extreme loudness variability in the 1940s–1950s, likely due to recording limitations.
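Findings of this kind can be checked with a per-decade aggregation, an outlier rule and a skewness measure. The sketch below uses the common 1.5 × IQR rule, which is an assumption: the README does not say which outlier method the notebook uses. The data is invented for illustration, and the `decade` column is assumed to have been derived beforehand.

```python
import pandas as pd

def decade_means(df: pd.DataFrame, features: list[str]) -> pd.DataFrame:
    """Average each feature within each decade."""
    return df.groupby("decade")[features].mean()

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values lying more than k * IQR outside the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Made-up sample, not taken from the real dataset.
tracks = pd.DataFrame({
    "decade": [1950] * 5 + [2010] * 5,
    "loudness": [-20.0, -19.0, -21.0, -20.0, -45.0, -6.0, -5.5, -7.0, -6.0, -6.5],
    "popularity": [0, 1, 2, 1, 0, 3, 2, 1, 0, 60],
})

# Long-term trend: mean loudness per decade.
trend = decade_means(tracks, ["loudness"])

# Outliers are judged within each decade, so an old, quiet recording
# is compared with its contemporaries rather than with modern tracks.
flags = tracks.groupby("decade")["loudness"].transform(iqr_outliers).astype(bool)

# Positive skewness indicates a long right tail (a few very popular songs).
pop_skew = tracks["popularity"].skew()

print(trend)
print(tracks[flags])
print(pop_skew)
```

Flagging outliers per decade rather than globally is a design choice: with a global rule, most early recordings would be flagged simply because they are quieter than modern ones.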

Tools and Libraries

  • Python (pandas, NumPy, seaborn, Matplotlib, kagglehub, and the os module)
  • The datetime module was originally planned, but after inspecting the data I decided not to use it.
  • Jupyter Notebook for full workflow transparency

Project Structure

  1. Data Import
  2. Data Inspection
  3. Data Cleaning
  4. Feature Engineering
  5. Statistical Analysis
  6. Visualization
  7. Conclusion
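The project description mentions normalization, which matters when features on different scales (for example loudness in dB and tempo in BPM) are compared or plotted together. A minimal sketch using min-max scaling is shown below. The choice of min-max scaling and the function name `min_max_normalize` are assumptions made for illustration, since the README does not specify the method used.

```python
import pandas as pd

def min_max_normalize(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given columns rescaled to [0, 1]."""
    out = df.copy()
    for col in columns:
        lo, hi = out[col].min(), out[col].max()
        span = hi - lo
        # A constant column has no spread; map it to 0.0 instead of dividing by zero.
        out[col] = (out[col] - lo) / span if span != 0 else 0.0
    return out

# Made-up values for illustration only.
sample = pd.DataFrame({
    "loudness": [-30.0, -15.0, 0.0],
    "tempo": [60.0, 120.0, 180.0],
})
scaled = min_max_normalize(sample, ["loudness", "tempo"])
print(scaled)
```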

Author

Panagiotis Kardatos
Mathematician | Aspiring Machine Learning Engineer
