Back to Project Library
Completed
Sanitized

Regression + cross-validation

Spotify Popularity Prediction

Used audio-feature data to model song popularity and translate statistical results into product and marketing recommendations.

PythonRegressionCross ValidationFeature AnalysisEDA

Popularity Prediction

Regression + cross-validation | Python • Regression Engine • Cross Validation • EDA

The Problem

Audio metadata properties present an overwhelming number of dimensions to parse when trying to determine the objective, statistical correlations that compose a commercially successful hit.

The Methodology

Ran complex multiple linear regression models, extensive feature correlation analysis, handled extreme outliers strictly, and validated via careful cross-validation frameworks.

The Impact & Outcome

Determined clear coefficient signals for tempo, danceability, and acoustic variables as predicting success, while discarding noise features.

Key Metric: Statistically significant insight into feature weights without falling back onto black-box non-interpretable models. Transparent variables only.

Regression Interpretability Presentation

Below is the walkthrough highlighting the multi-linear regression architecture, significance outcomes, and isolated statistical variables driving commercial success.

Download Evaluation (.docx)