In my last blog post, I walked through how I was able to scrape John Mayer’s entire discography off of Spotify using an API. At first, I only accessed 100 songs due to Spotify restrictions regarding the number of songs that can be scraped in a single session; however, after spending a bit more time messing around and looping my requests, I was able to access his full discography!

Now that I have that data, it’s time to look at what this data tells me, and if there is anything that I want to look into further!

Popularity

One of the first things I wanted to look at when exploring this John Mayer dataset was what songs and/or albums were the most popular? And how popular is his music in general? John Mayer is fairly well known, but he isn’t hitting the top charts. So what does overall music popularity look like?

In this context, popularity is a 0-to-100 score that ranks how popular an artist is relative to other artists on Spotify. As their numbers grow, the artist get placed in more editorial playlists and increases their reach on algorithmic playlists and recommendations.

PopularityDensity

Overall, it seems like his music is decently popular. It definitely appears as though some songs are much more popular than others, with most songs ranging between 20 and 80 points and an average overal popularity score of 43.67.

Taking this one step further, I wanted to look at the popularity breakdown by album! Naturally, his first few albums may be less popular than recent albums, due to increased exposure and trending music in today’s age versus 10-15 years ago.

PopularitybyAlbum

Based on this boxplot, his most popular albums are Room for Squares, Continuum, and Sob Rock. This makes sense considering Sob Rock is his newest album, while Room for Squares and Continuum contain many of his most well-known songs (i.e. “Gravity”, “No Such Thing”, “Your Body Is A Wonderland”, etc.)

What about the other audio features you talked about in your last post?

If you recall my previous blog post, I talked about different audio features that I was able to scrape from Spotify, namely valence, tempo, danceability, and acousticness. Naturally, I was super curious to know of the relationship between these features and the popularity of John Mayer’s songs, so I generated some correlation heat maps and regression plots:

PopularityReg

HeatMap

The strongest relationship seems to be popularity and danceability – when the danceability of a song is higher, the popularity of the song seems to be higher. This seems like a fair analysis considering people typically enjoy listening to music that is upbeat and catchy.

However, one relationship that I found interesting was popularity and valence. Valence is an audio feature that measures how positive or negative the lyrics are in a song, with 0 being the most negative and 1 being the most positive. When examining the relationship between the two, it seems like the valence of John Mayer’s songs don’t seem to impact how popular his songs can be – meaning a song that is really sad can still end up having a high popularity score! A similar pattern is found when analyzing popularity and tempo – slower doesn’t necessarily mean less popular. In a further analysis, I may want to explore this more in depth and see if there are any predictive capabilities as a result of these relationships.

Other fun discoveries

Out of curiosity, I wanted to see the breakdown of John Mayer songs that were in different key signatures, and what the popularity looks like for those different keys!

PopularitybyKey

I thought this boxplot was fun because it showed that his most popular songs have been in the keys of G, D, F, and Db – which is not a very common key! However, there’s clearly some risk involved with the key of D, cause it also looks like some of his least popular songs are in the key of D too, haha!

Overall

Going forward, I think I want to explore his albums more specifically. What was the average popularity of each album, and what were their additional respective audio feature scores? Is there a model that can be fit to predict how popular a John Mayer song will be based on how positive, acoustic, or fast it is? I’m not John Mayer’s record label by any means, but it would be fun to get an idea at what artist teams potentially look at when considering what kind of music to make for today’s audience!

If you’d like to follow along with me, I’ve published the dataset and a markdown with the visualizations (and data cleaning!) in my Github Repository