Hi, I'm Andrew, a senior data analyst for BuzzFeed and a Swiftie! As each Taylor's Version is released (now updated for 1989), I thought it would be fun to investigate: which Taylor's Versions sound the closest to their first versions, and which sound the most different?*
Where does the data come from?
The Spotify API holds descriptive audio characteristics* for songs. I gathered this data for every Taylor's Version and its first version on the deluxe editions: 20 songs from Fearless (I'm considering "If This Was A Movie" to be a Fearless song), 20 from Red, 16 from Speak Now, and 16 from 1989.
The variables I chose are:
1) Acousticness: a measure between 0 and 1 of how "acoustic" a track is. The closer to 1, the more acoustic.
2) Danceability: a measure between 0 and 1 of how well the track can be used for dancing (closer to 1 means more danceable). It takes into account metrics like tempo and rhythm.
3) Energy: a measure between 0 and 1 of intensity within a track ("energetic tracks feel fast, loud, and noisy"). A higher score means higher energy.
4) Instrumentalness: a measure between 0 and 1 of whether a track is instrumental. The closer to 1, the higher the likelihood that the track has no vocals.
5) Speechiness: a measure between 0 and 1 detecting spoken word within a track. The higher the measure, the higher the probability that the track is spoken word.
6) Valence: a measure between 0 and 1 describing the "musical positiveness" of a track (0: most sad vs. 1: most happy-sounding).
7) Loudness: The overall loudness of a track, in decibels.
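For readers who want to follow along, here is a minimal sketch of keeping only these seven variables for one song. It assumes a dictionary shaped like the audio features the Spotify API returns for a track; the function name `select_features` and every number in `sample_track` are made up for illustration and are not values from this analysis.

```python
# The seven variables used in this analysis, spelled the way Spotify names them.
CHOSEN_FEATURES = [
    "acousticness",
    "danceability",
    "energy",
    "instrumentalness",
    "speechiness",
    "valence",
    "loudness",
]


def select_features(audio_features):
    """Keep only the chosen variables from one track's audio features."""
    return {name: audio_features[name] for name in CHOSEN_FEATURES}


# Hypothetical audio features for a single track (illustrative numbers only).
sample_track = {
    "acousticness": 0.11,
    "danceability": 0.55,
    "energy": 0.70,
    "instrumentalness": 0.0,
    "speechiness": 0.03,
    "valence": 0.45,
    "loudness": -5.2,
    "liveness": 0.12,
    "tempo": 119.0,
    "key": 5,
    "time_signature": 4,
}

chosen = select_features(sample_track)
print(chosen)
```

Everything not in `CHOSEN_FEATURES` is simply dropped, so each song ends up described by the same seven numbers.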
Every variable in the data set ranges between 0 and 1, except for loudness. To make sure every variable was on the same scale, I normalized the loudness over each pair of albums to range between 0 and 1.
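One common way to put loudness on a 0-to-1 scale is min-max normalization. The sketch below assumes that is the approach and that "over each pair of albums" means pooling the loudness values of the first version and the Taylor's Version together before rescaling. The function name and the decibel values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numbers so the smallest maps to 0 and the largest maps to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Every value is identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical loudness values in decibels (not real data from the analysis).
first_version_loudness = [-6.0, -4.0, -8.0]
taylors_version_loudness = [-5.0, -7.0, -3.0]

# Pool both albums so they share one scale, then split the results back apart.
pooled = first_version_loudness + taylors_version_loudness
scaled = min_max_normalize(pooled)
scaled_first_version = scaled[: len(first_version_loudness)]
scaled_taylors_version = scaled[len(first_version_loudness):]

print(scaled_first_version)
print(scaled_taylors_version)
```

Pooling the pair keeps the two versions comparable: the quietest song across both albums lands at 0 and the loudest lands at 1.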
I chose to exclude metrics like liveness, which detects the presence of an audience within a track (none of the songs were recorded live), and things like tempo, key, and time signature, which remain consistent between both versions.
(See link above for the full definitions of the variables.)
The methodology:
Here are the 20 pairs (Taylor's Version vs. first version) that are the most different from each other:
For each song, I generated a graph showing the audio metrics for Taylor's Version and its first version. If a variable on the graph only shows one album cover, it is because the values are so similar that the points overlapped.