An Exercise in Language Compression: Are Pop Lyrics Getting More Repetitive?
I'll be analyzing the repetitiveness of a dataset of 15,000 songs that charted on the Billboard Hot 100 between 1958 and 2017.
I know a repetitive song when I hear one, but translating that intuition into a number isn't easy.
It turns out, for example, that the full lyrics of Cheap Thrills shrink by 76% when compressed.
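To make the idea of "shrinks by some percentage when compressed" concrete, here is a minimal sketch of how such a figure could be computed. It assumes Python's standard zlib (DEFLATE) compressor as a stand-in, which is not necessarily the compressor behind the numbers in this article. The function name `size_reduction` and the sample texts are hypothetical and purely illustrative; they are not real lyrics from the dataset.

```python
import zlib


def size_reduction(text: str) -> float:
    """Return the fraction by which `text` shrinks under zlib compression.

    Assumption: zlib at its maximum level stands in for whatever
    compressor the analysis actually used.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0  # nothing to compress
    compressed = zlib.compress(raw, 9)
    return 1 - len(compressed) / len(raw)


# Hypothetical sample texts, not taken from any real song.
repetitive = "la la la, sing it again\n" * 40
varied = "The quick brown fox jumps over the lazy dog while seven wizards quietly judge"

print(f"repetitive: {size_reduction(repetitive):.0%}")
print(f"varied:     {size_reduction(varied):.0%}")
```

The repetitive text should score far higher than the varied one. Very short or non-repetitive inputs can even come out negative, because the compressor's fixed overhead outweighs any savings.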
The Repetition of Pop Music: distribution of compressibility of 15,000 songs from 1958 to 2017, excluding 20 outliers.
The current decade is pretty well-represented in the top 10 above, but it's also a bit overrepresented in my dataset (it's easier to find lyrics for recent songs).
The background blob is the histogram of all songs in the dataset (the same one as before, but mirrored).