by glifchits
2472 days ago
Did I miss something, or does this project include popularity measures as features? In the section on dataset features, they include "popularity" (calculated by Spotify) as well as Billboard chart stats like weeks, rank, and a custom-made "score". To me it's not clear whether these features were hidden from the train/test sets, or whether the popularity features were only used in their "artist past performance" measures.
If they included these popularity features, it's like asking "can we predict whether a song is a hit just by looking at how popular it is?" If they did peek into the future and observe ex-post song popularity, obtaining just 89% accuracy hints at how unpredictable song success truly is.
Check out [1] for a famous study that experimentally demonstrates the unpredictability of song success.
[1] Salganik, M. J., Dodds, P. S., & Watts, D. J. (2006). Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market. Science, 311(5762), 854–856. https://doi.org/10.1126/science.1121066
>To extend previous work, in addition to audio analysis features, we consider song duration and mine an additional artist past-performance feature. Artist past-performance for a given song represents how many prior Billboard hits the artist has released before that track’s release date
emphasis mine.
I wonder how accurate a model using this feature alone would be.
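One rough way to check would be a single-feature baseline: predict "hit" whenever the artist's count of prior Billboard hits reaches some cutoff, pick the cutoff on the training split, and score it on held-out songs. Below is a minimal sketch of that idea. It is not the authors' method, the function names are hypothetical, and the numbers are made-up toy values used only to show the mechanics, so the printed accuracy says nothing about the real dataset.

```python
def accuracy(threshold, prior_hits, is_hit):
    """Fraction of songs where the rule 'prior_hits >= threshold' matches the label."""
    correct = sum((x >= threshold) == y for x, y in zip(prior_hits, is_hit))
    return correct / len(is_hit)


def fit_threshold(prior_hits, is_hit):
    """Return the cutoff with the best training accuracy (lowest cutoff wins ties)."""
    # Try every observed count, plus one value above the max ("never predict hit").
    candidates = sorted(set(prior_hits)) + [max(prior_hits) + 1]
    return max(candidates, key=lambda t: accuracy(t, prior_hits, is_hit))


# Toy, invented data: artist's prior hit count per song, and whether the song charted.
train_prior_hits = [0, 0, 1, 0, 3, 5, 2, 0]
train_is_hit = [False, False, False, False, True, True, True, False]
test_prior_hits = [0, 4, 1, 2]
test_is_hit = [False, True, True, False]

cutoff = fit_threshold(train_prior_hits, train_is_hit)
test_accuracy = accuracy(cutoff, test_prior_hits, test_is_hit)
print(f"cutoff={cutoff}, held-out accuracy={test_accuracy:.2f}")
```

Comparing that held-out number against the full model's accuracy on the same split would show how much the audio features add beyond artist track record.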