Inferring Algorithms: How Random is Your Music Player?

“You’re Inferring that I’m stupid.”

“No, I’m implying that you’re stupid. You’re inferring it.”

– Wilt, by Tom Sharpe

My latest contract means spending some time on a bus at each end of the day. The movement of the bus means it’s not comfortable to read, so I treated myself to a nearly new pair of decent BlueTooth headphones, and rediscovered the joys of just listening to music. I set the default music player app to “random” and let it do its stuff.

That’s when the trouble started. I started thinking about the randomisation algorithm used by the music player on the Sony phone. I can’t help it. I’m a software architect – it’s what I do.

One good music randomisation algorithm would look like this:

  1. Assign every song on your device a number from 1 to n
  2. When you want to play a random song, generate a random number between 1 and n, and play the song with that number.

However in my experience no-one ever implements this, as it relies on maintaining an index of all the music on the device, and assigning sequential numbers to it. That’s not actually very difficult, given that every platform indexes the music anyway and a developer can usually access that data, but it’s not the path of least resistance.

Let’s also say a word about generating random numbers. In reality these are always pseudo-random, and depending on how you seed the generator the values may be predictable. That may be the case with Microsoft’s software for picking desktop backgrounds, which seems to pick the same picture simultaneously on my laptop and desktop more often than I’d expect, but that’s a topic for another blog, so for now let’s assume that we can generate an acceptably random spread of pseudo-random numbers in a given integer range.

Here’s another algorithm:

  1. Start in the top directory for the music files
  2. Pick an item from that directory at random. Depending on the type:
    • If it’s a music file, play it. When finished, start again at step 1
    • If it’s a directory, make it your target and redo step 2
    • If it’s anything else, just repeat step 2

This is easy to implement, runs quickly and plays nicely with independently changing media files. I’ve written something similar for displaying random pictures on a website. It doesn’t require maintaining any sort of index. It generates a good spread of chosen files, but will play albums which are alone under the first level root (usually the artist) much more than those which have multiple siblings.

My old VW Eos had a neat but very different system. Like most players it could work through the entire catalogue in order, spidering up and down the directory structure as required. In “random” mode it simply calculated a number from 1 to approximately 30 after each song, and used that as the number of songs to skip forwards in the sequence.

This was actually quite a good algorithm. As well as being easy to implement it had the side-effect of being at least partially predictable, usually playing a couple of songs by the same artist before moving on, and allowing a bit of “what’s next” guesswork which could be entertaining on a long drive.

So what about the Sony music app on my phone? At first it felt like it was doing the job well, providing a good mix of genres, but after a while I started to become suspicious. As it holds the playlist in a readable form, I could check that suspicion. These are key highlights from the playlist after about 40 songs:

  • 1 from ZZ top
  • 1 from “Zumba”
  • 3 from Yazoo!
  • 1 from Wild Cherry
  • 1 from Wet Wet Wet
  • Several from “Various Artists” with album titles like “The Very Best…”
  • 0 from any artist filed under A-S!

I wasn’t absolutely sure about the last point. What about Acker Bilk and Louis Armstrong? Turns out they are both on an album entitled “The Very Best of Smooth Jazz”…

I can also look ahead at the list, and it doesn’t get much better. Van Morrison, Walter Trout, The Walker Brothers, and more Wet Wet Wet 🙁

So how does this algorithm work (apart from “badly”)? I have a couple of hypotheses:

  • It implements a form of the “give every track a number” algorithm, but the index only remembers a fixed number of tracks numbering a few hundreds (maybe ~1000), and anything it read earlier in the indexing process is discarded.
  • It implements the “give every track a number algorithm”, but the random number generator is heavily biased towards the end of the number range.
  • It’s attempting a “random walk”, skipping a random number of steps forwards or backwards through the list at each play (a bit like the VW algorithm, but bidirectional). If this is correct it’s odd that it has never gone into “positive” territory (artists beginning with A-S), but that could be down to chance and not impossible. The problem is that without a definite bias a random walk tends to stay in the same place, so it’s a very poor way of scanning your music collection.

Otherwise I’m at a loss. It’s not like I have a massive number of songs and could have run into an integer size limit or similar (there are only around 11,000 files, including directories and artwork).

Ultimately it doesn’t matter that much. I can live with it for a while and I can probably resolve the issue by downloading another music player app. However you can’t help feeling that a giant of entertainment technology like Sony should probably manage better.

Regardless of that, it’s an interesting exercise in analysis, and also potentially in design. Having identified some poor models, what constitutes a “good ” random music player? I’ve seen some good concepts around grouping songs by “mood”, or machine learning from previous playlists, and I’ve got an idea forming in my head about an app being more like a radio DJ, looking for “links” between the songs in terms of their artist names, titles or genres. Maybe that’s the next development concept. Watch this space.

Leave a Reply

Your email address will not be published. Required fields are marked *