{"id":2081,"date":"2018-04-08T07:49:37","date_gmt":"2018-04-08T06:49:37","guid":{"rendered":"http:\/\/www.andrewj.com\/blog\/?p=2081"},"modified":"2018-04-08T08:16:01","modified_gmt":"2018-04-08T07:16:01","slug":"inferring-algorithms-how-random-is-your-music-player","status":"publish","type":"post","link":"https:\/\/www.andrewj.com\/blog\/2018\/inferring-algorithms-how-random-is-your-music-player\/","title":{"rendered":"Inferring Algorithms: How Random is Your Music Player?"},"content":{"rendered":"<p align=\"center\"><em>&#8220;You&#8217;re Inferring that I&#8217;m stupid.&#8221;<\/em><\/p>\n<p align=\"center\"><em>&#8220;No, I&#8217;m implying that you&#8217;re stupid. You&#8217;re inferring it.&#8221;<\/em><\/p>\n<p align=\"center\"><em>&#8211; Wilt, by Tom Sharpe<\/em><\/p>\n<p>My latest contract means spending some time on a bus at each end of the day. The movement of the bus means it&#8217;s not comfortable to read, so I treated myself to a nearly new pair of decent Bluetooth headphones, and rediscovered the joys of just listening to music. I set the default music player app to &#8220;random&#8221; and let it do its stuff.<\/p>\n<p>That&#8217;s when the trouble started. I began thinking about the randomisation algorithm used by the music player on the Sony phone. I can&#8217;t help it. I&#8217;m a software architect &#8211; it&#8217;s what I do.<\/p>\n<p>One good music randomisation algorithm would look like this:<\/p>\n<ol>\n<li>Assign every song on your device a number from 1 to n<\/li>\n<li>When you want to play a random song, generate a random number between 1 and n, and play the song with that number.<\/li>\n<\/ol>\n<p>However, in my experience no one ever implements this, as it relies on maintaining an index of all the music on the device, and assigning sequential numbers to it. 
That&#8217;s not actually very difficult, given that every platform indexes the music anyway and a developer can usually access that data, but it&#8217;s not the path of least resistance.<\/p>\n<p>Let&#8217;s also say a word about generating random numbers. In reality these are always pseudo-random, and depending on how you seed the generator the values may be predictable. That may be the case with Microsoft&#8217;s software for picking desktop backgrounds, which seems to pick the same picture simultaneously on my laptop and desktop more often than I&#8217;d expect, but that&#8217;s a topic for another blog post, so for now let&#8217;s assume that we can generate an acceptably random spread of pseudo-random numbers in a given integer range.<\/p>\n<p>Here&#8217;s another algorithm:<\/p>\n<ol>\n<li>Start in the top directory for the music files<\/li>\n<li>Pick an item from that directory at random. Depending on the type:\n<ul>\n<li>If it&#8217;s a music file, play it. When finished, start again at step 1<\/li>\n<li>If it&#8217;s a directory, make it your target and redo step 2<\/li>\n<li>If it&#8217;s anything else, just repeat step 2<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p>This is easy to implement, runs quickly and plays nicely with independently changing media files. I&#8217;ve written something similar for displaying random pictures on a website. It doesn&#8217;t require maintaining any sort of index. It generates a good spread of chosen files, but because every item in a directory is equally likely to be picked, regardless of how many tracks sit beneath it, it will play albums which sit alone under their first-level directory (usually the artist) much more often than those which have multiple siblings.<\/p>\n<p>My old VW Eos had a neat but very different system. Like most players it could work through the entire catalogue in order, spidering up and down the directory structure as required. 
In &#8220;random&#8221; mode it simply calculated a number from 1 to approximately 30 after each song, and used that as the number of songs to skip forwards in the sequence.<\/p>\n<p>This was actually quite a good algorithm. As well as being easy to implement it had the side-effect of being at least partially predictable, usually playing a couple of songs by the same artist before moving on, and allowing a bit of &#8220;what&#8217;s next&#8221; guesswork which could be entertaining on a long drive.<\/p>\n<p>So what about the Sony music app on my phone? At first it felt like it was doing the job well, providing a good mix of genres, but after a while I started to become suspicious. As it holds the playlist in a readable form, I could check that suspicion. These are key highlights from the playlist after about 40 songs:<\/p>\n<ul>\n<li>1 from ZZ Top<\/li>\n<li>1 from &#8220;Zumba&#8221;<\/li>\n<li>3 from Yazoo!<\/li>\n<li>1 from Wild Cherry<\/li>\n<li>1 from Wet Wet Wet<\/li>\n<li>Several from &#8220;Various Artists&#8221; with album titles like &#8220;The Very Best&#8230;&#8221;<\/li>\n<li>0 from any artist filed under A-S!<\/li>\n<\/ul>\n<p>I wasn&#8217;t absolutely sure about the last point. What about Acker Bilk and Louis Armstrong? Turns out they are both on an album entitled &#8220;The Very Best of Smooth Jazz&#8221;&#8230;<\/p>\n<p>I can also look ahead at the list, and it doesn&#8217;t get much better. Van Morrison, Walter Trout, The Walker Brothers, and more Wet Wet Wet \ud83d\ude41<\/p>\n<p>So how does this algorithm work (apart from &#8220;badly&#8221;)? 
I have a few hypotheses:<\/p>\n<ul>\n<li>It implements a form of the &#8220;give every track a number&#8221; algorithm, but the index only remembers a fixed number of tracks, somewhere from a few hundred to maybe ~1000, and anything it read earlier in the indexing process is discarded.<\/li>\n<li>It implements the &#8220;give every track a number&#8221; algorithm, but the random number generator is heavily biased towards the end of the number range.<\/li>\n<li>It&#8217;s attempting a &#8220;random walk&#8221;, skipping a random number of steps forwards or backwards through the list at each play (a bit like the VW algorithm, but bidirectional). If this is correct it&#8217;s odd that it has never gone into &#8220;positive&#8221; territory (artists beginning with A-S), but that could be down to chance and is not impossible. The problem is that without a definite bias a random walk tends to stay close to where it started, so it&#8217;s a very poor way of scanning your music collection.<\/li>\n<\/ul>\n<p>Otherwise I&#8217;m at a loss. It&#8217;s not like I have a massive number of songs and could have run into an integer size limit or similar (there are only around 11,000 files, including directories and artwork).<\/p>\n<p>Ultimately it doesn&#8217;t matter that much. I can live with it for a while and I can probably resolve the issue by downloading another music player app. However, you can&#8217;t help feeling that a giant of entertainment technology like Sony should probably manage better.<\/p>\n<p>Regardless of that, it&#8217;s an interesting exercise in analysis, and also potentially in design. Having identified some poor models, what constitutes a &#8220;good&#8221; random music player? 
I&#8217;ve seen some good concepts around grouping songs by &#8220;mood&#8221;, or machine learning from previous playlists, and I&#8217;ve got an idea forming in my head about an app being more like a radio DJ, looking for &#8220;links&#8221; between the songs in terms of their artist names, titles or genres. Maybe that&#8217;s the next development concept. Watch this space.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;You&#8217;re Inferring that I&#8217;m stupid.&#8221; &#8220;No, I&#8217;m implying that you&#8217;re stupid. You&#8217;re inferring it.&#8221; &#8211; Wilt, by Tom Sharpe My latest contract means spending some time on a bus at each end of the day. The movement of the bus &hellip; <a href=\"https:\/\/www.andrewj.com\/blog\/2018\/inferring-algorithms-how-random-is-your-music-player\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9,2],"tags":[],"_links":{"self":[{"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/posts\/2081"}],"collection":[{"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/comments?post=2081"}],"version-history":[{"count":0,"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/posts\/2081\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/media?parent=2081"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\/categories?post=2081"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.andrewj.com\/blog\/wp-json\/wp\/v2\
/tags?post=2081"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}