
Measuring Pandora Radio

May 19, 2010
The Big P Under the Scope

If you haven't heard of Pandora, you should have. Pandora is a customizable radio service that is mostly free with commercials, or $36 a year for a commercial-free version with additional features. I say "mostly" because they recently started giving you 40 free hours a month (with commercials), and if you go over they shut you down unless you pay $1. Since I'm very cheap, I use the free version and make sure I don't go over 40 hours a month.

Maybe I should back up for those who haven't used the service. Basically, you choose a type of music, a band, or a specific song as a channel seed. Pandora builds a channel around that seed. You can then add more seeds (if you wish) or vote songs up or down. This gives Pandora more info about what you like and don't like, further tailoring the channel to your tastes. Theoretically, you should end up with a channel that plays the songs you know you like and ones that you didn't know you'd like. How do they do this? The Music Genome Project:

On January 6, 2000 a group of musicians and music-loving technologists came together with the idea of creating the most comprehensive analysis of music ever.

Together we set out to capture the essence of music at the most fundamental level. We ended up assembling literally hundreds of musical attributes or "genes" into a very large Music Genome. Taken together these genes capture the unique and magical musical identity of a song - everything from melody, harmony and rhythm, to instrumentation, orchestration, arrangement, lyrics, and of course the rich world of singing and vocal harmony. It's not about what a band looks like, or what genre they supposedly belong to, or about who buys their records - it's about what each individual song sounds like.

Since we started back in 2000, we've carefully listened to the songs of tens of thousands of different artists - ranging from popular to obscure - and analyzed the musical qualities of each song one attribute at a time. This work continues each and every day as we endeavor to include all the great new stuff coming out of studios, clubs and garages around the world.

In fact, when a song is playing, you can look up why Pandora chose it. You'd be surprised at the level of detail. But you'll also notice that it isn't perfect. You'll still hear songs you don't like. One thing I've found, disconcertingly, is that all my Pandora channels end up sounding similar (if not the same). Is this because of my musical tastes, Pandora's selection, or something else? What I wondered was: which is more important, voting songs up or voting them down? Which makes the most difference? Deciding to take a somewhat systematic approach, I embarked on a voyage to discover how I could maximize my Pandora experience.

Method

First I had to decide how to measure Pandora. Since Pandora is designed to discern my musical tastes, I figured that whether I liked a song or not was a good enough metric. Pandora limits you to six skips per channel per hour, with a maximum of 12 a day across all channels. Since voting down also skips the song, I used six votes as the maximum input Pandora would get. The methodology is fairly simple and straightforward: create a channel, listen, decide for each song whether you like it or not, and record your preference. I'd have four different groups: Raw (no votes), six down votes, six up votes, and three up and three down votes. While you can make the argument that the final group should have been six up and six down, I thought this would give Pandora twice the input of the other groups and unfairly weight it. Lastly, I decided, quite arbitrarily, that I'd listen to 20 songs in a row after the aforementioned votes were applied and record whether I liked or disliked each song.

One thing I heavily debated was whether or not to include a "neutral" category for songs I would have let go by without a vote. Since this is a very real part of actual Pandora use (if I hate or love a song, I vote for it - if not I usually just let it go until I have an opinion of it), it would have been a valid category. But with an "n" of 1 (one participant - me) and only 20 data points (20 songs), a third category would leave even less usable data. As it is, by a simple two-sided sign test at the 5% level, I'd have to love or hate at least 15 of the 20 songs in a group for a statistically significant result, which seemed unlikely. Instead, I decided to think of each song as if I had no choice but to vote.

Obviously listening to this many songs (80 plus however many it took to get the number of up/down votes) takes a lot of time and wasn't something I could do in one sitting. When I needed a break I'd hit pause and come back. When I finished with one set of measurements, I deleted the channel and closed/reopened my browser before creating the same channel again. I ensured that none of my previous choices were carried over to the new channel in the "edit" screen. I chose to seed my channel on an artist whose music I tend to like.
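For anyone who wants to repeat the procedure, the bookkeeping can be sketched in a few lines of Python. This is only an illustration of the tally sheet described above, not a tool I actually used: the names `GROUPS`, `Run`, `rate`, and `tally` are made up for the example, and the order of likes and dislikes in the usage lines is invented (only per-group totals were recorded).

```python
from dataclasses import dataclass, field

SONGS_PER_RUN = 20  # each group is rated over 20 consecutive songs

# Votes applied before rating starts, as (up votes, down votes).
GROUPS = {
    "Raw": (0, 0),
    "Six Up": (6, 0),
    "Six Down": (0, 6),
    "Balanced (3/3)": (3, 3),
}


@dataclass
class Run:
    """One pass through a freshly created channel for a single group."""
    group: str
    ratings: list = field(default_factory=list)  # True = liked, False = disliked

    def __post_init__(self):
        if self.group not in GROUPS:
            raise ValueError(f"unknown group: {self.group}")

    def rate(self, liked):
        """Record a forced like/dislike choice for the next song."""
        if len(self.ratings) >= SONGS_PER_RUN:
            raise ValueError("run already has 20 ratings")
        self.ratings.append(bool(liked))

    def tally(self):
        """Return (liked, disliked) counts so far."""
        liked = sum(self.ratings)
        return liked, len(self.ratings) - liked


# Hypothetical usage: the sequence of ratings here is illustrative only.
run = Run("Six Down")
for liked in [True] * 16 + [False] * 4:
    run.rate(liked)
print(run.group, run.tally())
```

Forcing every song into like or dislike, with no neutral option, mirrors the choice described above; a third category would need a different record type.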

Testing

As you might surmise, this process wasn't as easy as I expected from the beginning. You only get a certain number of skips a day which I had to save for my down votes. It would have been a 10 minute process if I could have listened to a song long enough to know if I liked it or not (easy with songs you recognize) and skipped to the next. Instead, it took me about 2 days of listening to get it all done.

There are some issues (other than sample size/participant pool) with this study. First, Pandora's song selection process is not exactly transparent. There is no way of knowing what might affect the song pool. I can imagine that time of day or maybe even the number of other people listening to similar songs might play a part. There is also the issue of randomness. With only six selections (three of each type in the balanced group), the songs that were voted on should make a HUGE difference in what came next during the measurement phase. I strove to wait for songs that I really liked/hated for my votes, though time was an issue. I couldn't exactly wait for my absolute favorite or most hated songs to pop up. Anything that drew a strong reaction got a vote. As a side note, any song that I generally like but am getting a little tired of got an up vote. (Pandora has an "I'm tired of this song" option, which I love to use, that takes the song out of your lineup for a month.)

Of course, I had some preconceptions going into this test. It makes perfect sense that the Raw category (no votes) would probably hover just north of 'dislike'. After all, I put in a band I like so it should already have a clue about what types of music I like. I assume that 'up' voting a song would be more powerful than 'down' voting. I make this assumption because when you 'down' vote a song, it says it won't play that song any more but when you 'up' vote one it says it will play more songs like that one. The former sounds like you are excluding a single song while the latter sounds like you are making more fundamental changes to your channel. The balanced group (3 up, 3 down) was the real wild card in my mind. With so little input, how well could Pandora do?

Results

After two days of listening, these are my measurements:

Rating     Raw   Six Up   Six Down   Balanced (3/3)
Liked      10    8        16         14
Disliked   10    12       4          6

I was pleasantly surprised by the Raw results, as at least they were nice and middle of the road. Theoretically, left to its own devices, Pandora will guess what I like to listen to based on a single band 50% of the time. The next two results were a little more disconcerting. If I voted songs up, Pandora tended to pick fewer songs I liked. If I voted songs down, it tended to pick more songs I liked. With the balanced group, it tended to (as I had hoped) pick more songs I liked, but not as many as with the pure down vote method.
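To put rough numbers on those counts, here is a small sketch that converts each group to a percentage and runs an exact two-sided sign test against a coin flip (a 50/50 chance of liking any song). The choice of test is an assumption made for this example, it treats each group in isolation with no correction for making four comparisons, and the function name `sign_test_p` is invented here.

```python
from math import comb

# Liked / disliked counts from the results above (20 songs per group).
RESULTS = {
    "Raw": (10, 10),
    "Six Up": (8, 12),
    "Six Down": (16, 4),
    "Balanced (3/3)": (14, 6),
}


def sign_test_p(liked, total):
    """Exact two-sided p-value for `liked` out of `total` under a 50/50 null."""
    tail = min(liked, total - liked)
    # Probability of a result at least this lopsided in one direction.
    one_sided = sum(comb(total, k) for k in range(tail + 1)) / 2 ** total
    # Double it for the two-sided test, capping at 1.
    return min(1.0, 2 * one_sided)


for group, (liked, disliked) in RESULTS.items():
    total = liked + disliked
    p = sign_test_p(liked, total)
    print(f"{group:15s} {liked / total:4.0%} liked, p = {p:.3f}")
```

Under these assumptions only the Six Down group (p of about 0.012) strays far from a coin flip; Six Up comes out around 0.50 and Balanced around 0.12. With one listener and one run per group, that is a hint at best.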

Conclusion and Next Steps

What does this mean? Well, the sample size is small so it really doesn't mean much, but this is the Internet and we don't let little things like statistical invalidity stop us from drawing conclusions, do we? In the case of this test, we see that voting 'down' seems to have more of an effect than simply excluding a song. The 'up' vote result is confusing but is close enough to a 10/10 split that we're going to call that one a wash. Regardless, with less input in a single direction (Balanced), it tended toward picking songs I liked.

The last thing that came to my mind was to compare these results to a channel that I had already created. This channel has two artist seeds, two song seeds, and over 130 up votes and 195 down votes (no one who listens to AV Rant will be surprised to find I vote down more often than up). If any channel should know what I like, it is this one. I turned it on and rated the first 20 songs exactly as I had in the previous tests, and Pandora got 19 right. The one I didn't like, I didn't hate (I probably would have let it pass on a normal day), but based on how I was rating all the other songs I had to put it in the "Dislike" column. This shows that Pandora will, with enough input, create a channel that you'll like.

The original question still stands: which is more important, up votes or down votes? The next step in this process would be to do the experiment again. I'm convinced that with a large enough sample size the methodology is sound. What we need is a couple of hundred volunteers to do what I just did. Pick a band, make a station, and do one or more of the tests above. Send your results to me here (tom at Audioholics dot com) and I'll throw them together into an analyzable data set and post the results at a later date. Happy listening!

 

About the author:
As Associate Editor at Audioholics, Tom promises to the best of his ability to give each review the same amount of attention, consideration, and thoughtfulness as possible and keep his writings free from undue bias and preconceptions. Any indication, either internally or from another, that bias has entered into his review will be immediately investigated. Substantiation of mistakes or bias will be immediately corrected regardless of personal stake, feelings, or ego.
