The all-new BBC music site, where programmes meet music and the semantic web

We (well, Nick, Patrick, Tony, Deanna, Sacha and Guy) have recently been working on a major rewrite of the BBC’s music site [beta] – and since it’s just gone live as a public beta I thought it would be a good time to explain a bit about what we’ve done and what we hope to do with the site. I would also love to hear what you think about what we’ve done so far – especially as we move from public beta towards a replacement for the current site.

Madonna's artist page on the BBC's new beta music site

Our work so far has focused on providing core information for every artist the BBC plays on our daytime radio network shows. That’s shows like Annie Mac and Chris Moyles on Radio 1, or Steve Lamacq on 6Music. We’ve focused on these shows because we’ve integrated the site with the radio playout system, VCS dira!, which is basically a set of giant iPods in the basement of Broadcasting House. Unfortunately the specialist shows, as well as national and local radio shows, don’t use it, so we haven’t got track listing data in the right format for these programmes. But we have a plan and will be adding more shows in due course.

So what have we done so far?

For starters we decided that we wanted to have a single canonical page for every artist. We decided to do this because we want to aggregate everything we know about an artist at a single URL. But this means that we need unique, persistent, unambiguous URLs. Thankfully, as Michael has already discussed, Musicbrainz gives us unique identifiers that allow us to provide just such a set of URLs.
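To make the idea concrete, here is a minimal sketch in Python of turning an identifier into a canonical artist URL. It assumes the identifier is a UUID string; the base URL, the function name and the sample identifier are placeholders made up for illustration, not the site's real ones.

```python
import uuid

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://example.org/music/artists/"


def artist_url(identifier: str) -> str:
    """Return one canonical URL for an artist identifier.

    uuid.UUID() rejects anything that is not a well-formed UUID, and
    str() on the result always gives the lowercase, hyphenated form,
    so differently written copies of the same ID map to the same URL.
    """
    canonical = str(uuid.UUID(identifier))
    return BASE_URL + canonical


if __name__ == "__main__":
    # A made-up identifier, written in upper case on purpose.
    print(artist_url("12345678-1234-5678-1234-567812ABCDEF"))
```

Normalising the identifier before building the URL is what keeps the page unique: there is exactly one spelling of the URL per artist, however the ID arrived.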

The core of the new site then is built around Musicbrainz. In addition to giving us web-scale identifiers, Musicbrainz also supplies our core music metadata, e.g. discographies, related artists and related links. Some of those related links are Wikipedia links and we are using those to go and fetch the introductory text for each artist’s biography from Wikipedia.

The approach we’re using to keep the Wikipedia data up to date is, I think, quite neat. Patrick has written a bot to monitor the Wikipedia IRC channel for updates – when it spots an update we fetch the new Wikipedia content. Oh and obviously all this data is rendered dynamically using the same MVC framework we are using for /programmes which means that updates happen almost instantaneously.
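The update logic can be sketched roughly as follows. This is not Patrick's bot: it assumes something else has already read the IRC feed and extracted the title of the page that changed, and the function names, the `watched` mapping and the `refetch` callback are all hypothetical.

```python
from typing import Callable, Dict


def normalise_title(title: str) -> str:
    """Treat 'Some_Artist' and 'Some Artist' as the same page title."""
    return title.replace("_", " ").strip()


def on_page_changed(
    title: str,
    watched: Dict[str, str],
    refetch: Callable[[str, str], None],
) -> bool:
    """React to one 'page changed' notification.

    `watched` maps Wikipedia page titles to artist identifiers (assumed
    to have been built from the Wikipedia links in the music metadata).
    `refetch` is whatever fetches and stores the new introductory text.
    Returns True if the change was for a page we care about.
    """
    page = normalise_title(title)
    artist_id = watched.get(page)
    if artist_id is None:
        return False  # Not one of our artists, ignore the update.
    refetch(artist_id, page)
    return True


if __name__ == "__main__":
    watched = {"Example Artist": "artist-0001"}  # placeholder data
    on_page_changed("Example_Artist", watched, lambda a, p: print("refetch", a, p))
```

The point of the design is that nothing is polled: work only happens when an update arrives for a page that one of our artists links to.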

This brings me to the next major feature – integration with /programmes. As I’ve said we have integrated with VCS to give us track listing information for our daytime radio shows – we are matching this data with both our internal database of programme metadata and Musicbrainz. This lets us know which radio stations and shows have played which artists.
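As an illustration of that kind of join, here is a small Python sketch. The data structures are invented for the example: it assumes each play event carries a station, an artist name and a timestamp, that artists can be looked up by a normalised name, and that the schedule is a simple list of time slots. The real matching is certainly more involved.

```python
from collections import defaultdict
from datetime import datetime

# Placeholder data: the artist ID, show name and times are made up.
ARTIST_IDS = {"example artist": "artist-0001"}
SCHEDULE = [
    ("radio1", "Example Show", datetime(2008, 7, 28, 19, 0), datetime(2008, 7, 28, 21, 0)),
]


def show_at(station, when):
    """Return the show on air on `station` at `when`, or None."""
    for slot_station, show, start, end in SCHEDULE:
        if slot_station == station and start <= when < end:
            return show
    return None


def link_plays(plays):
    """Map each artist ID to the set of (station, show) pairs that played it.

    `plays` is an iterable of (station, artist_name, timestamp) tuples.
    Plays that cannot be matched on both sides are skipped.
    """
    played_on = defaultdict(set)
    for station, artist_name, when in plays:
        artist_id = ARTIST_IDS.get(artist_name.strip().lower())
        show = show_at(station, when)
        if artist_id and show:
            played_on[artist_id].add((station, show))
    return played_on


if __name__ == "__main__":
    plays = [("radio1", " Example Artist ", datetime(2008, 7, 28, 19, 30))]
    print(dict(link_plays(plays)))
```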

We also want to make this data available for others to use and so have designed the site to provide a RESTful API, following the principles of Linked Data:

…namely thinking of URIs as more than just locations for documents. Instead using them to identify anything, from a particular person to a particular programme. These resources in-turn have representations, which can be machine-processable (through the use of RDF, Microformats, RDFa, etc.), and these representations can hold links towards further web resources, allowing agents to jump from one data-set to another.

If you would like to have a play with these we have RDF, XML, JSON and YAML representations of the resources – just add .xml, .rdf, .json or .yaml to the end of the artist URL.
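In code, that suffix convention amounts to something like the sketch below. The artist URL in the example is a placeholder, and the helper name is made up.

```python
# The four representations mentioned above.
FORMATS = ("xml", "rdf", "json", "yaml")


def representation_url(artist_url: str, fmt: str) -> str:
    """Append a format suffix to an artist URL, e.g. '.json'."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Drop any trailing slash so we never produce '.../.json'.
    return f"{artist_url.rstrip('/')}.{fmt}"


if __name__ == "__main__":
    # Placeholder artist URL, for illustration only.
    page = "https://example.org/music/artists/12345678-1234-5678-1234-567812abcdef"
    for fmt in FORMATS:
        print(representation_url(page, fmt))
```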

Where next?

Nick, Michael and I have previously spoken about our plans for linking programmes, music, events, topics and users. Well this is our first foray into this world. Information about programmes and music is interesting, it’s useful; but it’s not as interesting nor as useful as when the two are intelligently linked. Joining the two worlds means that you can aggregate information about the programmes that have played an artist [as we’ve done], you can put track listings on episode pages, you can have charts of which artists are played most on all BBC Radio programmes, on Radio 1, by Zane Lowe. You can also aggregate all episodes that you can currently listen to that feature a given artist. Or show which programme first played a given artist and how often the BBC has played them since. Basically the interesting stuff happens at the joins between the nodes because that’s where the context lives.
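Once plays are linked to both a programme and an artist, a chart is little more than a filtered count. A minimal Python sketch, assuming each play is a record with `artist`, `station` and `show` fields (field names and sample data are invented for the example):

```python
from collections import Counter


def chart(plays, station=None, show=None, top=10):
    """Return the most played artists as (artist, play_count) pairs.

    With no filters this covers every play; pass `station` and/or
    `show` to narrow the chart to one network or one programme.
    """
    counts = Counter(
        play["artist"]
        for play in plays
        if (station is None or play["station"] == station)
        and (show is None or play["show"] == show)
    )
    return counts.most_common(top)


if __name__ == "__main__":
    # Placeholder plays, not real track listing data.
    plays = [
        {"artist": "Artist A", "station": "radio1", "show": "Show X"},
        {"artist": "Artist A", "station": "radio1", "show": "Show Y"},
        {"artist": "Artist B", "station": "radio1", "show": "Show X"},
        {"artist": "Artist B", "station": "6music", "show": "Show Z"},
        {"artist": "Artist B", "station": "6music", "show": "Show Z"},
    ]
    print(chart(plays))
    print(chart(plays, station="radio1"))
```

The same list of plays answers all three questions (everything, one station, one show); only the filter changes, which is the practical payoff of joining the data in the first place.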

By exposing the information that is created by joining programmes and music we can provide context and serendipity. We can help you find out about the music you’ve just listened to and introduce you to new shows that also feature the music you like. So that’s what we’re working on.

We also need to provide more data about each artist – from both inside and outside the BBC. That means bringing the album reviews into the fold, hooking up external news feeds and the like. But whatever we do it’s worth bearing in mind that this is on a much larger scale than the current music site. This new site has in the order of 388,398 artist pages, 157,677 external links and 93,912 artist to artist relationships.

It would be great to hear what you think about what we’ve done and our plans for the future. You are welcome to leave a comment here, or via the Backstage mailing list.