Derivadow

Thoughts and observations on the open web, linked data, URIs and generally how technology and design can create great things for people to use.

Tag: wikipedia

The web as a CMS

If you run a website you’re going to want to manage your content. You might use an Enterprise CMS, an open source CMS, a blogging platform or a bespoke app, and as you might expect at the BBC the same rules apply. Except some of us have been trying out something a bit different — using the web as a content management system.

Coffee Shop Study by notashamed. Some rights reserved

As anyone who reads this blog will realise, I’m a bit of a Linked Data nut. I believe that the future of web design lies not in webpages but in URLs and resources. If you’re not sure what I’m on about then I suggest you read Tim Berners-Lee’s article on the Giant Global Graph or Tom Coates’s presentation Native to the Web of Data.

I and the teams working on BBC programmes and music believe that the web is wonderful because of links: because people can go on journeys of discovery, browsing by meaning, by following links to the things that interest them. We also believe that where webscale identifiers, ontologies, taxonomies and metadata already exist we should reuse them and link to them. This focus on URLs, resources and existing services has meant that the services we’ve been building are a bit different, as we’ve stated previously:

A core feature of the site is the integration with external services – notably MusicBrainz and Wikipedia. We are using these services to provide core information (discographies, biographical information, membership etc) about artists and releases. We are then combining this data with information from within the BBC – including details about which BBC programmes have played that artist.

The use of community, non-BBC, maintained databases obviously means that much of the data the BBC is publishing on bbc.co.uk/music/beta comes from external services. But what may not be immediately obvious is that the BBC isn’t forking this data, as others have done. Because that would be silly.

So if we want to create a new artist, or edit any of this content then rather than editing that data via some internal CMS we edit MusicBrainz or Wikipedia. Just like you or anyone else, as Nick puts it:

From now on if I want to indulge my love of Frank Sinatra, I’ll just edit the Wikipedia page, knowing it will turn up on the BBC. Collaboration is the future, and not just in music.

Indeed with MusicBrainz we’ve been actively contributing content since June 2007.

[…] In exchange, MusicBrainz receives a monthly license fee that will allow MetaBrainz to hire some engineering help in the coming months to work on new features and to improve the existing infrastructure. This is quite significant since MusicBrainz has been resource constrained for many months now — having paid people on staff will ensure a more reasonable amount of progress moving forward.

Even cooler, the BBC online music editors will soon participate in the MusicBrainz community contributing their knowledge to MusicBrainz. The goal is to have the BBC /music editorial team round out and add new information to MusicBrainz as they need to use it in their MusicBrainz enabled applications internally.

As of today the BBC music team has contributed over 2,800 edits to MusicBrainz. None of this is to say that we don’t need content management solutions for our own internal (meta)data; we clearly do. But it does mean that a significant proportion of these services draw on data from elsewhere on the web, so if we want to edit that data we edit the web, improving both those services and the ones provided by the BBC (including search).

By Tom Scott in BBC, BBC Programmes, Linked Data, Metadata, MusicBrainz, Semantic web, Technology, URL, Web development, Work | 13 January 2009 | 1 July 2011 | 622 Words | 22 Comments

The all new BBC music site where programmes meet music and the semantic web

We (well Nick, Patrick, Tony, Deanna, Sacha and Guy) have recently been working on a major rewrite of the BBC’s music site [beta] – and since it’s just gone live as a public beta I thought it would be a good time to explain a bit about what we’ve done and what we hope to do with the site. I would also love to hear what you think about what we’ve done so far – especially as we move from public beta towards a replacement for the current site.

Madonna's artist page on the BBC's new beta music site

Our work so far has focused on providing core information for every artist the BBC plays on our daytime radio network shows. That’s shows like Annie Mac and Chris Moyles on Radio 1, or Steve Lamacq on 6Music. We’ve focused on these shows because we’ve integrated the site with the radio playout system, VCS dira!, which is basically a set of giant iPods in the basement of Broadcasting House. Unfortunately the specialist shows, as well as national and local radio shows, don’t use this system, so we haven’t got track listing data in the right format for these programmes. But we have a plan and will be adding more shows in due course.

So what have we done so far?

For starters we decided that we wanted to have a single canonical page for every artist, because we want to aggregate everything we know about an artist at a single URL. But this means that we need unique, persistent, unambiguous URLs. Thankfully, as Michael has already discussed, MusicBrainz gives us unique identifiers that allow us to provide just such a set of URLs.

The core of the new site, then, is built around MusicBrainz. In addition to giving us web scale identifiers, MusicBrainz is also being used to give us core music metadata, e.g. discographies, related artists and related links. Some of those related links are Wikipedia links, and we are using those to go and fetch the introductory text for each artist’s biography from Wikipedia.

The approach we’re using to keep the Wikipedia data up to date is, I think, quite neat. Patrick has written a bot to monitor the Wikipedia IRC channel for updates – when it spots an update we fetch the new Wikipedia content. Oh and obviously all this data is rendered dynamically using the same MVC framework we are using for /programmes which means that updates happen almost instantaneously.
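To make the idea concrete, here is a minimal sketch of the filtering step such a bot might perform. It is not Patrick's code: the line format (a page title wrapped in double square brackets), the function names and the sample data are all assumptions for illustration, and the IRC connection and the actual fetch are left out.

```python
import re

# Assumption: each change-feed line names the edited page as [[Page title]].
# The real feed format may differ; this is a simplified, hypothetical shape.
TITLE_PATTERN = re.compile(r"\[\[(.+?)\]\]")


def changed_title(line):
    """Return the page title named in a change-feed line, or None."""
    match = TITLE_PATTERN.search(line)
    return match.group(1) if match else None


def pages_to_refetch(lines, watched_titles):
    """Collect watched titles mentioned in the feed, in first-seen order."""
    pending = []
    for line in lines:
        title = changed_title(line)
        # Ignore pages we do not use, and avoid queueing a page twice.
        if title in watched_titles and title not in pending:
            pending.append(title)
    return pending
```

The point of the design is that the site only refetches a biography when the feed says the page has changed, rather than polling every artist page on a timer.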

This brings me to the next major feature – integration with /programmes. As I’ve said, we have integrated with VCS to give us track listing information for our daytime radio shows – we are matching this data with both our internal database of programme metadata and MusicBrainz. This lets us know which radio stations and shows have played which artists.

We also want to make this data available for others to use and so have designed the site to provide a RESTful API, following the principles of Linked Data:

…namely thinking of URIs as more than just locations for documents. Instead using them to identify anything, from a particular person to a particular programme. These resources in-turn have representations, which can be machine-processable (through the use of RDF, Microformats, RDFa, etc.), and these representations can hold links towards further web resources, allowing agents to jump from one data-set to another.

If you would like to have a play with these we have RDF, XML, JSON and YAML representations of the resources – just add .xml, .rdf, .json or .yaml to the end of the artist URL.
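As a rough sketch of that convention, the helper below appends one of the four extensions to an artist URL. The function name and the example.org address are hypothetical placeholders, not the site's real URL scheme.

```python
# The four representation formats named in the post.
REPRESENTATION_FORMATS = ("xml", "rdf", "json", "yaml")


def representation_url(artist_url, fmt):
    """Build the URL of a machine-readable representation of an artist page."""
    if fmt not in REPRESENTATION_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Drop any trailing slash so the extension attaches to the last path segment.
    return f"{artist_url.rstrip('/')}.{fmt}"


# Hypothetical artist URL, used only to show the shape of the result.
print(representation_url("http://example.org/music/artists/some-artist-id", "json"))
```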

Where next?

Nick, Michael and I have previously spoken about our plans for linking programmes, music, events, topics and users. Well, this is our first foray into this world. Information about programmes and music is interesting, it’s useful; but it’s not as interesting nor as useful as when the two are intelligently linked. Joining the two worlds means that you can aggregate information about the programmes that have played an artist [as we’ve done], you can put track listings on episode pages, and you can have charts of which artists are played most on all BBC Radio programmes, on Radio 1, or by Zane Lowe. You can also aggregate all the episodes you can currently listen to that feature a given artist. Or show which programme first played a given artist and how often the BBC has played them since. Basically the interesting stuff happens at the joins between the nodes, because that’s where the context lives.
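The charts idea can be sketched in a few lines once each play is linked to both an artist and a programme. This is only an illustration under assumed data: the record shape (dictionaries with "artist", "station" and "programme" keys) and the function name are invented here, not the BBC's data model.

```python
from collections import Counter


def most_played(plays, station=None, programme=None, top=3):
    """Rank artists by number of plays, optionally within a station or programme."""
    counts = Counter(
        play["artist"]
        for play in plays
        if (station is None or play["station"] == station)
        and (programme is None or play["programme"] == programme)
    )
    # most_common returns (artist, play_count) pairs, highest count first.
    return counts.most_common(top)
```

The same joined records answer all three questions in the paragraph above: pass nothing for a chart across every programme, `station="Radio 1"` for a station chart, or `programme="Zane Lowe"` for a single show.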

By exposing the information that is created by joining programmes and music we can provide context and serendipity. We can help you find out about the music you’ve just listened to and introduce you to new shows that also feature the music you like. So that’s what we’re working on.

We also need to provide more data about each artist – from both inside and outside the BBC. That means bringing the album reviews into the fold, hooking up external news feeds and the like. But whatever we do it’s worth bearing in mind that this is on a much larger scale than the current music site. This new site has in the order of 388,398 artist pages, 157,677 external links and 93,912 artist to artist relationships.

It would be great to hear what you think about what we’ve done and our plans for the future. You are welcome to leave a comment here, or via the Backstage mailing list.

By Tom Scott in BBC Programmes, Information Architecture, Linked Data, Metadata, Music, MusicBrainz, Semantic web, URL, Web development, Work | 28 July 2008 | 14 April 2011 | 919 Words | 26 Comments