Semantic web – why bother?

Semantic Web technologies allow you to describe arbitrary concepts and the relationships between those concepts so that machines can process the associated data. But why would you bother? After all, Clay Shirky has suggested that the Semantic Web is useful almost nowhere.

To date, when the press picks up on the Semantic Web it tends to think of it as a search technology: a Google killer. Unfortunately that misses the point. For sure, I don’t know what will turn out to be the killer app, but I will be very surprised if it’s search. So if it’s not about search, what is the point of the Semantic Web? Well, Wikipedia describes its purpose as follows:

“Humans are capable of using the Web to carry out tasks such as finding the Finnish word for “cat”, reserving a library book, and searching for a low price on a DVD. However, a computer cannot accomplish the same tasks without human direction because web pages are designed to be read by people, not machines. The semantic web is a vision of information that is understandable by computers, so that they can perform more of the tedious work involved in finding, sharing and combining information on the web.”

Note that this doesn’t mention search once. Unfortunately we are all so used to Google being the dominant web application, and to search being the standard way in which we find documents and generally interface with the web, that we tend to frame the Semantic Web in the same terms: search. We tend to think of the Semantic Web as a way to get better search. But that’s not it – at least not in my opinion.

Sure, Semantic Web technologies such as SPARQL allow software engineers to construct complex queries across multiple, remote data repositories, but that’s different from end users searching for web documents. It’s the difference between SQL and Google.

What I think we will see with the birth of the Semantic Web is a split economy, with one half supplying structured, linked data to the cloud and the other consuming it and using it to build user-facing products.

If you are a publisher of linked data, as the BBC has started to be with its programme metadata, then you should focus on making your data available in a variety of open, machine-readable formats for other software engineers to consume. These software engineers can then use this raw data to build end-user-facing products, mashing up data from multiple sources.
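To make the mash-up idea a little more concrete, here is a minimal sketch in plain Python. It is not SPARQL and it is not how the BBC or anyone else actually publishes their data: the two data sets, the identifiers (`prog:b0001`, `artist:a42`) and the helper names `match` and `programmes_by_genre` are all made up for illustration. The only assumption that matters is that two independent publishers describe things as subject–predicate–object statements and happen to use the same identifier for the same artist.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# A statement is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

# Hypothetical data from a broadcaster: programmes and who features in them.
BROADCASTER: List[Triple] = [
    ("prog:b0001", "title", "Example Programme"),
    ("prog:b0001", "features", "artist:a42"),
    ("prog:b0002", "title", "Another Programme"),
    ("prog:b0002", "features", "artist:a7"),
]

# Hypothetical data from a separate music database, using the same artist IDs.
MUSIC_DB: List[Triple] = [
    ("artist:a42", "name", "Example Artist"),
    ("artist:a42", "genre", "jazz"),
    ("artist:a7", "name", "Other Artist"),
    ("artist:a7", "genre", "folk"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for subj, pred, obj in triples:
        if (
            (s is None or subj == s)
            and (p is None or pred == p)
            and (o is None or obj == o)
        ):
            yield (subj, pred, obj)


def programmes_by_genre(graph: Iterable[Triple], genre: str) -> List[str]:
    """Return sorted titles of programmes featuring an artist of the given genre."""
    graph = list(graph)  # allow several passes over the data
    titles = []
    for artist, _, _ in match(graph, p="genre", o=genre):
        for prog, _, _ in match(graph, p="features", o=artist):
            for _, _, title in match(graph, s=prog, p="title"):
                titles.append(title)
    return sorted(titles)


# "Mashing up" is just combining the two sets of statements.
graph = BROADCASTER + MUSIC_DB
print(programmes_by_genre(graph, "jazz"))
```

Neither publisher could answer "which programmes feature jazz artists?" on its own: the broadcaster knows nothing about genres and the music database knows nothing about programmes. The question only becomes answerable once the two sets of statements sit side by side and share identifiers, which is roughly the job that linked data and a query language like SPARQL do at web scale.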

As a publisher of linked data, not only are you enabling others to build cool things with your data, which helps you right now, you are also helping to insulate yourself against atrophy. If your data is as open and accessible as possible, then it is relatively straightforward for your future colleagues, working on an as-yet-unimagined product, to use the data you are publishing right now.

This, then, is the reason we should bother with the Semantic Web. By making data available for the machine overlords we will start to see data re-used by people who would not normally have access to it, so that they can delight end users with things that nobody originally thought of.

Photo: i know what you mean, by dullhunk. Used under licence.