On the web, I reckon there’s only metadata and URIs, or perhaps there’s no metadata and only data. Either way, the distinction between metadata and data (or content) isn’t helpful.

Linked Data allows you to bind HTTP URIs to an object and to information about that object. This matters because it’s more useful to talk about real world things (people, places and events), the things that people actually think about. Despite this, I have had numerous conversations over the years about what ‘metadata’ to use to describe a document. Typically what this really means is: “what keywords to use so that some technomagical solution can use that ‘metadata’ to personalise or recommend content”.

Self-portraiture + metadata by Saltatempo's. Some rights reserved

Beyond the obvious problem (keywords on their own are never going to achieve the sorts of solutions non-technical people imagine), it also forces an unhelpful schism. It makes people think about their content and your metadata as separate things, or that metadata is somehow outwith the content they are creating. The trouble is that one person’s data is another person’s metadata. Is the title of a story metadata or content? Is a news story content or metadata about a real world event? The answer depends on your perspective.

It seems to me that a more useful way to think about things is to have URIs that identify things, and then have information/documents/data/metadata/whatever that make assertions about those things. Sometimes those bits of information will be simple data points: for an album release, for example, they might include who performed or wrote the piece (linking, obviously, to the URIs that identify the performer or writer, with appropriate predicates). Other bits of metadata might be more verbose, such as reviews of the album or the lyrics. And then again some might be media, such as recordings of the album.

And of course, because we’re talking about a graph of data, those documents making assertions about a thing can in turn have metadata/data/documents which make assertions about them: who wrote them, for example, or comments about them.
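
As a rough sketch of the idea in the last two paragraphs, assume plain Python tuples stand in for subject, predicate, object statements. Every example.org URI and every predicate name here is made up for illustration; a real system would more likely use an RDF library and established vocabularies.

```python
# Each assertion is a (subject, predicate, object) triple.
# All URIs and predicate names below are hypothetical.
ALBUM = "http://example.org/albums/1#album"
ARTIST = "http://example.org/people/7#person"
REVIEW = "http://example.org/reviews/42"
CRITIC = "http://example.org/people/9#person"

triples = [
    # Simple data points about the album
    (ALBUM, "performedBy", ARTIST),
    (ALBUM, "title", "An Example Album"),
    # A more verbose document that makes an assertion about the album
    (REVIEW, "reviews", ALBUM),
    (REVIEW, "text", "A longer, more verbose piece of 'metadata'."),
    # An assertion about the document itself
    (REVIEW, "writtenBy", CRITIC),
]


def about(uri, graph):
    """Return every triple that mentions `uri` as subject or object."""
    return [t for t in graph if t[0] == uri or t[2] == uri]


def describe(uri, graph):
    """Return the (predicate, object) pairs asserted with `uri` as subject."""
    return [(p, o) for s, p, o in graph if s == uri]
```

Here `about(ALBUM, triples)` returns the performer, the title and the review, while `describe(REVIEW, triples)` returns what is said about the review itself. Nothing in the structure marks a triple as ‘data’ or ‘metadata’; which one it is depends on the URI you start from.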

Imagine what might happen if a news website took this approach. You would mint a URI for the event (or reuse one that already existed) and then write news stories about it, each with their own URL, each making assertions about that event. It would create a news service which was truly native to the Web, rather than a facsimile of the printed press. Imagine then what it would be like if we could link up all the news stories on the web which also made assertions about that event. As a user of such a site, or set of sites, I could find everything about a given thing (a person, event or place).
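
To make that concrete, here is a small sketch, assuming stories from several sites have already been gathered as (story URL, predicate, thing URI) statements. The site names, URIs and the `about` predicate are all hypothetical.

```python
from collections import defaultdict

# Hypothetical URI minted for a real world event
EVENT = "http://example.org/events/1#event"
PERSON = "http://example.org/people/3#person"

# Statements collected from different (made-up) news sites
assertions = [
    ("http://news-a.example/stories/1", "about", EVENT),
    ("http://news-b.example/stories/results", "about", EVENT),
    ("http://news-a.example/stories/2", "about", PERSON),
]


def index_by_thing(graph):
    """Group story URLs by the URI of the thing each story is about."""
    index = defaultdict(list)
    for story, predicate, thing in graph:
        if predicate == "about":
            index[thing].append(story)
    return dict(index)


stories_by_thing = index_by_thing(assertions)
```

Looking up `stories_by_thing[EVENT]` gives both stories about the event, regardless of which site published them. The shared event URI does the joining, which only works if the sites agree on, or can map between, the URIs they use.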

Of course, as Dan Brickley put it:

concepts and events are still social and technological artefacts, but they are designed to help interconnect descriptions of butterflies, documents (and data) about butterflies, and people with interest or expertise relating to butterflies.

In other words what matters is a way of identifying things, a way of interconnecting them and a way of describing them — subdividing those ways of describing them into ‘data’ and ‘metadata’ is unhelpful, or at the very least adds nothing useful.

It is, however, useful to separate our concept of something from our conception of it. As Steven Pinker puts it:

…if you look up William Shakespeare in a dictionary it says “English playwright, lived in the 17th century, wrote Romeo and Juliet and Hamlet, etc.” Is that what the name William Shakespeare means, and is that what the concept William Shakespeare is? That sounds plausible, but it turns out not to be true. If we were to learn that William Shakespeare didn’t write any of the plays attributed to him — let’s say that we learned he didn’t even live in Stratford, that there was a clerical error and he really lived in Warwick. He would still be William Shakespeare, and we wouldn’t posthumously dub the real author of Shakespeare’s plays William Shakespeare. We would just say we were mistaken about what we believed about William Shakespeare.

So what is the concept of William Shakespeare, the meaning of the word William Shakespeare? Basically, when Mr. and Mrs. Shakespeare christened their son William, and the name stuck, and then everyone who knew him, and then who knew someone else, who knew someone else, and passed it down to us — that unbroken chain of transmission of the name from the moment of first dubbing is what gives William Shakespeare its meaning. There’s a sense in which to have a concept necessarily means to be connected to the world through this chain of transmission of a name going back to the moment of first dubbing.

So while I don’t think it’s helpful to separate data from metadata, it is helpful to separate concept from conception.