Some thoughts on rNews

IPTC are working on an ontology known as rNews which aims to standardise (and encourage the adoption of) RDFa in news articles.

This is a very, very good idea – it should allow for better content discovery, new ways to aggregate news stories about people, places or subjects and generally allow computers to help people process some of the structured information behind a story.

Newspaper by Luc De Leeuw

rNews is still in draft. At the time of writing the published spec is at version 0.1. There are clearly ambitions to build on this work, and it will be interesting to see where it goes.

Although I’m sure much of this has been thought about before, I thought I would jot down my initial thoughts on this early draft.

More URIs please

The current spec makes extensive use of xsd:string and xsd:double to assign attributes to a class. For example, the Location Class includes attributes for longitude, latitude and altitude but no URIs for places.

Using URIs to name places (and people, subjects, organisations etc.) would allow for much more interesting things to be done with the data.

It would make it easier to aggregate content from more than one news outlet and generally link things together by location, person and area of interest.

There’s obviously an issue here – there needs to be a good source of URIs for places – but in reality there are lots of candidates out there, from dbpedia to geonames.
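To make the aggregation point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the article URLs are made up, the `place_label` and `place_uri` field names are hypothetical rather than rNews properties, and dbpedia-style URIs simply stand in for whatever source of place URIs is chosen.

```python
from collections import defaultdict

# Hypothetical articles from two outlets. The article URLs and field names
# are invented for this sketch; they are not part of the rNews spec.
ARTICLES = [
    {"url": "http://news.example.org/a/1",
     "place_label": "Cairo",
     "place_uri": "http://dbpedia.org/resource/Cairo"},
    {"url": "http://paper.example.com/story/9",
     "place_label": "Cairo, Egypt",
     "place_uri": "http://dbpedia.org/resource/Cairo"},
    {"url": "http://paper.example.com/story/10",
     "place_label": "Cairo",
     "place_uri": "http://dbpedia.org/resource/Cairo,_Illinois"},
]


def group_by(articles, key):
    """Group article URLs by the value stored under `key`."""
    groups = defaultdict(list)
    for article in articles:
        groups[article[key]].append(article["url"])
    return dict(groups)


# Grouping on the free-text label merges two different Cairos and
# splits the Egyptian one across two spellings.
by_label = group_by(ARTICLES, "place_label")

# Grouping on the URI keeps the two places apart and joins both outlets'
# coverage of the same city.
by_uri = group_by(ARTICLES, "place_uri")
```

With strings, the two outlets only line up if they happen to spell the place the same way; with URIs, they line up whenever they mean the same place.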

Greater reuse of existing vocabularies

There are existing vocabularies that describe some of the classes defined in rNews – notably FOAF and Dublin Core.

I would prefer rNews to reuse those vocabularies, or at least link to them (owl:sameAs).

I’m not a fan of tags

I don’t really like “tagging”: it lacks semantics and is extremely ambiguous.

If I tag a news story am I claiming it’s primarily about that thing, features that thing, also about that thing, what? And whatever you think it means I guarantee I can find someone else who disagrees!

I would rather see more defined predicates such as primarilyAbout etc. I recognise this would add a bit of complexity but it would also increase the utility of the vocabulary.

If the intention is to aid discoverability through categorisation then use SKOS.
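As a rough sketch of what more defined predicates buy you, the snippet below holds statements as plain (subject, predicate, object) tuples. The `http://example.org/vocab#` namespace, the `mentions` predicate and the story URL are all hypothetical; only the name `primarilyAbout` comes from the suggestion above, and none of this is taken from the rNews spec.

```python
# Hypothetical namespace and story URL, used only for this sketch.
EX = "http://example.org/vocab#"
STORY = "http://news.example.org/a/1"
EGYPT = "http://dbpedia.org/resource/Egypt"
TWITTER = "http://dbpedia.org/resource/Twitter"

# With a bare tag, both of these would just be "tagged with".
# Distinct predicates record which thing the story is really about.
triples = [
    (STORY, EX + "primarilyAbout", EGYPT),
    (STORY, EX + "mentions", TWITTER),
]


def stories_with(triples, predicate, thing):
    """Return the stories linked to `thing` by exactly this predicate."""
    return [s for (s, p, o) in triples if p == predicate and o == thing]


# Stories that are primarily about Egypt.
about_egypt = stories_with(triples, EX + "primarilyAbout", EGYPT)

# The same story only mentions Twitter, so it is not returned here.
about_twitter = stories_with(triples, EX + "primarilyAbout", TWITTER)
```

A consumer can still treat both predicates as a loose "related to" when that is all it needs, but it cannot recover the distinction from a tag that never recorded it.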

Explicit predicates for source materials

I think it’s really important to explicitly link to source material, especially for science and medicine (it’s why Nature News has always done so).

A simple set of predicates for the DOI, abstract URI, scientist/researcher of the original research and/or a URI for the raw data should suffice.

Again, it would also help if there was a handy source of URIs for scientists.
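One possible shape for those predicates, sketched in Python under stated assumptions: the `http://example.org/vocab#` namespace and the predicate names (`sourceDoi`, `sourceAbstract`, `sourceResearcher`, `sourceData`) are invented for illustration, and the DOI and URLs in the usage lines are placeholders rather than real identifiers.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical namespace; these predicate names are not defined by rNews.
EX = "http://example.org/vocab#"


@dataclass
class SourceMaterial:
    """The original research a news story reports on."""
    doi: Optional[str] = None
    abstract_uri: Optional[str] = None
    researcher_uris: List[str] = field(default_factory=list)
    data_uri: Optional[str] = None


def to_triples(story_uri: str, source: SourceMaterial) -> List[Tuple[str, str, str]]:
    """Turn the source details into (subject, predicate, object) statements."""
    triples = []
    if source.doi:
        # A DOI can be written as a URI by prefixing the doi.org resolver.
        triples.append((story_uri, EX + "sourceDoi", "https://doi.org/" + source.doi))
    if source.abstract_uri:
        triples.append((story_uri, EX + "sourceAbstract", source.abstract_uri))
    for researcher in source.researcher_uris:
        triples.append((story_uri, EX + "sourceResearcher", researcher))
    if source.data_uri:
        triples.append((story_uri, EX + "sourceData", source.data_uri))
    return triples


# Placeholder values only: the DOI and URLs below do not point at real research.
source = SourceMaterial(
    doi="10.1234/example",
    researcher_uris=["http://people.example.org/jane-doe"],
    data_uri="http://data.example.org/raw/42",
)
statements = to_triples("http://news.example.org/a/1", source)
```

Every field is optional, so a story can link to as much or as little of the source material as the journalist has to hand.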

Should the story be at the heart of the ontology?

I’ve always thought of news stories as metadata about real world events.

If you reframe the problem in this way then what you really want are predicates to describe the relationship of the story (article, photo, video) to the event. You also then want links between people & places and those events (which could be inferred through the various news stories).

Building the ontology this way round would allow for some very powerful analysis and discovery of stories.
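To illustrate the event-centred framing, here is a small sketch of how links between an event and the people and places involved might be inferred from the stories that report it. The predicate names (`reports`, `mentionsPerson`, `mentionsPlace`, `involves`), the namespace and all the story, event and person URIs are hypothetical; this is one way the idea could be modelled, not how rNews does it.

```python
# Hypothetical vocabulary for this sketch only.
EX = "http://example.org/vocab#"
REPORTS = EX + "reports"
MENTIONS_PERSON = EX + "mentionsPerson"
MENTIONS_PLACE = EX + "mentionsPlace"
INVOLVES = EX + "involves"

EVENT = "http://events.example.org/e/protest-2011"
STORY_A = "http://news.example.org/a/1"
STORY_B = "http://paper.example.com/story/9"
PERSON = "http://people.example.org/jane-doe"
PLACE = "http://dbpedia.org/resource/Cairo"

# Two stories from different outlets report the same event;
# one names a person, the other names a place.
triples = [
    (STORY_A, REPORTS, EVENT),
    (STORY_B, REPORTS, EVENT),
    (STORY_A, MENTIONS_PERSON, PERSON),
    (STORY_B, MENTIONS_PLACE, PLACE),
]


def infer_event_links(triples):
    """Link each event to the people and places its stories mention."""
    inferred = set()
    for story, predicate, event in triples:
        if predicate != REPORTS:
            continue
        for s, p, o in triples:
            if s == story and p in (MENTIONS_PERSON, MENTIONS_PLACE):
                inferred.add((event, INVOLVES, o))
    return inferred


event_links = infer_event_links(triples)
```

Neither story on its own connects the person to the place, but once both hang off the same event the connection falls out of the data.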

Anyway – I’ll be really interested to see how the ontology develops and how widely it gets adopted.

Comments

7 Comments so far. Leave a comment below.
  1. Thanks Tom for these comments! As IPTC folks have already answered you briefly on Twitter, rNews v0.5 (already staged) will be released next week and addresses most of your concerns: there will be many more URIs everywhere (location, people, organization, topics), the Tag class has been replaced by a Concept class, etc.

    rNews has decided to mint new URIs in its own namespace (Facebook or Google approach) but at the same time provide official mappings to common vocabularies (FOAF, DC, Geo, etc.). Those will be materialized with owl:equivalentClass, owl:equivalentProperty or subClassOf properties.

    rNews 0.5 will still not be the final version of rNews, and we already know what still needs to be improved, but we will welcome your feedback.

    • rNews 0.5 sounds good to me! Be interested to see how you’re dealing with URIs for locations, people and organisations – are you recommending a source of URIs etc.?

      Out of interest, why did you decide to mint URIs in your own namespace? I’m curious what motivated you/what advantages you think it gives you?

      Looking forward to seeing the next version.

      • News agencies do manage their own taxonomies for people and organizations and I think they will continue to do so. There will be (other) people discovering and publishing a bunch of owl:sameAs links with other encyclopedic knowledge bases à la DBpedia. The purpose of rNews is to improve news agencies’ workflow and exchange, mainly a B2B scenario I think.

        I’m personally quite neutral between minting new URIs and directly re-using existing URIs when they exist, as long as the vocabularies are also interconnected when new URIs are minted. I see pros and cons for both approaches. When you have only URIs in your domain, you don’t have to be afraid that FOAF might disappear in the future. It gives you a bit more control over your metadata. The links between vocabularies enable the interoperability you’re looking for …

        • Thanks for getting back so quickly.

          I didn’t realise that news agencies published (publicly addressable) URIs for their taxonomies. Not that I thought rNews should mint its own, but it would be good if there was a mechanism to cross link the New York Times’s “Egypt” with the Guardian’s URI for Egypt and the BBC’s. Clearly rNews is a big part of the equation in getting this to happen.

          I guess I lean towards reuse – but perhaps that’s because I’m lazy :)

          Good luck with rNews.

  2. “I’ve always thought of news stories as metadata about real world events.”

    Simply love that line. I have never thought about news that way, opens up interesting possibilities, so thanks for that.

    Also agree with your view that providing predicates would allow for ‘analysis and discovery’. But I have found it a little difficult to move beyond intuitively understanding this, which basically means I know it is feasible but cannot figure out how, nor have I seen anyone get this right.

    Mind you the discovery bit is doable. Amazon’s recommendation system accomplishes this, perhaps without the rigor of semantic data/vocab models.

    But the analysis is tricky. Could you provide an example of how semantically marked up news, say via rNews, will allow for analysis?

    • From my own work the discovery bit can be seen (in a simple way) on pages like this: Polar Bear page, if you look at the RDF.

      You can see in both the UI and in the model that there are predicates that define the habitats where this species lives, the adaptations it has to live in those environments etc.

      The model and the links help you understand more about the animals whether or not you dereference those URIs.

      You could also use a SPARQL endpoint to query all the data to e.g. return you only carnivorous (adaptation) polar (habitat) mammals (taxonomic rank).

  3. Zac Bjelogrlic,

    Very nice Tom. I like the ideas. More later.
