…is a blog by Tom Scott: a place where I ramble about my thoughts and observations on the open web, linked data, URIs and generally how technology and design can create great things for people to use.
I’ve just posted a piece on my thoughts about the first couple of days at last week’s XTech over at the BBC’s Internet blog.
“As David Recordon of Six Apart noted in Wednesday morning’s plenary, open software and hardware have become hip and have given small groups of developers the chance to build interesting web apps – and, more importantly, the chance to get them adopted. This is a new wave of web companies which expose their data via APIs and consume others’ APIs. And what is interesting about these companies is that they are converging on common standards – in particular, OAuth and OpenID.”
There was a lot on data portability and Semantic Web stuff (including our presentation on the Programme’s Ontology), both of which I’m really pleased to report are getting some practical adoption. And as with the Social Graph Foo Camp, XMPP appears to be an important emergent technology. I just hope it can scale.
We’re all used to using URLs to point at web pages but we too often forget that they can be used for other things too. They can address any resource and that includes: people, documents, images, services (e.g., “today’s weather report for London”), TV or Radio Programmes – in fact any abstract concept or entity that can be identified, named and addressed.
Also, because these resources can have representations which can be processed by machines (through the use of RDF, Microformats, RDFa, etc.), you can do interesting things with that information. Some of the most interesting things you can do happen when URLs identify people.
Currently people are normally identified within web apps by their email address. I guess this sort of makes sense because email addresses are unique, just about everyone has one and it means the website can contact you. But URLs are better. URLs are better because they offer the right affordance.
If you have someone’s URL then you can go to that URL and find out stuff about that person – you can assess their provenance (by reading what they’ve said about themselves, by seeing who’s in their social network via tools such as XFN, FOAF and Google’s Social Graph API), and you can also discover how to contact them (or ask permission to do so).
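To make the XFN point a little more concrete, here is a minimal sketch of how a program might read the relationships published on someone’s page. It only relies on the fact that XFN puts values such as "friend", "met" or "me" in the rel attribute of ordinary links; the markup, the example.* URLs and the helper names are all made up for illustration.

```python
from html.parser import HTMLParser


class XFNLinkParser(HTMLParser):
    """Collect (href, rel values) for every <a> tag that carries a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel")
        if href and rel:
            # rel is a space-separated list of values, e.g. "friend met"
            self.links.append((href, set(rel.split())))


def xfn_relationships(html):
    """Return every link in the page that declares a relationship."""
    parser = XFNLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical profile page markup (not taken from any real site)
PROFILE_HTML = """
<a href="https://alice.example/" rel="friend met">Alice</a>
<a href="https://photos.example/tom" rel="me">My photos</a>
<a href="https://news.example/">Just a link</a>
"""

links = xfn_relationships(PROFILE_HTML)
same_person = [href for href, rels in links if "me" in rels]
friends = [href for href, rels in links if "friend" in rels]
```

Here same_person holds the other pages this person claims as their own and friends holds the people they say they know; a real crawler would also want to check that those pages link back before trusting the claim.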
With email the affordance is all the wrong way round – if I have your email address I can send you stuff, but I can’t check to see who you are, or even if it is really you. Email addresses are for contacting people; they aren’t identifiers. By conflating the two we’ve got ourselves into trouble, because email addresses aren’t very good at identifying people, nor can they be shared publicly without exposing folk to spam and the like.
This is in essence the key advantage offered by OpenID, which uses URLs to provide digital identifiers for people. If we then add OAuth into the mix we can do all sorts of clever things.
The OAuth protocol can be used to authorise any request for information (for example sending the person a message): the owner of the URL/OpenID decides whether or not to grant you that privilege. This means that it doesn’t matter if someone gets hold of a URL identifier – unless the owner grants permission (on a per-instance basis) it is useless. This is in contrast to what happens with email identifiers – once I have one I can use it to contact you whether you like it or not.
Also, because I can give any service a list of my friends’ URLs without worrying that their contact details will get stolen, I can tip up at any web service and find which of my friends are using it without having to share their contact details. In other words, by using URLs to identify people I can share my online relationships without sharing or porting my or my friends’ contact data.
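As a rough sketch of what that might look like, assume a service knows its members by URL and I hand it nothing but a list of my friends’ URLs. The function names, the normalisation rule and the example.* URLs below are my own assumptions, not any particular service’s API.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Lower-case the scheme and host and drop a trailing slash so equivalent URLs match."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def friends_on_service(my_friend_urls, service_member_urls):
    """Return the friends who are already members, identified only by URL."""
    mine = {normalise(u) for u in my_friend_urls}
    members = {normalise(u) for u in service_member_urls}
    return sorted(mine & members)


# Hypothetical data: no email addresses or phone numbers change hands
my_friends = [
    "https://Alice.example/",
    "https://bob.example/about",
    "https://carol.example",
]
members = [
    "https://alice.example",
    "https://carol.example/",
    "https://dave.example",
]

already_here = friends_on_service(my_friends, members)
```

The only thing shared is the relationship itself: the service learns that I know Alice and Carol, and nothing about how to contact them.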
You retain control over your data, but we share the relationships (the edges) within our social graph. And that’s the way it should be; after all, that’s all it needs to be. If I have your URL I can find whatever information (email, home phone number, current location, bank details) you decide you want to make public and I can ask you nicely for more if I need it – using OAuth you can give me permission and revoke it if you want.
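The grant-and-revoke idea can be sketched as a toy model. To be clear, this is not the OAuth protocol (there are no tokens, signatures or redirects here), just an illustration of the access model it enables; the class, its methods and the example data are all hypothetical.

```python
class ProfileOwner:
    """Toy model: the owner of a URL decides, per requester and per field, what is shared."""

    def __init__(self, url, private_data):
        self.url = url
        self._private = dict(private_data)
        self._grants = {}  # requester URL -> set of field names they may read

    def grant(self, requester, field):
        self._grants.setdefault(requester, set()).add(field)

    def revoke(self, requester, field):
        self._grants.get(requester, set()).discard(field)

    def request(self, requester, field):
        """Return the field only if this requester currently holds a grant for it."""
        if field in self._grants.get(requester, set()):
            return self._private.get(field)
        return None


# Hypothetical people and data
owner = ProfileOwner("https://tom.example/", {"email": "tom@tom.example"})
asker = "https://alice.example/"

before = owner.request(asker, "email")   # no grant yet
owner.grant(asker, "email")
during = owner.request(asker, "email")   # grant in place
owner.revoke(asker, "email")
after = owner.request(asker, "email")    # grant withdrawn
```

Knowing the owner’s URL gets the asker nothing on its own; access exists only for as long as the owner’s grant does.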
I think most social networking sites get data portability wrong because they copy your contact data from one system to another. In doing so they often end up spamming your friends with ‘invites’, as well as leaving you with the headache of having to maintain your contact details in lots of different places.
The problem is that just because you have someone’s contact details, that doesn’t mean that they will want to join every service that you want to join, and vice versa. You don’t want to port all your contact data from one service to another; you just want to know when your friends also join a service so you can connect to them.
It can also be argued that data portability can create issues for users. As I’ve discussed previously, at the recent Social Graph Foo Camp there appeared to be consensus that people don’t (yet) expect the data they enter in one site to suddenly appear in another. But they do expect to be able to easily find their friends within a new network. And as noted by Robert Scoble in his conversation with Dave Morin, head of Facebook’s application platform:
… if a user wants to delete his or her info off of Facebook. Today that’s possible. But what about in a really data portable world? After all, in such a world Facebook might have sprayed your email and other data to other social networks. What if those other social networks don’t want to delete your data after you asked Facebook to?
Or
Which of your data is yours? Which belongs to your friends? And, which belongs to the social network itself? For instance, we can say that my photos that I put on Facebook are mine and that they should also be shared with, say, Flickr or SmugMug, right? How about the comments under those photos? The tags? The privacy data that was entered about them? The voting data? And other stuff that other users might have put onto those photos? Is all of that stuff supposed to be portable?
Now I should say up front that I don’t completely buy into Dave’s arguments – for starters they smack of FUD – but that doesn’t mean that there’s no merit in these arguments nor that there aren’t issues with data portability. Copying your entire social graph between different systems can’t be the way forward. As Simon Willison puts it:
I think data portability is the wrong framing—moving data between sites is really hard. Importing social relationships between sites is much more viable (hence my interest in social network portability). Also, the complaints about systems sharing e-mail addresses are neatly addressed by using OpenID as the GUID for a user instead.
A couple of sites spring to mind that I think are getting much closer to the answer: Dopplr and Fire Eagle. Dopplr helps travellers meet up with each other by showing when your friends’ travel plans coincide with yours. Fire Eagle is a service that acts as a geolocation brokerage service – tying together applications that provide geolocation data (mobile phones, GPS devices etc.) with services that consume such data (like Dopplr).
Dopplr doesn’t try to port your address book into its own database; instead it uses XFN, Google’s contacts data API and Yahoo’s Flickr Auth to find existing Dopplr users you already know on Twitter, GMail and Flickr respectively. In other words Dopplr only imports the social relationships that already exist.
Fire Eagle doesn’t even try to import your social graph. Instead it snuggles into its own niche by adding specific functionality to existing services, giving you the ability to share your location with sites and services online. This is very smart, because it means Fire Eagle can focus on what it does best (sharing your location in a secure fashion) and not on what others do best (telling people, you know, which city you are travelling to; sharing photos with your friends; telling folk what you’re up to etc.).
This differentiation strategy – focusing on what you do best and making it as easy as possible for others to integrate with your service – points the way to a possible future where you can plug services together extending the data and functionality available to you. What would this look like? Well one way of cutting it would be:
Online services either provide functionality or data that can be plugged into your favourite social networking site; or functionality that lets you manage your social graph’s relationships (similar to Dopplr) – all mediated via OAuth or similar.
Everyone in your network jointly owns the graph – and this is what we should focus on making portable so that if someone you know joins a service you are using then you get to know.
You should manage your identity and personal data. OpenID is the obvious way of doing this and would mean that your details could be managed in one location and independent of any given service (like Facebook) although of course these sites could also act as OpenID providers.
Obviously you own your resources – your photos, documents etc. and you should always have the right to move these to other services. But you should also be able to connect your social graph to these resources – should you wish to – as Dopplr have done with Flickr. Dopplr doesn’t provide a photo sharing feature – instead it integrates with Flickr so your photos are stored with Flickr but accessible via Dopplr.
Taking this approach not only places you in control of your data – so you won’t get into the problems Dave Morin highlights above – but it’s also good for competition.
I’ve just published a short piece on my recent trip to San Francisco and the O’Reilly Foo Camp over at the BBC Radio Lab’s blog.
It was my first trip to San Francisco and I loved the city (you can see my photos on Flickr). But I was also struck by how meme-friendly the place is. I guess that’s not that surprising – it’s a relatively small city with a high density of tech companies in and around the bay area – but nonetheless it does appear to be a good place for tech memes to arise and flourish. One reason why that corner of the world produces so much innovative technology?
Anyway below is my blog post as published on the Radio Lab’s blog.
“I’ve recently returned from a very enjoyable and educational trip to California where I was honored to be invited to attend the Social Graph Foo Camp. Although I do have to say that while I found the whole thing very exciting I was also, at times, left realising just how far behind some of the conversations I have become. It really is amazing how rapidly the issues and technology within this space are developing – and that’s in the context of a fast-moving industry.
It was, however, clear that the really big issues are social not technological: user expectations, data ownership and portability. Although a key piece of the technology puzzle in all this is the establishment of XFN and FOAF which are going to play an ever increasingly important role in glueing different social networks together. And with the launch of Google’s Social Graph API (released under a Creative Commons license by the way) data portability is going to really explode; but with it expect more “Scoblegate” like incidents.
But the prizes for getting this right are great, as illustrated by this clip of Joseph Smarr of Plaxo presenting on friends list portability and who owns the data in social networks.
For my part what I took away from this and other discussions is that although on the surface moving data between one social network and another is no different from copying a business card into Outlook, people’s expectations make it different. People don’t (yet) expect the data they enter in one site to suddenly appear in another. But they do expect to be able to easily find their friends within a new network. Google’s Social Graph API will make it easier – but there will be a price, as Tim O’Reilly points out:
“Google’s Social Graph API… will definitively end “security by obscurity” regarding people and their relationships, as well as opening up the social graph to “rel=me” spammers. The counter-argument is that all this data is available anyway, and that by making it more visible, we raise people’s awareness and ultimately their behavior.”
Tied to all of this, of course, is the rise of OpenID, the open and decentralized identity system, and OAuth, an open protocol to allow secure API authentication between applications. Both of which appear to be central to most people’s plans for the coming year.
So what were the other highlights? For me, I’m really excited by Tom Coates and Rabble’s latest Yahoo! project, Fire Eagle, which allows you to share your location with friends, other websites or services.
You can think of Fire Eagle as a location brokerage service. Via open APIs other people can write applications that update Fire Eagle with your location so that further applications can then use it. So for example, someone might write an application that runs on your mobile that triangulates your position based on the location of the transmitters before sending the data to Fire Eagle. You could then run an application on your phone that let you know if your friends were nearby, what restaurants are in your area or where the nearest train or tube station is.
Obviously what Fire Eagle also provides is lots of security so you can control who and what applications have access to your location data. I can’t wait to see what people end up doing with Fire Eagle and I’m hoping that we can come up with some interesting applications too.
Finally, XMPP, which I have to say caught me a bit by surprise. If you’ve not come across it before, XMPP is a messaging and presence protocol developed by Jabber and now used by Google Talk, Jaiku and Apple’s iChat amongst others (with a lot more clients on the way if last weekend was anything to go by).
XMPP is a much more efficient protocol than HTTP for two way messaging because you don’t require your application to check in with the servers periodically – instead the server sends a signal via XMPP when new information is published. And there’s no need to limit that communication to person to person – XMPP can also be used for essentially machine-to-machine Instant Messaging which means you have real time communication between machines.
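The difference between checking in periodically and being told when something happens can be shown with a small in-process sketch. This is not XMPP and involves no network at all; the Feed class and its methods are invented purely to contrast the two styles.

```python
class Feed:
    """Toy publisher that supports both styles: push to subscribers, or answer polls."""

    def __init__(self):
        self._items = []
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, item):
        self._items.append(item)
        for callback in self._subscribers:
            callback(item)  # push: subscribers hear about it immediately

    def items_since(self, index):
        return self._items[index:]  # poll: the client has to come and ask


feed = Feed()

# Push-style client: registers once, then does nothing until notified
received = []
feed.subscribe(received.append)

# Poll-style client: asks on every tick whether anything is new
poll_requests = 0
seen = 0
for tick in range(5):
    if tick == 3:
        feed.publish("new programme announced")
    poll_requests += 1
    new_items = feed.items_since(seen)
    seen += len(new_items)
```

Over the five ticks the polling client makes five requests to pick up one item, while the push client receives that same item through a single notification.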
So based on last weekend’s Foo Camp it looks like XMPP, OpenID and OAuth are all going to be huge in 2008. Google’s Social Graph API and related technologies (FOAF and XFN) will result in some headaches while people’s understanding and expectations settle down, but it will be worth it as we move towards a world of data portability.”
So Robert Scoble got his Facebook account disabled for running a script that scraped his account for names, email addresses and birthdays and loaded the data into his Plaxo account – so that he could match Facebook names with names in Plaxo’s database. On the surface this is no different from Facebook’s own importer – which lets you enter your email address and password for, for example, your GMail account – so that your contact details can be loaded into Facebook (which BTW is a very bad idea).
It’s worth remembering that what we’re talking about here is basic contact information – the script didn’t try to grab any information from Scoble’s Social Graph – no friends of friends data, not people’s interests, nothing like that – nor did Plaxo sign up those users to its Social Networking application Pulse. Despite that, the general feeling out there is that Plaxo are evil and neither Plaxo nor Robert had the right to run the script. I suspect that this is mainly because the early version of Plaxo made it very easy to email everyone in your address book with a request to join Plaxo; this was a bit rubbish and got Plaxo a bad name for spamming folk. Quite right too, although it’s worth noting that this hasn’t been a problem since they rewrote it last year.
But if you step away from people’s prior poor experience with Plaxo, what they and Scoble tried to do was no different from what Facebook does. The difference is one of reputation. All Plaxo are trying to give their users are tools to get data into their database. This is harder with Facebook because it’s a walled garden and walled gardens, as the name suggests, make it tough to get data out. The pertinent question then is who owns the data – is it Facebook, Robert Scoble or each ‘friend’?
I know that, as far as I’m concerned, it’s not Facebook. You should be able to move your data between systems. The DataPortability folk have got the right philosophy:
As users, our identity, photos, videos and other forms of personal data should be discoverable by, and shared between our chosen tools or vendors. We need a DHCP for Identity. A distributed File System for data. The technologies already exist, we simply need a complete reference design to put the pieces together.
Unfortunately, as the Scoble-Facebook story illustrates, access to our online identity is often effectively controlled by others. Robert Scoble has access to 5,000 people’s contact details plus a good chunk of their social graph via Facebook. So while Facebook is wrong to lock your data away behind a walled garden, Scoble or anyone else might do the wrong thing if they export the social graph and profile information of their contacts (not that he did in this instance).
What we also need, in addition to data portability, are privacy controls. As Jason Kottke puts it:
[what’s needed is]…Facebook inside-out, so that instead of custom applications running on a platform in a walled garden, applications run on the internet, out in the open, and people can tie their social network into it if they want, with privacy controls, access levels, and alter-egos galore.
Or, as Robert Scoble suggests, a DRM for your personal data:
COMPLETELY OPEN: You’re allowed to take anything on my profile page and import it, use it, copy it, print it, import it.
EMAIL ONLY: You can only take my name, and email address to other systems.
EMAIL PLUS CORE PERSONAL INFO: In addition to email address and name you can also take my birthday and phone number to other systems.
CUSTOM: You choose which fields can be exported or used on other systems.
NAPKIN ONLY: You can use anything you want, but no automated systems, you’ve gotta manually copy everything over by hand.
PUBLIC ONLY: Only data that I put on my public profile can be used elsewhere.
FAN ONLY: I only wanted to see your social network and behaviors here, I don’t want to give you access to mine.
Clearly what I’m suggesting (and I assume so is Scoble) is a rights management system which would be respected by the various social networking applications, not a solution that would encrypt your data into a binary file that required your approval to unpackage. In other words a system that would give you control over your data and allow you to decide how it was shared with others who may or may not be using the same social networking application as you.
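To show how an application might respect such rights, here is a small sketch that maps Scoble’s levels onto the profile fields an automated export would be allowed to take. The field names, the enum and the function are my own assumptions; I’ve treated NAPKIN ONLY and FAN ONLY as allowing nothing to an automated system, which is one reading of his list.

```python
from enum import Enum


class ExportPolicy(Enum):
    COMPLETELY_OPEN = "completely open"
    EMAIL_ONLY = "email only"
    EMAIL_PLUS_CORE = "email plus core personal info"
    CUSTOM = "custom"
    NAPKIN_ONLY = "napkin only"
    PUBLIC_ONLY = "public only"
    FAN_ONLY = "fan only"


def exportable(profile, policy, custom_fields=(), public_fields=()):
    """Return the subset of a profile that an automated export may copy elsewhere."""
    if policy is ExportPolicy.COMPLETELY_OPEN:
        allowed = set(profile)
    elif policy is ExportPolicy.EMAIL_ONLY:
        allowed = {"name", "email"}
    elif policy is ExportPolicy.EMAIL_PLUS_CORE:
        allowed = {"name", "email", "birthday", "phone"}
    elif policy is ExportPolicy.CUSTOM:
        allowed = set(custom_fields)
    elif policy is ExportPolicy.PUBLIC_ONLY:
        allowed = set(public_fields)
    else:
        # NAPKIN_ONLY and FAN_ONLY: nothing is available to automated systems
        allowed = set()
    return {key: value for key, value in profile.items() if key in allowed}


# Hypothetical profile
profile = {
    "name": "Alice",
    "email": "alice@alice.example",
    "birthday": "1 May",
    "phone": "555-0100",
    "interests": "linked data",
}

email_only = exportable(profile, ExportPolicy.EMAIL_ONLY)
custom = exportable(profile, ExportPolicy.CUSTOM, custom_fields=["name", "interests"])
napkin = exportable(profile, ExportPolicy.NAPKIN_ONLY)
```

The point is that the policy travels with the person rather than the site: any application that honours it applies the same filter before it copies anything.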
My name is Tom Scott and this is my personal blog. I currently work at Nature Publishing Group, I previously worked at the BBC. However, these are my thoughts and observations not theirs.
I’m interested in and blog about how to make the web more human literate, linking data the webby way.