The URL shortening anti pattern

Along with others I’ve recently started to grok Twitter – it took a while – but I now find it a fantastic way to keep in touch with folk that I know or respect, or catch up on snippets of info from news services around the web. It’s great.

What makes Twitter particularly useful, as a way of keeping in touch with a large number of people, is the limit of 140 characters per ‘tweet’. That’s it, each tweet is 140 characters or fewer. But what this also means is that if you tweet about a URL, that URL eats up a lot of those 140 characters. To help solve this problem Twitter uses TinyURL to shorten the URL. This is a solution to the problem but unfortunately it also creates a new one.

Example of poor url design

URLs are important. They are at the very heart of the idea behind Linked Data, the semantic web and Web 2.0, because if you can’t point to a resource on the web then it might as well not exist, and this means URLs need to be persistent. But URLs are also important because they tell you about the provenance of the resource, and that helps you decide how important or trustworthy a resource is likely to be.

URL shortening services such as TinyURL or RURL are very bad news because they break the web. They don’t provide stable references because each one is a single point of failure, adding another level of indirection between the link and the resource. URL shortening services, then, are an anti-pattern:

In computer science, anti-patterns are specific repeated practices that appear initially to be beneficial, but ultimately result in bad consequences that outweigh the hoped-for advantages.

URL shortening services create opaque URLs – the ultimate destination of the URL is hidden from the user and software. This might not sound such a big deal – but it does mean that it’s easier to send people to spam or malware sites (which is why Qurl and jtty closed – breaking all their shortened URLs in the process). And that highlights the real problem – they introduce a dependency on a third party that might go belly up. If that third party closes down, all the URLs using that service break, and because they are opaque you’ve no idea where they originally pointed.
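To make the opacity point concrete, here is a minimal Python sketch that reads the host out of a URL with the standard library's urllib.parse. The two example URLs are ones quoted elsewhere on this page (the Amazon one is trimmed of its query string); the helper name visible_host is made up for illustration and is not part of any shortening service.

```python
from urllib.parse import urlsplit


def visible_host(url: str) -> str:
    """Return the host a reader (or a program) can see in the URL itself."""
    return urlsplit(url).hostname or ""


# A naked URL names the site that will actually serve the resource.
naked = "http://www.amazon.co.uk/Url-King-Beau-Beaudoin/dp/0978840100/"
# A shortened URL only names the shortening service, not the destination.
short = "http://rurl.org/lbt"

print(visible_host(naked))  # www.amazon.co.uk
print(visible_host(short))  # rurl.org
```

The only way to learn where the short URL really goes is to ask the shortening service, which is exactly the dependency described above.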

And even if the service doesn’t shut down there would be nothing you could do if that service decided to censor content. For example the Chinese Communist Party might demand that TinyURL remap all the URLs it decided were inappropriate to state propaganda pages. You couldn’t stop them.

But of course we don’t need to invoke such Machiavellian scenarios to still have a problem. URL shortening services have a finite number of available URLs. Some shortening services like RURL use 3 characters (e.g. http://rurl.org/lbt). This means these more aggressive URL shortening services have about 250,000 possible unique three-character short URLs; once they’ve all been used they either need to add more characters to their URLs or start to recycle old ones. And once you’ve started to recycle old URLs your karma really will suffer (TinyURL uses 6 characters so this problem will take a lot longer to materialise!).
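The arithmetic behind that figure can be sketched in a few lines of Python. This assumes short codes are drawn from upper- and lower-case letters plus digits (62 symbols); that alphabet is my assumption rather than something the services document, and the function name keyspace is purely illustrative.

```python
import string

# Assumption: short codes use a-z, A-Z and 0-9, i.e. 62 possible symbols.
ALPHABET = string.ascii_letters + string.digits


def keyspace(code_length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct codes of exactly code_length characters."""
    return alphabet_size ** code_length


print(keyspace(3))      # 238328, roughly the 250,000 quoted above
print(keyspace(6))      # 56800235584
print(keyspace(3, 36))  # 46656, if only lowercase letters and digits are used
```

Whatever the real alphabet is, the count grows exponentially with code length, which is why a six-character service takes so much longer to run out than a three-character one.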

There is an argument that on services such as Twitter the permanence of the URL isn’t such an issue – after all the whole point of Twitter is to provide a transitory, short lived announcement – Twitter isn’t intended to provide an archive. And the fact that the provenance of the URL is obfuscated maybe doesn’t matter too much either, since you know who posted the link. All that’s true, but it still causes a problem when TinyURL goes down, as it did last November and it also reinforces the anti-pattern and that is bad.

Bottom line: URLs should remain naked; providing this level of indirection is just wrong. The Internet isn’t supposed to work via such intermediate services; the Internet was designed to ensure there wasn’t a single point of failure that can so easily break such large parts of the web.

Of course simply saying don’t use these URL shortening services isn’t going to work, especially when using services such as Twitter, where there is a need for short URLs. However, what it does mean is that if you’re designing a website you need to think about URL design, and that includes the length of the URL. And if you’re linking to something on a forum, wiki, blog or anything that has permanence, please don’t shorten the URLs; keep them naked. Thanks.

Photo: Example of poor URL design, by Frank Farm. Used under licence.

13 thoughts on “The URL shortening anti pattern”

  1. Perhaps the antipatterns that pertain to particular ‘views’ of a data stream are bad, but what if (and I’m not suggesting this is in fact the case) somewhere inside Twitter there was a record of the original URL that had been humped into TinyURL? The question is then how do you see that data in a view, and in particular how do you see that in an archive application specific view.

    The problem’s not gone away, but it’s mitigated, no?

  2. Something regarding URLs that would be nice is if companies that create giant URLs also provided a more useful URL as well. This would 301 redirect to the original.
    Example:

    http://www.amazon.co.uk/Url-King-Beau-Beaudoin/dp/0978840100/ref=sr_1_9?ie=UTF8&s=books&qid=1206044753&sr=8-9

    could also be seen at:

    http://amazon.co.uk/t/gyh

    Then the use for these tiny url services would be needed less, and the companies would still keep the google juice.

  3. Ant: Yeah I guess if an application provided its own URL shortening service then that would certainly help with one aspect: if the service goes belly up then you aren’t left with broken links (you’re left with no links coz the service has gone!).

    But it wouldn’t help solve the other issues: obfuscated URLs – you won’t know the provenance of the URL. And that makes it harder for computers to process and for people to interpret. And nor would it solve the censorship issue.

    So I think it would mitigate part of the problem.

  4. I agree with the URL obfuscation. It’s one of the reasons I use http://notlong.com which is a nice service, not many people know about it, which means I can make a URL like http://1518HWH.notlong.com (that’s the view from the fifteenth floor of the office, hence my use of 1518 HWH).

    However, Twitter is pretty transitory. The use case here is “I want to send a URL, now, and I need it to be as short as possible”. I’m not convinced the use case is “I still need to have this working in the middle of next week”; so not convinced, wholly, that it’s a particular problem in Twitter.

  5. I can see why Twitter uses these in the SMS and mobile website it publishes out, but they’re also used in tweets displayed on the regular website. This makes no sense. Surely in a desktop browser there’s enough room for the whole URL? They also use ellipses with links to the “whole” tweet on the site, another anti-pattern when the full text is 140 chars.

  6. Michal,

    I think from Twitter’s perspective URL shortening makes sense on all platforms – unless they decide that they are going to drop the 140 character limit and/or drop it for the desktop version. Both of which would be odd. So I think it does make sense from a certain perspective.

    Trouble is that it reinforces the use of URL shortening as an OK thing elsewhere – hence its anti pattern nature.

  7. @Wekiki – not sure I agree: sn.vc still provides a single point of failure, still allows the service to censor content (it says as much in the T&Cs) and still obfuscates the URL.

  8. Coming very late to the party but…

    @jamescridland – i disagree that twitter is transitory. the invention of the hashtag and incorporation of search has given old conversations new life. u can follow conference proceedings eg before, during and after. admittedly twitter’s suppression of older tweets means there’s still a limited active life but that active period seems to keep expanding as people find new ways to use the twitter platform

    @tom it’s kind of a lie that twitter web messages are limited to 140 chars. if i write a tweet to the 140 limit that includes a link then <a href=”whatever”>whatever</a> will be added to the message. so whilst the visible part of the message is limited to 140 chars the message source isn’t. There’s no reason twitter couldn’t use the long url in the href whilst keeping the short url as the link text…
    anyway, as u say there are 4 problems with url shortening:
    (1) url obfuscation – i don’t know what i’ll get
    (2) reliance on 3rd party – if they go out of business links break
    (3) potential censorship by 3rd parties
    (4) over aggressive url shorteners reusing ids
    Having just come across longurl.org i think it could solve at least some of these problems. It provides a service to expand short urls from many, many providers into long urls. With a firefox extension they already solve (1).
    The cool bit is it caches the expansion so has a persistent store of short <> long mappings. They plan to expose these mappings on the web which would also solve (2).
    This is cool cos most shortening services don’t allow u to check if a short exists for a given long without the side effect of creating a short if it doesn’t already exist….
    There’s not much we can do about (3). if longurl catches a short pre- censored it’ll cache the mapping but if it’s subsequently censored i’m not sure whether they’ll overwrite or keep the original mapping. Makes mappings time dependent which isn’t nice…
    And (4) is just properly evil!!!!
    IMHO longurl could be a web saviour. So download the firefox extension and use the jQuery plugin for derivadow.com. In an ideal world twitter would also use the jQuery plugin and we’d have a bigger, better store of mappings
    Apologies for long comment. I would have wrote a blog post but i don’t have a blog. Guest post offers gratefully received ;-)
