Lego, Wombles and Linked Data

As a child I loved Lego. I could let my imagination run riot, design and build cars, space stations, castles and airplanes.

Blue lego brick

My brother didn’t like Lego, instead preferring to play with Action Men and toy cars. These sorts of toys did nothing for me, and from the perspective of an adult I can understand why. I couldn’t modify them, I couldn’t create anything new. Perhaps I didn’t have a good enough imagination because I needed to make my ideas real. I wanted to build things, I still do.

Then the most exciting thing happened. My dad bought a BBC Micro.

Obviously computers such as the BBC Micro were in many, many ways different from today’s Macs and, if you must, PCs. Obviously they were several orders of magnitude less powerful than today’s computers but, importantly, they were designed to be programmed by the user; you were encouraged to do so. It was expected that that’s what you would do. So from a certain perspective they were more powerful.

BBC Micros didn’t come preloaded with word processors, spreadsheets and graphics editors and they certainly weren’t WIMPs.

What they did come with was BBC BASIC and Assembly Language.

They also came with two thick manuals. One telling you how to set the computer up; the other how to program it.

This was all very exciting, I suddenly had something with which I could build incredibly complex things. I could, in theory at least, build something that was more complex than the planes, spaceships and cars which I modelled with Lego a few years before.

Like so many children of my age I cut my computing teeth on the BBC Micro. I learnt to program computers, and played a lot of games!

Unfortunately all was not well. You see I wasn’t very good at programming my BBC Micro. I could never actually build the things I had pictured in my mind’s eye, I just wasn’t talented enough.

You see Lego hit a sweet spot which those early computers on the one hand and Action Man on the other missed.

What Lego provided was reusable bits.

When Christmas or my birthdays came around I would start off by building everything suggested by the sets I was given. But I would then dismantle the models and reuse those bricks to build something new, whatever was in my head. By reusing bricks from lots of different sets I could build different models. The more sets I got given, the more things I could build.

Action Men simply didn’t offer any of those opportunities; I couldn’t create anything new.

Early computers were certainly very capable of providing a creative platform; but they lacked the reusable bricks, it was more like being given an infinite supply of clay. And clay is harder to reuse than bricks.

Today, with the online world we are in a similar place but with digital bits and bytes rather than moulded plastic bits and bricks.

The Web allows people to create their own stories – it allows people to follow their nose to create threads through the information about the things that interest them, commenting on and discussing it on the way. But the Web also allows developers to reuse previously published information within new, different contexts to tell new stories.

But only if we build it right.

Most Lego bricks are designed to allow you to stick one brick to another. But not all bricks can be stuck to all others. Some can only be put at the top – these are the tiles and pointy bricks to build your spires, turrets and roofs. These bricks are important, but they can only be used at the end because you can’t build on top of them.

The same is true of the Web – we need to start by building the reusable bits, then the walls and only then the towers and spires and twiddly bits.

But this can be difficult – the shiny towers are seductive and the draw to start with them can be strong; only to find out that you then need to knock them down and start again when you want to reuse the bits inside.

We often don’t give ourselves the best opportunity to womble with what we’ve got – to reuse what others make, to reuse what we make ourselves. Or to let others outside our organisations build with our stuff. If you want to take these opportunities then publish your data the webby way.

Managing the code-garden

Kevin Barnes suggests that software engineering is a bit like gardening – software is never finished – you need to spend some time planning, some time adding new features and some time tending to what you have. Otherwise your code will become steadily more and more unmanageable.

Artfully planned decay

“Basically, code is like a garden. We lay it out, plant it and then tend and maintain it for as long as it continues to warrant the attention.”

The trouble is how do you make sure that you allocate enough time to maintenance while continuing to add new features or enhancements? How do you make sure you don’t allocate all your time to adding the next new thing, nor spend weeks on end tidying your code because all you’ve done recently is add new features?

Likewise how do you give the team the time and space to innovate – to try out their ideas? One solution is to let anyone add a new item to the product backlog or requirements catalogue and then prioritise it alongside everything else. Well this is OK if it’s a big idea, but not great for smaller items nor for more geeky ideas if the Product Owner or the business at large doesn’t understand the value of such things. It also feels wrong – if someone has a good idea that can be implemented quickly then they should be able to do so; after all, going via the Product Backlog route may take more time than simply implementing the feature. And that is as good a way to kill off innovation as anything else.

But likewise you need to provide appropriate project governance – if everyone did whatever they wanted then the business is unlikely to get what it needs from the software.

One solution is the idea of Gold Cards [pdf] as suggested by Tim MacKinnon. Gold cards are designed to address:

A lack of innovation because the customer does not necessarily explore options that are technically possible but not currently required. Consequently, cutting-edge knowledge may be slowly lost from the team.

Gold Cards allow developers time to explore technical possibilities in a controlled and focused way.

In Scrum the current sprint’s work items are written on cards and pinned to the wall – so everyone knows what’s being worked on and everyone knows what’s completed. Gold Cards are special cards that let you work on your own idea – it can be anything you want so long as it’s on the current project.

Each developer is allocated two Gold Cards at the beginning of each calendar month… Gold Cards can be used at any time during a month, but cannot be carried over into the next month. […]

Each card grants the developer who has it, one day of work on a topic of their choice.

In a similar fashion ‘Gardening Cards’ let you work on whatever is bugging you – not bugs which should be prioritised elsewhere – but those things that just annoy you about the way something is implemented, that missing feature that would make life much easier if it were there. It’s your chance to spend time tending to your garden, not planting new features.

So the idea is that you place one Gold Card and one Gardening Card on the wall. Each sprint everyone can spend one day on a gold card item and one on a bit of gardening. But because the project only has one gold card and one gardening card only one person can be doing each activity at a time – everyone else is working on the backlog as normal. It’s the Scrum Master’s responsibility to encourage everyone to take the time to work on these items.
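The rule above (one shared card of each kind, one holder at a time, one day per person per sprint) can be sketched as a tiny model. This is only an illustration: the `Card` class, its method names and the developer names are all made up for the example, and resetting everyone's allowance at the start of a sprint is an assumption rather than something spelled out above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Card:
    """One shared card on the sprint wall (hypothetical model)."""
    kind: str                      # "gold" or "gardening"
    holder: Optional[str] = None   # who has the card right now, if anyone
    used_by: Set[str] = field(default_factory=set)  # who has had their day

    def take(self, developer: str) -> bool:
        """Try to claim the card for a day; False if it is unavailable."""
        if self.holder is not None or developer in self.used_by:
            return False
        self.holder = developer
        return True

    def hand_back(self) -> None:
        """Pin the card back on the wall and record that the day was spent."""
        if self.holder is not None:
            self.used_by.add(self.holder)
            self.holder = None

    def new_sprint(self) -> None:
        """Assumption: allowances reset at the start of every sprint."""
        self.holder = None
        self.used_by.clear()


gold = Card("gold")
gold.take("alice")   # alice spends her gold card day
gold.take("bob")     # refused: the card is already off the wall
gold.hand_back()
gold.take("bob")     # now bob can have his day
```

Because there is only one card of each kind, the cap on how many people are away from the backlog at once falls out of the model rather than needing a separate rule.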

Now clearly how many days you allocate to gardening and gold card items will depend on a number of different factors: team size, sprint length, age of the project and the state of the code (you wouldn’t allocate gardening time at the start of a project for instance). But generally when a project matures you should be allocating some time every sprint to this. The Scrum Master can plan around this because although she doesn’t know exactly what everyone will work on, she does know how much time will be allocated.

The other advantage of this approach is that it also helps fill the gaps – if your planning is a bit off, you encounter some unforeseen problems and some members of the team are being held up then they can remain productive by working on their gold card idea or fixing that pesky item that’s been bugging them for a while rather than waiting for someone else.

Link for 2008.01.04

» Facebook disabled Robert Scoble’s account – because he was screen scraping contact or activity data [scobleizer.com] He’s under an NDA at the moment so can’t go into the details but he was running a script on the site that broke Facebook’s Terms of Use. It looks like the account has been deleted taking with it all his data. This is why walled gardens are bad.

» Promoting ‘Data Portability’ standards [dataportability.org] As a result of Facebook’s decision to delete his account Robert Scoble has signed up to this. Which is good news. Data portability between systems is the key to Web 2.0. If you can’t point to a resource (outside a walled garden) and use it then it’s not a Web 2.0 citizen. And if data is about you then you should have control – it is yours after all.

» Frameworks exist for conceptual integrity [204 No Content Blog] When someone uses a framework what they are doing is delegating decision-making to someone else – having too many options in this situation is a bad thing. Frameworks that give developers too many options hoping to maximise code reuse are misguided. Software reuse is not an end. Reuse is a means, and if the available means don’t meet your ends, then find other means.

Link for 2007.12.29

» Size Is The Enemy aka “Java is the problem” because Java is a statically typed language, it requires lots of tedious, repetitive boilerplate code to get things done [Coding Horror]
Jeff Atwood’s review of Steve Yegge’s Code’s Worst Enemy: “One of the most fundamental and truly effective pieces of advice you can give a software development team – any software development team – is to write less code, by any means necessary.”

» Ruby 1.9—Right for You? [PragDave]
It’s faster, importantly it supports unicode – but on the downside it’s not backwardly compatible in a few areas and is a development release that’s not ready for production use.

» Google Phone In Spring 2008? [GigaOM]
Google, apparently has taken substantial amount of floor space at the upcoming Mobile World Congress trade show in Barcelona, Spain, leading some to speculate that the company might actually be ready to launch its Android based phones.

» Comet: Low Latency Data for the Browser [Continuing Intermittent Incoherency]
Comet applications can deliver data to the client at any time, not only in response to user input. The data is delivered over a single, previously-opened connection.

» Comet works, and it’s easier than you think [Simon Willison]
“Before taking a detailed look at Comet, my assumption was that the amount of complexity involved meant it was out of bounds to all but the most dedicated JavaScript hackers. I’m pleased to admit that I was wrong: Comet is probably about 90% of the way to being usable for mainstream projects, and the few remaining barriers (Bayeux authentication chief amongst them) are likely to be solved before too long.”

Enterprise architects are town planners not architects

The word “architect” is derived from the Greek arkhitekton (arkhi, “chief” + tekton, “builder”) and in the real world this is just what they are – chief builders – they plan, design and oversee a building’s construction, translating their clients’ needs into a physical building.

Tokyo Skyline

Now in the world of IT we have ‘enterprise architects’ who, unlike real architects, spend too much time in the rarefied air of extreme abstraction rather than down close to actual user problems and code. These are what Joel Spolsky calls architecture astronauts.

When you go too far up, abstraction-wise, you run out of oxygen. Sometimes smart thinkers just don’t know when to stop, and they create these absurd, all-encompassing, high-level pictures of the universe that are all good and fine, but don’t actually mean anything at all.

These are the people I call Architecture Astronauts. It’s very hard to get them to write code or design programs, because they won’t stop thinking about Architecture. They’re astronauts because they are above the oxygen level, I don’t know how they’re breathing. They tend to work for really big companies that can afford to have lots of unproductive people with really advanced degrees that don’t contribute to the bottom line. […]

Another common thing Architecture Astronauts like to do is invent some new architecture and claim it solves something. Java, XML, Soap, XmlRpc, Hailstorm, .NET, Jini, oh lord I can’t keep up.

Enterprise architects aren’t involved in the specifics of actual code, functions and operations; instead they focus on the strategic, the low-detail organisational breadth, on process architecture. As Jeff Atwood might put it they are talkers not doers.

Software isn’t about methodologies, languages, or even operating systems. It is about working applications. At Adobe I would have learned the art of building massive applications that generate millions of dollars in revenue. Sure, PostScript wasn’t the sexiest application, and it was written in old school C, but it performed a significant and useful task that thousands (if not millions) of people relied on to do their job. There could hardly be a better place to learn the skills of building commercial applications, no matter the tools that were employed at the time. I did learn an important lesson at ObjectSpace. A UML diagram can’t push 500 pages per minute through a RIP.

There are two types of people in this industry. Talkers and Doers. ObjectSpace was a company of talkers. Adobe is a company of doers. Adobe took in $430 million in revenue last quarter. ObjectSpace is long bankrupt. [Christopher Baus]

Don’t get me wrong, patterns and practices have their place, but they need to be framed in the context of a specific user problem – that’s what architects (the ones who are responsible for buildings and bridges) do.

If you decide to place too many layers of abstraction between yourself, your solution and a specific real world end user problem then you aren’t a chief builder. You’re more like a town planner.

Town planners don’t design buildings. They research large scale requirements, draw up and consult on their plans and map out how a town should develop over the next few years. This job needs doing but it is different from architecture. We need town planners to help make sure all the parts of a town fit together, to make sure that the town’s infrastructure is adequate. To make sure the town’s interoperability is looked after.

But you obviously don’t want town planners to design a town’s buildings. Likewise you don’t want decisions that impact a project’s implementation to be governed by Enterprise Architects. As Roger Sessions concludes:

The general rule of thumb is that decisions that impact interoperability should be centralized. Decisions that impact implementations should be decentralized. Not knowing which decision is which is a common error in enterprise architectural departments.

Let’s look at some typical errors that come up in enterprise architectures:

Platform—Many organizations attempt to define a standard software platform, often debating endlessly between, say, Microsoft .NET, IBM’s WebSphere, or BEA’s WebLogic. This effort is misplaced. Platform is an implementation decision, and it has no bearing on how the applications on those platforms will work together. As long as the platform meets the organization’s interoperability requirements, the application team should be given latitude to choose the best platform for their application’s needs.

Data—Many organizations attempt to define a single data layer that will be shared by all applications in the organizations. This effort is often expensive and rarely successful. How data is stored should be treated as an implementation detail of an application.

Business Intelligence—Most organizations treat data and business intelligence interchangeably. Whereas data (such as how a customer is stored in a database) is an implementation detail, intelligence (such as what business we have conducted with a given customer) is an organizational asset. It is appropriate to decide how such intelligence will be shared. It is not appropriate to decide (at the enterprise level) how applications will keep track of the data that feeds this intelligence.

Code Sharing—Many organizations believe that reuse is achieved through code sharing. It is somewhat amazing that this belief persists, despite decades of failure to achieve this result. The best way to reduce the amount of code that a given project needs is through delegation of functionality (as in the case of Web services), not through code sharing.

Web Service APIs—Many organizations believe, correctly, that the use of Web services is critical to achieving interoperability. Many organizations think that this means that the way applications use the Web service APIs should be standardized. In reality, the Web service APIs are far below the level of concern of applications. Applications typically make use of a buffering layer that is vendor-specific—for example, the Windows Communications Framework layer provided by the Microsoft .NET platform. The purpose of this layer is to insulate applications from needing to understand the intricacies of the Web service APIs. This buffering layer is specific to the platform, and therefore, it is part of the implementation details of the application.

The bottom line is that enterprise architecture is not the same thing as application architecture. Application architecture is about the design of applications. These designs should be the responsibility of the group owning the applications. This is one of the ways we achieve economies of small scale. Enterprise architects should worry about how those applications work together, and thereby provide better value to the organization.

Photo: Tokyo Skyline Panograph, by Chalky Lives. Used under licence.

Perl on Rails

I’ve just published a post over at the BBC’s Radio Labs blog about ‘Perl on Rails’, the MVC framework we’ve written to dynamically publish /programmes and /music (next year’s project).

This isn’t quite as insane as it might appear. Remember that we have some rather specific non-functional requirements. We [the BBC] need to use Perl, there are restrictions on which libraries can and can’t be installed on the live environment and we needed a framework that could handle significant load. What we’ve built ticks all those boxes. Our benchmarking figures point to significantly better performance than Ruby on Rails (at least for the applications we are building), it can live in the BBC technical ecosystem and it provides a familiar API to our web development and software engineering teams with a nice clean separation of duties with rendering completely separated from models and controllers.

We’ve also adopted an open source approach to its development which is already starting to bear fruit and is personally and professionally hugely rewarding.

In general the BBC’s Software Engineering community is pretty good at sharing code. If one team has something that might be useful elsewhere then there’s no problem in installing it and using it elsewhere. What we’re not so good at is coordinating our effort so that we can all contribute to the same code base – in short we don’t really have an open source mentality between teams – we’re more cathedral and less bazaar even if we freely let each other into our cathedrals.

With the Perl on Rails framework I was keen to adopt a much more open source model – and actively encouraged other teams around the BBC to contribute code – and that’s pretty much what we’ve done. In the few weeks since the programmes beta launch JSON and YAML views have been written – due to go live next month. Integration with the BBC’s centrally managed controlled vocabulary – to provide accurate term extraction and therefore programme aggregation by subject, person or place – is well underway and should be with us in the new year. And finally the iPlayer team are building the next generation iPlayer browse using the framework. All this activity is great news. With multiple teams contributing code (rather than forking it) everyone benefits from faster development cycles, less bureaucracy and enhanced functionality.
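To show why new views such as JSON can be bolted on by another team without disturbing anything else, here is a rough sketch of rendering kept apart from the controller. It is written in Python purely for illustration; it is not the framework's code or API, and the function names, the `store` dictionary and the programme identifier are all invented for the example.

```python
import json
from html import escape


def programme_controller(store: dict, pid: str) -> dict:
    """Controller: look up a programme and return plain data, no markup."""
    programme = store[pid]
    return {"pid": pid, "title": programme["title"]}


def html_view(data: dict) -> str:
    """View: render the controller's data as an HTML fragment."""
    return f"<h1>{escape(data['title'])}</h1>"


def json_view(data: dict) -> str:
    """View: render exactly the same data as JSON."""
    return json.dumps(data, sort_keys=True)


# Adding a format means adding one entry here; models and the
# controller are untouched.
VIEWS = {"html": html_view, "json": json_view}


def render(store: dict, pid: str, fmt: str) -> str:
    """Run the controller, then hand its data to the requested view."""
    return VIEWS[fmt](programme_controller(store, pid))


# Hypothetical data standing in for the model layer.
store = {"b0000001": {"title": "Example & Co"}}
html_page = render(store, "b0000001", "html")
json_page = render(store, "b0000001", "json")
```

A YAML view would be one more function of the same shape registered in `VIEWS`, which is what makes this kind of contribution easy to accept from outside the core team.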

UPDATE (2007-12-05)

We’re releasing the ‘Perl on Rails’ code under an open source license. James has just written about this and about the BBC’s infrastructure over at the BBC’s Internet blog in response to I am Seb’s post.

UPDATE (2007-12-03)

Wow! Well this certainly generated quite a lot of interest. I’m not sure that I will be able to address everyone’s issues – especially all the folk over at Slashdot – but I’ll write another post to address as much as I can. In the meantime I just wanted to pick up on a few of the major themes.

Why didn’t we use Catalyst or something else that already existed? As Duncan indicated the simple answer is because we can’t install Catalyst etc. on the live environment. The BBC’s infrastructure is limited to Perl 5.6 with a limited set of approved modules and there are further limitations on what is allowed (making unmoderated system calls etc.)

Access to the code: I’ll see what I can do. The BBC does open source some of its code at http://www.bbc.co.uk/opensource/, I don’t know if we will be able to open source this code but I’ll let you know. However, it’s worth bearing in mind that we ended up writing this app to work within the BBC’s infrastructure (Perl 5.6, BBC approved modules etc.) so if we did release the code under an OSS license we would still need to maintain this requirement (clearly the code could be forked etc.)

Too many files on a file system: @Viad R. nice solution and yes that solves the seek time issue – unfortunately it doesn’t solve the other problems we needed solving. These include building the sort of interactive products we want to build, and maintaining up-to-date links between pages – when we publish information about a programme we not only need to publish a page for that programme but also update a large number of other pages that should now link to it, or remove links from those that shouldn’t – trying to work out which pages to update becomes a nightmare; it’s much easier to render the pages dynamically.

BBC web infrastructure – not sure where to start with this one. You may find this hard to believe but the vast majority of bbc.co.uk is published statically – people write HTML and FTP those files onto the live servers. This provides a reliable service, if not a dreadfully exciting one. It’s also clearly restrictive which is why we needed to build a solution like this in the first place, rather than using an existing framework. Now I’m very aware that this whole endeavor – building ‘Perl on Rails’ – seems bonkers to most people outside the BBC. But what isn’t bonkers is the fact that we have built an elegant application (with a small team) and deployed it within the constraints of the infrastructure and in doing so delivered a new product to our users and helped move the debate on inside the BBC.

UPDATE (2007-12-02)

Since I posted about this work there’s been quite a lot of chat about why we didn’t simply use an existing framework – like Catalyst for example. The simple answer is we can’t – we are restricted in what can be installed on the live environment (Perl 5.6 etc.) ‘I am Seb’ has some more information on the infrastructure. Believe me if we could have simply used an existing framework then we would have done – we all want to build great audience facing services – unfortunately to get there we sometimes need to do some unusual foundation work first. Still now that we’ve done it we can get on with building some great web apps.

Hudson Bay Start – reducing project risk

The Hudson’s Bay Company (HBC) was founded by King Charles II in 1670 and that makes it the oldest commercial company in the English-speaking world. For centuries the HBC controlled the fur trade throughout much of British-controlled North America, in an area known as Rupert’s Land, undertaking early exploration and functioning as the de facto government in many areas of the continent.

Hudson Bay Company traders

Fur trapping and exploration in Rupert’s Land in the 1600s must have been a high risk job to say the least. The trappers would have to survive on their own in the wilderness so having the right supplies in the correct amounts was literally a matter of life and death. To minimise this risk the trappers developed a technique known as the ‘Hudson Bay Start’.

The Hudson Bay Start was a test. A test to make sure the trappers had the right equipment before they headed off into the wilderness. They would pack up as if they were going to leave for the season but only canoe a short distance and camp overnight to make sure they had neither forgotten anything, nor taken too much. They would then return home, correct any mistakes, before heading out for real. Clearly this approach didn’t come for free – the trapping season was quite short – so while it reduced risk it did present an opportunity cost and might be seen as having an impact on their shipping schedule. A similar approach can be applied to large software projects.

Large scale software projects are high risk. The bigger the project the higher the risk, both in terms of impact (if it goes wrong the company will lose more money) and the rate of occurrence, because big projects are more complex and are more likely to contain unk-unks (unknown unknowns). The idea of a Hudson Bay Start is to expose some of those unk-unks and assumptions and therefore reduce some of your project risks, as Jerry Weinberg explains:

We take a trivial problem – really trivial like putting HELLO WORLD on a single screen, and run it through the entire development process – tools, people, documents, test plans, testing, or whatever they’re planning to use. It’s amazing the things that shake out in an hour or two. What’s even more amazing is how many clients are reluctant to do this – “It’s a waste of time; we’ve got an ambitious five-year project and we can’t spare a half day on an exercise.”

This is similar to spikes and tracer bullets in agile methodologies like Scrum. Except with tracer bullets you are working on a narrow implementation of a larger user story to help fine tune what needs to be built, helping you aim more accurately; and with spikes you are building quick and dirty throw away code to gain knowledge. Whereas with a Hudson Bay start you’re less interested in the actual product and more interested in making sure the development and deployment process works.
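As a rough sketch of what such a dry run might look like in code, the snippet below pushes HELLO WORLD through a list of stages and records which ones fall over. The stage functions are stand-ins invented for the example (a real run would call your actual build, test and deployment tooling), and the choice to carry on after a failure, so that one pass exposes as many problems as possible, is an assumption of this sketch.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], str]]


def hudson_bay_start(stages: List[Stage],
                     payload: str = "HELLO WORLD") -> List[Tuple[str, str]]:
    """Push a trivial payload through every stage and report what happened."""
    report = []
    for name, stage in stages:
        try:
            payload = stage(payload)
            report.append((name, "ok"))
        except Exception as exc:  # keep going: we want every problem surfaced
            report.append((name, f"failed: {exc}"))
    return report


# Hypothetical stages standing in for the real process.
def build_stage(text: str) -> str:
    return text.strip()


def check_stage(text: str) -> str:
    if text != "HELLO WORLD":
        raise ValueError("unexpected output")
    return text


def deploy_stage(text: str) -> str:
    # The kind of surprise the exercise is meant to flush out early.
    raise RuntimeError("no credentials for the live server")


report = hudson_bay_start([
    ("build", build_stage),
    ("test", check_stage),
    ("deploy", deploy_stage),
])
for name, outcome in report:
    print(f"{name}: {outcome}")
```

The payload is deliberately boring; the report on the process is the only output that matters.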