Perl on Rails

I’ve just published a post over at the BBC’s Radio Labs blog about ‘Perl on Rails’, the MVC framework we’ve written to dynamically publish /programmes and /music (next year’s project).

This isn’t quite as insane as it might appear. Remember that we have some rather specific non-functional requirements. We [the BBC] need to use Perl, there are restrictions on which libraries can and can’t be installed on the live environment, and we needed a framework that could handle significant load. What we’ve built ticks all those boxes. Our benchmarking figures point to significantly better performance than Ruby on Rails (at least for the applications we are building), it can live in the BBC technical ecosystem, and it provides a familiar API to our web development and software engineering teams, with a clean separation of duties: rendering is completely separated from models and controllers.

We’ve also adopted an open source approach to its development which is already starting to bear fruit and is personally and professionally hugely rewarding.

In general the BBC’s Software Engineering community is pretty good at sharing code. If one team has something that might be useful elsewhere then there’s no problem in installing it and using it elsewhere. What we’re not so good at is coordinating our effort so that we can all contribute to the same code base – in short we don’t really have an open source mentality between teams – we’re more cathedral and less bazaar even if we freely let each other into our cathedrals.

With the Perl on Rails framework I was keen to adopt a much more open source model – and actively encouraged other teams around the BBC to contribute code – and that’s pretty much what we’ve done. In the few weeks since the programmes beta launch, JSON and YAML views have been written – due to go live next month. Integration with the BBC’s centrally managed controlled vocabulary – to provide accurate term extraction and therefore programme aggregation by subject, person or place – is well underway and should be with us in the new year. And finally the iPlayer team are building the next generation iPlayer browse using the framework. All this activity is great news. With multiple teams contributing code (rather than forking it) everyone benefits from faster development cycles, less bureaucracy and enhanced functionality.
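To give a flavour of why new views such as JSON can be added cheaply when rendering is kept apart from models and controllers, here is a minimal sketch. It is written in Python purely for illustration and is not the framework’s Perl code; the function names, the `VIEWS` table and the programme identifier are all hypothetical.

```python
import json
from html import escape

# Hypothetical model: returns plain data and knows nothing about rendering.
def find_programme(pid):
    programmes = {"b0000001": {"pid": "b0000001", "title": "Example Programme"}}
    return programmes[pid]

# Views: each one turns the same data into a different representation.
def render_html(programme):
    return "<h1>{}</h1>".format(escape(programme["title"]))

def render_json(programme):
    return json.dumps(programme, sort_keys=True)

# Adding another format (YAML, say) would mean adding one more entry here.
VIEWS = {"html": render_html, "json": render_json}

# Hypothetical controller: fetches the data, then hands it to the chosen view.
def show_programme(pid, fmt="html"):
    return VIEWS[fmt](find_programme(pid))
```

In a layout like this, a team contributing a new view never has to touch the model or the controller, which is roughly the property that makes shared contribution practical.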

UPDATE (2007-12-05)

We’re releasing the ‘Perl on Rails’ code under an open source license. James has just written about this and about the BBC’s infrastructure over at the BBC’s Internet blog in response to the ‘I am Seb’ post.

UPDATE (2007-12-03)

Wow! Well, this certainly generated quite a lot of interest. I’m not sure that I will be able to address everyone’s issues – especially all the folk over at Slashdot – but I’ll write another post to address as much as I can. In the meantime I just wanted to pick up on a few of the major themes.

Why didn’t we use Catalyst or something else that already existed? As Duncan indicated, the simple answer is that we can’t install Catalyst etc. on the live environment. The BBC’s infrastructure is limited to Perl 5.6 with a limited set of approved modules, and there are further limitations on what is allowed (making unmoderated system calls etc.).

Access to the code: I’ll see what I can do. The BBC does open source some of its code at http://www.bbc.co.uk/opensource/. I don’t know if we will be able to open source this code but I’ll let you know. However, it’s worth bearing in mind that we ended up writing this app to work within the BBC’s infrastructure (Perl 5.6, BBC approved modules etc.), so if we did release the code under an OSS license we would still need to maintain this requirement (clearly the code could be forked etc.).

Too many files on a file system: @Viad R. nice solution, and yes, that solves the seek time issue – unfortunately it doesn’t solve the other problems we needed solving. It doesn’t help us build the sort of interactive products we want to build, nor does it help us maintain up-to-date links between pages. When we publish information about a programme we not only need to publish a page for that programme but also update a large number of other pages that should now link to it, or remove links from those that shouldn’t. Trying to work out which pages to update becomes a nightmare; it’s much easier to render the pages dynamically.
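The link maintenance point can be sketched in a few lines. This is an illustrative Python toy, not our Perl code, and the in-memory `programmes` list, the `subject_page` function and the identifiers are all hypothetical. The idea is simply that when a page is rendered at request time, its links are derived from the current data, so nothing has to work out which existing pages need republishing.

```python
# Hypothetical in-memory store standing in for the programme database.
programmes = [
    {"pid": "p1", "title": "First Show", "subject": "science"},
]

def subject_page(subject):
    """Build the links for a subject page from whatever data exists right now."""
    links = [
        '<a href="/programmes/{pid}">{title}</a>'.format(**p)
        for p in programmes
        if p["subject"] == subject
    ]
    return "\n".join(links)

before = subject_page("science")

# Publishing a new programme only changes the data; no other page is rewritten.
programmes.append({"pid": "p2", "title": "Second Show", "subject": "science"})

after = subject_page("science")
```

With static publishing, the equivalent of that `append` would also have to find and regenerate every page that lists science programmes.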

BBC web infrastructure – not sure where to start with this one. You may find this hard to believe but the vast majority of bbc.co.uk is published statically – people write HTML and FTP those files onto the live servers. This provides a reliable service, if not a dreadfully exciting one. It’s also clearly restrictive, which is why we needed to build a solution like this in the first place, rather than using an existing framework. Now I’m very aware that this whole endeavor – building ‘Perl on Rails’ – seems bonkers to most people outside the BBC. But what isn’t bonkers is the fact that we have built an elegant application (with a small team) and deployed it within the constraints of the infrastructure, and in doing so delivered a new product to our users and helped move the debate on inside the BBC.

UPDATE (2007-12-02)

Since I posted about this work there’s been quite a lot of chat about why we didn’t simply use an existing framework – like Catalyst for example. The simple answer is we can’t – we are restricted in what can be installed on the live environment (Perl 5.6 etc.). ‘I am Seb’ has some more information on the infrastructure. Believe me, if we could have simply used an existing framework then we would have done – we all want to build great audience facing services – unfortunately to get there we sometimes need to do some unusual foundation work first. Still, now that we’ve done it we can get on with building some great web apps.

Living with Code

I’ve been responsible for projects of widely different sizes and complexity, from small production jobs – with no real software development (just visual design and integration) – through to large scale application developments, which we’ve then needed to live with, to maintain and develop. And big projects, especially products (that need to be reused), aren’t scaled up small projects. They need to be managed differently, not just because the projects are more complex but also because you will need to be able to live with the code for years to come.

Developing a large application isn’t the same as developing a small application. For starters, large projects are significantly more complex than small projects – they have more interdependencies, there are more unknowns, indeed more unk unks. A project that is three times as big isn’t three times as complex – it’s 12 times more complex, at least. As Steve McConnell puts it, just because you can build a doghouse doesn’t mean you can build a skyscraper.

“People who have written a few small programs in college sometimes think that writing large, professional programs is the same kind of work – only on a larger scale. It is not the same kind of work. I can build a beautiful doghouse in my backyard in a few hours. It might even take first prize at the county fair’s doghouse competition. But that does not imply that I have the expertise to build a skyscraper. The skyscraper project requires an entirely more sophisticated kind of expertise. The difference in complexity between student programs and professional programs can be just as great, and non-professional programmers underestimate the difference in required expertise at their own peril.”

And to make matters worse, if you’ve not attempted to build a doghouse, skyscraper or fort before, it’s likely that your estimates will be rubbish – since it’s likely that when planning your work you’ll miss off important tasks, underestimate unfamiliar tasks, that sort of thing. Clearly having team members that have worked on similar applications before helps, as does adopting an agile project management approach.

Agile software development isn’t necessary with small projects – because it’s reasonable to assume the requirements are stable, there shouldn’t be any unk unks and you’ve probably built a doghouse before. But as applications get bigger, as projects become more complex, those assumptions don’t hold true. And you need to be able to manage the new information as it arises. But that’s not the whole story.

The code also needs to be easily maintained by your development team – they need code they can live with – and that doesn’t mean the team that built the application, because if you are building a product that will be used, maintained and developed over a number of years, the software engineers that end up looking after the code won’t be the people that wrote it. With small projects the code can usually be treated as throw away code – there’s no desperate need to support it.

Clearly some technologies make writing supportable code easier than others. For example, while there is nothing inherently wrong with PHP – indeed it has many advantages, it’s easy to learn and quick to develop with – it’s all too easy to end up with spaghetti code, because PHP promotes conflating the business logic and presentation, and this can make it painful to maintain, extend or debug. As Tim Bray points out:

“…based on my limited experience with PHP (deploying a couple of free apps to do this and that, and debugging a site for a non-technical friend here and there): all the PHP code I have seen in that experience has been messy, unmaintainable crap. Spaghetti SQL wrapped in spaghetti PHP wrapped in spaghetti HTML, replicated in slightly-varying form in dozens of places.”

Don’t get me wrong, I’ve nothing against PHP; you can write maintainable PHP code. The trouble is it’s also easy not to – so depending on the experience of the team it might be prudent to choose a technology that promotes good practice. You will want to extend the application – to deal with new requirements – so architect your application to make it easier to extend.