The Art Of The Bodge

A bodge, a kludge, a hack: all essentially the same thing, a quick and dirty solution that can be inefficient, clumsy or hard to maintain. Now I bet you are wondering how you could ever have a good bodge. Though I like the description above, I prefer the definition “engineering but with larger tolerances”. Just because something is ‘spaghetti code’ does not mean it is not justified; many use the bodge as a tool to speed up production when a total redesign of a solution is impractical.

UTF-8 and GREP

Many of you reading this have most likely heard of Ken Thompson, co-creator of the original Unix operating system. Thompson was a master of the bodge, his most notable being the design of UTF-8, the character encoding used by almost 92% of the World Wide Web today.

A little history here. In the 1960s the Americans came up with ASCII, a clever way of encoding characters in 7 bits. Then 8-bit bytes came along, leaving room for a whole 128 other possible characters in the encoding, which different countries and vendors filled in their own incompatible ways. This was fine up until the 1990s, when the World Wide Web arrived. People now wanted to send their ASCII-encoded emails from America to the multibyte encodings of Japan. This caused so many problems that the Japanese even came up with the word ‘mojibake’, meaning garbled text as the result of decoding using an unintended character encoding. Due to this, the Unicode Consortium was created to impose a single standard on the more than 100,000 characters of the world.
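To see mojibake for yourself, here is a minimal sketch in Python. The helper name garble is made up for this illustration, and Shift JIS and Latin-1 are just two example encodings picked to stand in for a Japanese sender and a Western receiver.

```python
def garble(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the bytes with another.

    'garble' is a hypothetical helper for this illustration only.
    """
    return text.encode(written_as).decode(read_as)


# Japanese text written as Shift JIS but read as Latin-1 turns into noise.
original = "文字化け"  # the word 'mojibake' itself
garbled = garble(original, "shift_jis", "latin-1")
print(repr(garbled))

# No bytes were lost, so reversing the mistake recovers the text.
restored = garbled.encode("latin-1").decode("shift_jis")
print(restored == original)  # True

# The same thing happens to accented Western text.
print(garble("é", "utf-8", "latin-1"))  # Ã©
```

The bytes never change in any of this; only the guess about what they mean does, which is why agreeing on one encoding fixes the problem.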

Moving forward, let’s look at UTF-8. It had to encode those 100,000 characters, which a fixed-width encoding would store in 32 bits per character. Unicode numbers the English characters in the same way as ASCII (if it isn’t broken, don’t fix it), meaning a fixed-width encoding would put a lot of zeros in every English character you type, around 24 bits of wasted space each. Another problem was that old systems seeing eight 0s in a row would treat them as a NULL and stop reading. And finally, as if that wasn’t bad enough, the whole thing had to be backwards compatible. So UTF-8 starts by taking the original 7-bit ASCII encoding and putting an extra 0 on the front to make 8 bits. If you want something higher than this, you add headers to the bytes. For two bytes, the first three bits of the first byte are 110: two 1s, two bytes. The next byte starts with 10, a continuation flag. The remaining bits are filled in with the number of the character. This continues, with each extra byte adding a 1 to the header of the first byte. The original design went up to 1111110, or six bytes; the standard today stops at 11110, or four bytes. Not only did this fix every problem posed by the different encodings, it was also simple enough for the whole scheme to be sketched on a mere placemat, which it was, by Ken Thompson and Rob Pike in a New Jersey diner in 1992, when they invented UTF-8.
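The header-and-continuation scheme is small enough to write out by hand. The following is a minimal sketch in Python, not Thompson and Pike’s original code; the function names utf8_encode and as_bits are made up for this illustration, and it follows the modern four-byte limit rather than the longer sequences of the 1992 design.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point using the UTF-8 bit layout."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:
        # 0xxxxxxx: plain ASCII with a leading 0
        return bytes([code_point])
    if code_point < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def as_bits(data: bytes) -> str:
    """Show each byte as eight binary digits."""
    return " ".join(format(byte, "08b") for byte in data)


print(as_bits(utf8_encode(ord("A"))))  # 01000001
print(as_bits(utf8_encode(ord("é"))))  # 11000011 10101001
print(utf8_encode(ord("€")) == "€".encode("utf-8"))  # True
```

Notice that no byte of a multi-byte character is ever all zeros, because every one of them starts with a 1. That is what keeps old NULL-terminated systems from stopping halfway through a character.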

However, this wasn’t Ken’s first bodge. In 1974 a friend of his, Lee E. McMahon, was attempting to analyze the text of the old Federalist Papers. The ed editor Ken had developed for the Unix system supported regular expressions, but only on a smaller scale, so was unable to help McMahon. Therefore, Ken decided to take the code from the ed editor and bodge it into its own standalone tool overnight, one that would Globally search for a Regular Expression and Print the matching lines, also known as GREP, a now standard command in UNIX.
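The idea behind the tool fits in a few lines. This is a rough sketch in Python and not a reconstruction of Ken’s code: the function name grep and the sample lines are made up for this illustration, and Python’s re module uses a different regular expression syntax from ed.

```python
import re
from typing import Iterable, Iterator


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield every line that contains a match for the regular expression."""
    regex = re.compile(pattern)
    for line in lines:
        # Look at one line at a time, so the whole text never
        # has to be held in memory at once.
        if regex.search(line):
            yield line


sample = ["a bodge", "a kludge", "a badge"]
for match in grep(r"b.dge", sample):
    print(match)
```

Because it takes any iterable of lines, the same function works on an open file as well as on a list.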

Buffer Overflows


Bodging Isn’t A Magic Fix

In general, the bodge is an approach often brought on by having to implement solutions for systems that are so tied down that there is little else you can do but bodge it the best you can. However, developers can also feel the need to draw on the art of the bodge when pressured into creating quick fixes to meet tight deadlines.

Having a developer implement a bodge could be sustainable for smaller teams, but as a team grows and adapts, coming across that bodge several months later could result in serious problems. This ‘bodge now, deal with it later’ pattern is a common problem for Agile projects lacking proper management. However, the need to bodge can be reduced by running proper retrospectives, estimating accurately the time it takes to complete a task and remembering the Agile manifesto: “Individuals and interactions over processes and tools”.

As much as I have praised the art of the bodge, it is worth remembering that for every successful bodge there are a hundred more that can destroy a system and your employees’ enthusiasm.

//Temporary conclusion, proper conclusion to come in phase 2 of this article
