Code Archeology

I have become very interested in patterns that cause software code to end up in the famous big ball of mud. One of the phenomena that I want to explore today is the fact that every business decision leaves a trace. This article includes a tourist guide for those interested in diving into the code of an ancient culture (i.e. your organization five years ago).

A feature that sets software development apart from other engineering disciplines is that it is possible to discover what you want to build while you are building it. The image that is sometimes used is that of a plane being assembled mid-flight. This quality of our industry is more an opportunity than a threat. Using this power wisely can lead to a better fit between a software product and the problem it attempts to solve.

Nevertheless, frequent changes of direction have an impact on the fabric of the product. If not managed carefully, they may lead to accumulating technical debt as the team moves to different locations in the solution space without fully erasing the traces of past decisions from the code.

The reason why this happens goes back to the very nature of the craft of software development. In its essence, developing software is an exercise in managing complexity. You might want to build a system that sells thousands of books in parallel, making sure at every step that no customer accidentally pays for a book that is out of stock, that payment processing failures are handled correctly, and that customers with only a vague idea of what they are looking for will find the right title. This is a complex challenge across multiple dimensions!

One tool to address complexity is decomposition. Buying a book can be broken down into the subproblem of searching for a book, the subproblem of handling payments, and the subproblem of order fulfillment, for example.
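To make this concrete, here is a minimal sketch of such a decomposition. All names (`Book`, `find_book`, `take_payment`, `ship_copy`, `buy_book`) are hypothetical and invented for illustration; a real shop would hide a search index, a payment provider, and a warehouse system behind each of these functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Book:
    title: str
    price_cents: int
    copies_in_stock: int


def find_book(catalog: List[Book], query: str) -> Optional[Book]:
    """Subproblem 1, search: return the first title containing the query."""
    for book in catalog:
        if query.lower() in book.title.lower():
            return book
    return None


def take_payment(book: Book, wallet_cents: int) -> int:
    """Subproblem 2, payment: return the remaining balance or raise."""
    if wallet_cents < book.price_cents:
        raise ValueError("insufficient funds")
    return wallet_cents - book.price_cents


def ship_copy(book: Book) -> None:
    """Subproblem 3, fulfillment: take one copy out of stock."""
    book.copies_in_stock -= 1


def buy_book(catalog: List[Book], query: str, wallet_cents: int) -> int:
    """Compose the three subproblems into the overall purchase."""
    book = find_book(catalog, query)
    if book is None:
        raise LookupError("no matching title")
    # Check stock before charging, so nobody pays for an unavailable book.
    if book.copies_in_stock == 0:
        raise ValueError("out of stock")
    remaining = take_payment(book, wallet_cents)
    ship_copy(book)
    return remaining
```

Each function can now be understood, tested, and replaced on its own, and `buy_book` is the only place that has to know in which order the steps run.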

Another common tool is abstraction. For example, by designing an interface that represents any kind of payment you might avoid the complexity of having to handle different payment methods at several locations.
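As a rough sketch of what such an interface could look like, assuming two made-up payment methods: the names `PaymentMethod`, `CardPayment`, `VoucherPayment`, and `checkout` are hypothetical and not taken from any real payment library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class PaymentMethod(ABC):
    """Hypothetical interface: anything that can charge an amount in cents."""

    @abstractmethod
    def charge(self, amount_cents: int) -> str:
        """Charge the amount and return a receipt line."""


@dataclass
class CardPayment(PaymentMethod):
    last_four_digits: str

    def charge(self, amount_cents: int) -> str:
        return f"card ending {self.last_four_digits} charged {amount_cents} cents"


@dataclass
class VoucherPayment(PaymentMethod):
    balance_cents: int

    def charge(self, amount_cents: int) -> str:
        if amount_cents > self.balance_cents:
            raise ValueError("voucher balance too low")
        self.balance_cents -= amount_cents
        return f"voucher charged {amount_cents} cents"


def checkout(price_cents: int, payment: PaymentMethod) -> str:
    # The checkout code depends only on the interface,
    # not on any concrete payment method.
    return payment.charge(price_cents)
```

Adding a new payment method then means adding one class, while `checkout` and every other caller stay untouched. The flip side, as discussed next, is that the interface encodes a guess about which variations will matter in the future.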

Often abstractions and decompositions are chosen in anticipation of future use cases (one of the reasons why your engineers should ideally have a clear understanding of the business context, see also Solution Spaces and Problem Spaces). Maybe your team decides to implement an abstraction that supports the purchase of any kind of product, not just books.

These abstractions can turn out to be premature. Instead of extending to other products, the business might decide to go all-in on books and offer them in different formats instead. We can now end up in a situation where the old abstraction either gets in the way or at least does not make a lot of sense anymore. Usually, revisiting all of the invalidated decisions of the past would be too expensive an exercise. Consequently, residues will remain in the code.

This leads to a point where, a couple of years down the line, a new joiner will wonder why the code base of what has now become a streaming service for spoken Harry Potter fan-fiction contains a module that supports the sale of cucumbers.

A Tourist Guide to Historically Grown Code

For those who don’t mind getting their hands dirty digging in old code, this can be a treasure chest that tells you a lot about the past of the organization you work for. If the code predates your time with the organization, you may have to find a senior colleague as a guide.

Here are some examples of artifacts that you may encounter during your expedition:

  • Overly broad abstractions are sometimes signs of particularly ambitious phases during the life of this project. Maybe these are traces of a golden age?
  • Very quickly thrown together pieces of code that implement one specific use case hint at a particular customer or prospect that the organization wanted to impress at the time.
  • Observe the coding style of each module. Code stemming from an academic environment is usually very different from that of an “enterprise Java” programmer. The latter might add SingletonFactoryBeans all over the place, the former might have named their variables by single letters in alphabetical order. This gives you a hint about the hiring practices of the organization at that time.
  • Watch out for buzzwords that can be used to date code to the phase in history where a certain approach or technology was hyped. Any hints about semantic web? XSLT transformations? SOAP? Java Applets?

Another typical artifact concerns the naming of concepts: things that were introduced for a certain purpose may have shifted in meaning, leaving them with an unintuitive name. Maybe your shop of physical books managed a list of transactions under the concept of “Shipments”, which later changed to include ebook downloads. Noting the original meaning of an unintuitive name can give a hint about the original purpose of a component.
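In code, such a fossil might look like the following sketch. The `Shipment` class, its fields, and the `describe` helper are hypothetical and only illustrate how a name can outlive its original meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shipment:
    """Hypothetical record, named back when every order was a parcel.

    It now represents any transaction, including ebook downloads,
    for which nothing is shipped at all.
    """

    order_id: int
    title: str
    tracking_number: Optional[str] = None  # set for physical parcels only
    download_token: Optional[str] = None   # added later for ebooks

    def is_physical(self) -> bool:
        return self.tracking_number is not None


def describe(shipment: Shipment) -> str:
    if shipment.is_physical():
        return f"order {shipment.order_id}: parcel {shipment.tracking_number}"
    return f"order {shipment.order_id}: download, nothing is shipped"
```

The optional fields are the giveaway: one of them is always empty, which suggests that two different concepts are sharing a name that only ever fitted the first one.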

Corollaries

Amusing as it may be to dig into old code, of course, the result of all of these inconsistencies is a product that is harder to maintain. But hey, if the product survived long enough to require maintenance, that is already good news! How to manage the software development process to minimize the negative impact of this effect is a different story and a question for another blog post. Nevertheless, I want to point at two small corollaries of the above.

Firstly, it is sometimes said that documentation is worthless in software projects as it is probably going to be outdated in a couple of weeks. I agree that documentation will quickly be out of date, but certain kinds of documentation can nevertheless be useful. Logging why you made a decision and what the purpose of a specific abstraction is can provide the missing historical context for particularly unintuitive parts of the code. By documenting important technical decisions, you could become the Plutarch of your codebase.

A second corollary is that it makes no sense to build a “good design” too early in the process. You will sometimes meet teams with the understandable urge to “do it right from day one” on the next greenfield project. I think that this urge needs to be tamed as the definition of a “good design” will likely change as the project progresses.

3 thoughts on “Code Archeology”

  1. When to improve the design? Cross the bridge when you reach it!

    In the 1990s, multimedia technologies started to deliver “impressive” results such as videos on computer screens and graphical animations of technical processes. Companies that developed training material and courses therefore started exploring computer-based training (CBT) applications. At that time our software department had trouble finding new projects, and we were looking for new opportunities. We got in contact with a small company that produced videos for all kinds of purposes. They had the idea to extend their portfolio and offer CBT programs that teach technicians to maintain and repair technical devices such as ATMs. Using the multimedia platform ToolBook, I built a prototype system and succeeded in winning a contract for 5 small CBT programs based on the video material we received from our customer.
    My intention was to build on the prototype and use a copy & paste approach, and I had calculated a reasonable price for the customer on that basis. But my colleague, an excellent software engineer who helped me to meet the planned shipping dates, argued strongly against this strategy. It was clearly not in line with the principles of good software architecture. It seemed to me that following my strategy caused him physical pain. We therefore dropped the prototype and invested some time in a well-structured platform suitable for building hundreds of CBT programs.
    Unfortunately, it turned out that the CBTs were not well accepted by the technicians. My own impression was that the animations were quite useful, but that it was very hard for learners to follow the videos. As a result, we did not receive follow-up contracts and I had to go to Canossa and report a 20% loss to my boss.
    My conclusion was, and is, that it’s a good idea to have a First Things First (FTF) heuristic. That is, the project team should inspect the problem space carefully, compile a list of issues, and keep it up to date. In each phase, it should spend its resources on solving the most critical ones. In our example, user acceptance and the didactic quality of the videos were most critical. It was too early to solve maintenance or scalability issues.
    Of course, it helps a lot if the project team has a realistic and unbiased understanding of the problem to solve. As a software architect, it’s a good idea to use any opportunity to get in close contact with your future users and the processes that they use. Do not rely only on what others tell you, but try to form your own impression!

    PS: Shortly after this failure, we were captured by the internet hype and forgot about CBT …

    Bernhard Mescheder,
    Software and Knowledge Engineer (retired from work but not from reflection)
