Architecture Annealing

The term “Software Architecture” can evoke the impression that it describes a blueprint that needs to be completed before actual development work can start. This association is natural given the origin of the term: the architecture of a building had better be complete before construction begins.

Unfortunately, this vision is at odds with what we have learned about software development over the last decades. I would like to propose the analogy of an annealing process as a more appropriate guide.

Emergent Architecture and Prescriptive Design

Software Architecture is a fuzzy concept, and there is a large spectrum of definitions of what it is. Here is the one that I am going to use in this article:

The Software Architecture of a system is the set of structures needed to reason about the system.

Software Architecture in Practice, Bass et al.

This is consistent with Martin Fowler’s definition that describes Software Architecture as the “shared understanding that the expert developers have of the system design”.

These definitions indicate that every system necessarily has an architecture. It may not be good and it might not be documented, but the structure is there. This leads directly to the following observation: The structures that make up the architecture of a system can be the result of natural emergence or the product of conscious, prescriptive design.

The typical tools and techniques for software architecture are mostly concerned with the prescriptive design of systems. However, I think it is important to acknowledge that emergent structures play a significant role too. Emergent architecture can be the result of fast iterations, experiments, and people gradually coming to a shared understanding of the workings of a system.

Where prescriptive structures, by definition, impose limits on what a team can do, emergent structures are the result of an engineering team trying to solve a problem in a way that works. The system structure is a byproduct.

Software Architecture Design Controls the Rigidity of the System

So then what is the place for prescriptive architecture design in an agile environment in which we want autonomy for the engineering teams, quick iterations, and a fast reaction to changes? Would it be better to let all structures emerge naturally?

To answer this, I would like to propose a mental model to think about prescriptive architecture designs: As their goal is to describe structures that the system needs to adhere to, they, by definition, constrain the solution space in which the team can navigate. We could say that the amount of architecture design that takes place determines the rigidity of the system.

A rigid system has upsides and downsides depending on the phase of the project you are in: In an early phase, placing too many constraints on a system can prematurely rule out interesting parts of the solution space. Furthermore, fundamental changes to the goal of a project based on early feedback can quickly invalidate any carefully planned architecture, leading to technical debt resulting from premature abstraction. Lastly, putting up rigid boundaries too early can delay the initial value delivery, putting the entire project at risk.

However, in a phase where there are multiple or very large teams working on a system, too little rigidity will cause friction, communication overhead, and misalignments. Rigid structures also increase the uniformity of the system which is helpful when the code base has reached a certain size and the project needs to be maintained, no matter the size of the team.

An Annealing Process for Agile Projects

In materials science, annealing describes the process of increasing the ductility of a material by heating it up and then letting it cool down slowly. At the atomic level, as long as heat is applied to the system, atoms can move relatively freely. When the material cools down, they start settling into place.

I think that this is an analogy that we should apply to architecture in agile software projects: In the early phases, it is desirable to keep the heat up so that a team can explore the solution space. The architecture of the system is extremely malleable in this phase. While it is hot, the project only supports a handful of engineers; with more, people will just get in each other’s way.

However, as indicated by the sketch above, with the project progressing, you will need more structure in place to enable greater scale and to reduce communication overhead. This is the moment to capture interfaces and component interactions more formally. Which parts of the system you want to freeze first needs to be decided in conjunction with the team topologies.
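As a minimal sketch of what capturing an interface more formally could look like, the Python example below freezes the contract between two components while leaving the implementation behind it free to change. All names (Order, PaymentGateway, InMemoryGateway, checkout) are hypothetical and chosen purely for illustration; they do not come from any particular system.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Order:
    """Hypothetical data type shared across the frozen boundary."""
    order_id: str
    amount_cents: int


@runtime_checkable
class PaymentGateway(Protocol):
    """The frozen part: other teams may rely on this signature."""

    def charge(self, order: Order) -> bool:
        ...


class InMemoryGateway:
    """One possible implementation; its internals are still malleable."""

    def __init__(self) -> None:
        self.charged = []  # ids of the orders charged so far

    def charge(self, order: Order) -> bool:
        if order.amount_cents <= 0:
            return False
        self.charged.append(order.order_id)
        return True


def checkout(gateway: PaymentGateway, order: Order) -> str:
    """Code owned by another team; it depends only on the contract."""
    return "paid" if gateway.charge(order) else "rejected"
```

In this sketch, only PaymentGateway and Order are rigid. A team can replace InMemoryGateway without coordinating with the callers of checkout, which is the kind of reduced communication overhead that freezing an interface is meant to buy.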

In the final phase of the project lifecycle, when it enters maintenance mode, the system will have become very rigid, which fits with the desire to keep it running with limited manpower.

Some decisions are fundamental and hard to reverse; they need to be taken fairly early, based on the information available at the time. Others can wait a bit longer. For instance, breaking a monolith into services later, once the natural structure emerges, is likely less painful than building a microservice architecture that does not align with the natural structure of the problem domain.

Finding out which parts need to be frozen first and which decisions can be deferred is, of course, the hard part. I think that Evan Bottcher’s idea of strong and weak forces can provide a good framework for prioritizing such decisions.

Note that the approach does not work if your project starts with 30 people working on the same problem from day one. This is an instance of an early overstaffing situation, which DeMarco and Lister describe as follows in Peopleware:

When staff is brought on too quickly at the beginning of a project, there is almost always a waste of people’s time. Again, you might think this is an easy sin to avoid: Just figure out how quickly the work can absorb new people; then bring them on, only at that rate. Although this makes perfectly good sense, it is often politically infeasible.

Peopleware, Chapter 32

The problem with early overstaffing is that the architecture needs to be frozen very early on, leaving no time for the material to find its natural shape.

Conclusion

The metaphor used here is a bit similar to the idea of Agile Architecture in the SAFe framework which works with the idea of an architecture runway. I prefer the metaphor of increasing rigidity because it starts to provide an intuition about where prescriptive architecture design is to be prioritized, making the concept more actionable. Of course, the hard part remains to figure out what this means in practice: How many contributors does my current system support? Which interfaces do I need to fix first? The real world is, as always, too messy to be fully captured by a simple analogy.
