Bayesian Software Architectures: An Exercise of Predicting the Future

I have talked about how software architecture design is a way of constraining the solution space for a project. Here is another angle on this: Any attempt to design a software architecture is an exercise in predicting the future.

The usual way to make predictions is to extrapolate data from the past. This is based on the assumption that the future behaves somewhat like the past as it is the same underlying system that governs how events play out. The Bayesian take on this is to say that we have a belief about a system and that every bit of data we pick up can be used to update that belief. Let us take a look at the different sources of data we have at our disposal for architecture decisions and what role they play.

What do I mean when I say architecture is an attempt to predict the future? Consider one aspect of architecture: We usually want to find useful abstractions for the logic that governs our application. Building abstractions can be seen as a way of extrapolating what is currently known about a system into the future. Abstractions make the complexity of a system manageable. However, choosing an abstraction to support use cases that do not materialize leads to suboptimal results.

If we knew the set of all use cases for a given system beforehand, it would be a lot easier to select the appropriate abstractions: Knowing all points in the problem space that we want to support would provide a straightforward map to the solution space. Unfortunately, often we do not have that luxury.

Choosing an abstraction is thus inherently an exercise in predicting the future use cases of the system we are working on. We make some future use cases easier to reason about, sacrificing others.

A frustration often felt by engineers is that good predictions are hard to come by. When Product Management is asked if PayPal integration will ever be a use case or if the solution will ever need to run on-premise, I suspect that the most common answer is “Maybe! We can’t rule it out”.

To a degree, this is understandable. Predicting the future is hard after all! Unfortunately, anyone needing to make calls on architecture or technology decisions does need to work on the basis of some prediction anyway. This prediction may include a lot of uncertainty and it may not be accurate. Nevertheless, every small decision we make about how to structure the program is a small bet on a possible future. The engineering team cannot wiggle out of this need to predict.

Product Management saying “we don’t really know” just places the burden of making those calls on the engineering team.

So let us talk about how we, as engineers, can deal with this uncertainty. How can we make predictions in spite of it? I would argue that Bayesian thinking gives us a way of making sound judgments.

In Bayesian thinking, probability is interpreted as a degree of belief. How much do we believe that multi-cloud support is going to be a use case?

The goal is to place bets, but the bets need to be reasonable. The ideal is neither to keep all options open nor to go all-in on one specific hypothesis.

Without any more data, we will probably have a first intuition about the answer. In Bayesian statistics, this is called the prior. It is our initial belief about the probability of multi-cloud support ever being a thing. Likely, this value will be somewhere in the middle of the range between 0 and 1, maybe with a bias towards “no” if we take the YAGNI principle as our starting point.

Having this belief we can start updating this prediction based on common data points:

  • What is the profile of the principal customer that we are creating the tool for? Are they conservative? Early adopters? What do we know about their needs?
  • What have we learned from our industry experience so far? How common is the use case we are evaluating?
  • Furthermore, organizations have a surprising amount of inertia. How have similar questions played out in the past at the same organization? Do projects tend to go one way or the other?
  • What about the political landscape of the organization? Are there currents that favor a certain strategic direction and that might have a larger say in the future?
  • If you have access to any customer feedback, this is a valuable source of data. Has the organization lost any deals because of a missing feature?
  • Similarly, competitor behavior is a source of data. While the common wisdom is not to get distracted by the competition, in practice a competitor having a feature makes it more likely that you will need to build it too.
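One way to make this updating concrete is the odds form of Bayes' rule: convert the prior to odds, multiply by a likelihood ratio for each observation, and convert back to a probability. The following is only a minimal sketch. The function name `update_belief`, the prior of 30%, and all likelihood ratios are hypothetical values picked for illustration, and multiplying the ratios assumes that the observations are independent of each other, which is rarely exactly true in practice.

```python
def update_belief(prior: float, likelihood_ratios: list[float]) -> float:
    """Return the posterior probability after applying each likelihood ratio.

    A ratio above 1 means the observation is more likely in a world where
    the use case materializes; a ratio below 1 means the opposite.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)      # probability -> odds
    for ratio in likelihood_ratios:
        odds *= ratio                 # one Bayesian update per observation
    return odds / (1.0 + odds)        # odds -> probability


# Hypothetical numbers, for illustration only.
prior = 0.30  # gut feeling, already leaning towards "no" (YAGNI)
evidence = {
    "principal customer is conservative": 0.5,
    "a competitor already offers the feature": 2.0,
    "no deals lost over the missing feature": 0.6,
}

posterior = update_belief(prior, list(evidence.values()))
print(f"belief in multi-cloud support: {posterior:.2f}")
```

With these made-up inputs the belief drops from 0.30 to roughly 0.20. The exact figures matter less than the habit of writing down which observation moved the estimate in which direction.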

Going through the available data points, you should end up with an estimate that represents your belief in whether or not multi-cloud support is going to be a thing. Let us say you estimate the probability at around 20%. How do you use this number to make technical decisions?

This is where we place our bets. How bad would it be to err on either side?

Not ruling out multi-cloud support comes with a significant price tag because we will need to build structures for this use case and we cannot rely on native functionality provided by any one cloud vendor. How much extra work does this entail? What is the opportunity cost of spending our time on these features rather than something else?

What about not foreseeing multi-cloud support and needing it? Would you have to redevelop the entire application? How long would that take? Would this cause you to lose any business deals?

Multiplying the probability of being wrong by the price tag gives us the expected regret associated with each decision. One thing to note, however, is that your decision will probably end up influencing the odds: If you do not already support multiple clouds, the cost of starting to support them might become so prohibitive that people will find other options.
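The comparison can be written down in a few lines. This is a rough sketch under stated assumptions: the function names and the cost figures (in person-months) are hypothetical, both costs are treated as known point values, and the feedback effect of the decision on the odds is ignored.

```python
def expected_regret(p_needed: float,
                    cost_built_but_unneeded: float,
                    cost_skipped_but_needed: float) -> dict[str, float]:
    """Expected regret of each decision: P(being wrong) times its price tag."""
    return {
        # We build the abstraction, but the use case never materializes.
        "build": (1.0 - p_needed) * cost_built_but_unneeded,
        # We skip the abstraction, and the use case shows up after all.
        "skip": p_needed * cost_skipped_but_needed,
    }


def break_even_probability(cost_built_but_unneeded: float,
                           cost_skipped_but_needed: float) -> float:
    """Belief at which both decisions carry the same expected regret."""
    return cost_built_but_unneeded / (
        cost_built_but_unneeded + cost_skipped_but_needed
    )


# Hypothetical costs in person-months, for illustration only.
regrets = expected_regret(p_needed=0.20,
                          cost_built_but_unneeded=3.0,
                          cost_skipped_but_needed=10.0)
decision = min(regrets, key=regrets.get)
threshold = break_even_probability(3.0, 10.0)

print(regrets, decision, round(threshold, 2))
```

With these numbers, skipping multi-cloud support has the lower expected regret, but only narrowly: the break-even belief is about 23%, so a modest shift in the estimate would flip the decision. When the margin is that thin, it is usually a sign that gathering more data is worth more than debating the decision.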

Certainly, not all of the factors going into this estimation are easy to determine. Furthermore, your margin of error might be enormous. However, as we have seen, every time you write a line of code you place a small bet on the future. We simply cannot escape the need to make a prediction. I would argue that to predict this future, past experience is as good a starting point as any.
