The Best QA is to Make Error States Impossible

The different levels of testing are well known: a bug found by the customer is extremely expensive, a bug found in acceptance testing somewhat less so. At the other end of the chain, unit tests are usually listed as the quickest and cheapest mechanism to catch quality problems early. I would argue that this hierarchy still lacks one critical line of defense: the best quality guarantee is to write your program in such a way that certain error states cannot exist at all.

A very trivial example of this is a function that does something with a geospatial point:

def do_something1(point):
    # point is a list that is expected to hold exactly two coordinates
    x = point[0]
    y = point[1]
    pass  # do something

def do_something2(x, y):
    pass  # do something

The first version of this function has more possible error states: What if the list is empty or if it only contains one element?

As always, this is not an exact science or a hard rule, but in general, I think you should prefer the version of the code that eliminates common failure patterns simply through the structure of the data.

The following examples are simple instances of this principle:

Avoid Duplicating Information

Any time the same information is stored in more than one location, there is a risk of the copies becoming inconsistent. This is particularly true for data that is mutable: a common failure pattern is to update one copy of the information and to forget the second one.

Sometimes you do want duplicated data for performance reasons. A cache is an example of such duplication.

In such a case it usually helps to define a clear main/replica relationship where only the single source of truth receives updates. Special care then needs to be taken that the replica is invalidated at the right moments.
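A minimal sketch of such a main/replica relationship, assuming a hypothetical UserStore class that is not part of any real library: the dictionary of users is the only place that receives writes, and the name index is a derived copy that is thrown away on every update and rebuilt lazily.

```python
class UserStore:
    """Hypothetical store; `_users` is the single source of truth."""

    def __init__(self):
        self._users = {}         # source of truth: user id -> name
        self._name_index = None  # derived replica: name -> set of user ids

    def set_name(self, user_id, name):
        # All writes go through the source of truth ...
        self._users[user_id] = name
        # ... and every write invalidates the derived copy.
        self._name_index = None

    def ids_for_name(self, name):
        # The replica is rebuilt on demand from the source of truth.
        if self._name_index is None:
            index = {}
            for user_id, user_name in self._users.items():
                index.setdefault(user_name, set()).add(user_id)
            self._name_index = index
        return self._name_index.get(name, set())
```

Because callers cannot write to the index directly, the state "index updated, users forgotten" cannot be reached through this interface. Invalidating on every write is the simplest policy, not necessarily the fastest one.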

A slightly different example of this is an API schema or protocol definition that needs to be maintained both in the service and in the client: These protocol definitions can go out of sync. Having one place with the definition of the interface that can be accessed by both client and server eliminates this error state.

Choose Appropriate Types

While early type systems were mostly about how a piece of data could be stored in memory, today they are more about making sure that only sensible programs can be represented.

A good type system finds an acceptable compromise between restrictiveness and expressiveness, ruling out enough problems to be worthwhile without getting in the way[…]
Tom Stuart, Understanding Computation: From Simple Machines to Impossible Programs

The type of a symbol in your program indicates the possible values this symbol can refer to. If a certain value would result in an illegal state, ideally it should be ruled out by the type.

The classic example is that in many popular languages, null is a legal value for object types. However, in many situations, an empty value is not a legal state. Therefore it would be better to rule out null and to use an explicit Option type if optionality is required.
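In Python, this idea can be approximated with type hints. The sketch below uses hypothetical function names (find_email, email_domain, domain_for) and assumes a static checker such as mypy is run over the code; Python itself does not enforce the annotations at runtime.

```python
from typing import Dict, Optional


def find_email(users: Dict[str, str], name: str) -> Optional[str]:
    # The return type states explicitly that the value may be missing.
    return users.get(name)


def email_domain(email: str) -> str:
    # `str` (not Optional[str]) tells the checker that None is not accepted.
    # partition returns an empty domain if there is no "@" in the address.
    return email.partition("@")[2]


def domain_for(users: Dict[str, str], name: str) -> Optional[str]:
    email = find_email(users, name)
    if email is None:
        # The caller is forced to handle the empty case before going on.
        return None
    return email_domain(email)
```

Without the `is None` check, a type checker is expected to reject the call to email_domain, so the missing-value case is caught before the program runs.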

A different example is a list parameter that cannot be empty. Instead of checking the list at runtime and failing, it would be better to design a NonEmptyList type that always has at least one element. Then the caller is forced to prove that the list is indeed not empty. A NonEmptyList is easy to define as a data structure consisting of the head element and a list containing the (possibly empty) tail.
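The head-plus-tail structure described above could look like this in Python. This is only a sketch: NonEmptyList, from_sequence and average are illustrative names, and the guarantee relies on the constructor requiring a head element.

```python
from dataclasses import dataclass
from typing import Generic, Iterator, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class NonEmptyList(Generic[T]):
    head: T                    # always present, so the list is never empty
    tail: Tuple[T, ...] = ()   # the (possibly empty) rest

    @staticmethod
    def from_sequence(items: Sequence[T]) -> "Optional[NonEmptyList[T]]":
        # The only place where emptiness has to be checked.
        if not items:
            return None
        return NonEmptyList(items[0], tuple(items[1:]))

    def __iter__(self) -> Iterator[T]:
        yield self.head
        yield from self.tail

    def __len__(self) -> int:
        return 1 + len(self.tail)


def average(values: NonEmptyList[float]) -> float:
    # No division-by-zero check needed: len(values) is at least 1.
    return sum(values) / len(values)
```

A caller holding an ordinary list has to go through from_sequence and deal with the None result, which is the "proof" that the list is not empty.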

Sometimes we deal with legacy systems that don’t give type-level guarantees in their interface. In these cases, we should treat the returned value like user input and validate the required property as soon as a value enters our system. If it does not satisfy the constraints we can fail early. If it does, we can wrap the value with an appropriate type.
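As a sketch of this validate-then-wrap approach, assume a legacy system hands us a latitude as an untyped value. Latitude and parse_latitude are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Latitude:
    degrees: float

    def __post_init__(self):
        # The range check also rejects NaN, because comparisons with NaN fail.
        if not -90.0 <= self.degrees <= 90.0:
            raise ValueError(f"latitude out of range: {self.degrees}")


def parse_latitude(raw: Any) -> Latitude:
    """Validate an untyped legacy value at the boundary of our system."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        # Fail early instead of passing a bad value further inside.
        raise ValueError(f"not a number: {raw!r}") from None
    return Latitude(value)
```

Code behind the boundary can then accept a Latitude and rely on the range having been checked once, instead of re-validating a bare float in every function.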

There are again situations where performance can be a valid consideration. If you wrap a numerical property with a class for additional type safety, some languages are not able to optimize away the additional boxing and unboxing overhead. This consideration only applies to critical paths of the application, though. Also, other languages support opaque types/value classes to get the required type safety without a performance hit.
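A rough Python analogue of an opaque type is typing.NewType: it introduces a distinct name for a static type checker, while the runtime value stays a plain int and no wrapper object is created. UserId, OrderId and load_order are hypothetical names, and the protection only exists if a type checker is actually run.

```python
from typing import NewType

# Two distinct types for the checker, both plain ints at runtime.
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)


def load_order(order_id: OrderId) -> str:
    return f"order-{order_id}"


user_id = UserId(7)
order_id = OrderId(7)

# A type checker is expected to flag load_order(user_id) as an error,
# even though both values are the same integer at runtime.
label = load_order(order_id)
```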

Use the Correct Cardinalities When Modeling Relationships

When modeling the relationship between two entities, whether in a class model, a database, or any other data model, you should be careful when determining how the two entities relate. Especially in the context of databases, I have found that it is sometimes tempting to model a relationship that really is 1-to-1 as a 1-to-n relationship, just to keep the model open or because no one quite wanted to think about the right structure to use at that stage. These can be valid reasons. Nevertheless, they again come at the cost of risking an invalid state that could have been avoided by design.
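For the database case, here is a small sketch using Python's built-in sqlite3 module and a made-up users/profiles schema. A UNIQUE constraint on the foreign key lets the database itself reject a second profile for the same user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE profiles (
        id INTEGER PRIMARY KEY,
        -- UNIQUE turns the 1-to-n foreign key into at most one per user
        user_id INTEGER NOT NULL UNIQUE REFERENCES users(id)
    )
    """
)
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO profiles (user_id) VALUES (1)")


def add_second_profile(connection):
    """Try to create the invalid state; report whether it succeeded."""
    try:
        connection.execute("INSERT INTO profiles (user_id) VALUES (1)")
    except sqlite3.IntegrityError:
        return False
    return True
```

Note that this only guarantees at most one profile per user, not exactly one. Without the UNIQUE constraint, the same check would have to be repeated in every piece of application code that writes to the table.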

Make Use of Enums

Enums are a simple way of constraining the values that a type can have to a limited set. The difference between a numeric error code and an enum, for example, is that you don’t run the risk of returning an error code that is not defined. If at any point you want to pass around strings that can hold only a limited number of possible values, an enum might again be the better choice, eliminating a whole series of invalid states.
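A short sketch with Python's enum module, using a hypothetical Status type in place of free-form status strings:

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"
    TIMEOUT = "timeout"


def is_retryable(status: Status) -> bool:
    # Only the three members above can ever arrive here.
    return status is Status.TIMEOUT


def parse_status(raw: str) -> Status:
    # Looking up by value raises ValueError for anything undefined,
    # so a misspelled string is rejected where it enters the program.
    return Status(raw)
```

A typo such as "time_out" fails once in parse_status instead of silently flowing through every comparison that expects one of the known strings.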

Tooling Considerations

One problem is that many of us are stuck with programming languages that don’t allow some of these strategies, at least not elegantly, because their type systems lack the necessary features. Many teams are stuck on Java 8 or older, for example. My recommendation here is to look at version updates first: sometimes even a very small update already gives access to several features that make the type system more expressive.

It is not always necessary to migrate to a completely different programming environment like Scala. Don’t get me wrong, I think Scala is great. However, drastic changes like that come at a high cost relative to their value. If all goes well, updating the Java version should not break your code, giving you access to a gradual refactoring path.

If you are using JavaScript, I think that a gradual introduction of TypeScript can work. However, this is a mission that needs to be carefully planned and the overall effort should not be underestimated.

Conclusion

All of the ideas presented here are widely accepted best practices. There’s nothing new or controversial about using appropriate types. Still, I think it is a worthwhile exercise to consider why these strategies work. Instead of blindly applying a pattern, consider the ways in which your program could end up in an undesirable state and whether you can structurally eliminate these states with the tools at your disposal.