Simplify! – Part 4
In the previous post, I discussed the difference between essential and accidental complexity. While we cannot avoid the essential complexity of a given problem, accidental complexity makes things unnecessarily harder without adding any value to the solution.
I ended the post with the default view that essential complexity is everything that comes from the problem side (a.k.a. the business domain). Whenever the solution complexity exceeds the problem complexity, that excess is accidental complexity added in the solution domain (a.k.a. in IT).
Yet, I added that if we take a closer look, we can see that it is not that easy. Accidental complexity comes in different flavors and can be found in sometimes unexpected places. This is what we will examine in this post.
Accidental problem complexity
Let us start on the problem side. If we take the problem domain, we can split it up into two parts:
- The problem domain as seen by the users – This is what Moseley and Marks defined as essential complexity. We can usually describe it as a set of functional and non-functional requirements.
- The problem domain as seen by the company that creates the solution – This is what we find in the requirements document and is not part of the users' point of view. Usually, you will find quite an overlap between the user’s point of view and what you find in the requirements document. Yet, the business departments often come up with additional requirements that do not reflect the user’s point of view for many different reasons. These requirements are sometimes divergent from the requirements of the users 1. The company is also concerned about the different legal, commercial, temporal and other business-related constraints. This is something the users typically do not care about.
With this division of the problem domain we can revisit the problem complexity part:
As Moseley and Marks wrote, the requirements coming from the users are essential complexity. Assuming that you live in a competitive market and thus are interested in satisfied users, implementing these requirements is a must.
But there is more essential complexity. The company building the solution usually needs to adhere to some mandatory constraints, e.g., legal constraints. Typically, the users do not care much about those constraints and therefore they will not be part of their requirements. But they still have to be considered and thus add to the essential complexity.
And then we have everything that additionally is part of the requirements document, but is not covered by the aforementioned two parts of essential complexity. As written in footnote #1, there are lots of reasons – some more sensible than others – why these requirements were added to the requirements document.
For the sake of simplicity we assume that everything that satisfies or even delights users is already covered by the users' point of view. We all know that users often have a hard time describing what they actually want before they see it, but that is not the point here.
Here I want to differentiate between requirements that actually address user needs and requirements that do not have the users (or mandatory constraints) in mind. None of the latter add value to the solution – the users do not want them, neither explicitly nor implicitly – and thus they add accidental complexity.
The relevant point is that we also find accidental complexity on the problem side, not only on the solution side. In other words: It is not only the IT departments piling up complexity without creating value. The business departments also often come up with lots of requirements that do not create any value, i.e., with accidental complexity that makes the solutions harder to understand, change, maintain and operate.
Especially if companies live in post-industrial markets but still act as if they were living in an industrial market, i.e., have long budgeting and project approval cycles and still focus more on internal optimization than on external customer satisfaction, we see lots of this accidental complexity caused on the problem side.
Accidental solution complexity
Let us move on to the solution side. We know that accidental complexity often is added on the solution side. Yet, to successfully tackle it, we need to better understand the different types of accidental complexity on the solution side. To start with, let us split up the solution domain into its parts:
- The platform – Everything the software implemented is built upon. This includes low-level infrastructure like compute, storage and network, databases and messaging platforms, but also higher level infrastructure like application servers, container schedulers, service meshes up to libraries and frameworks as well as supporting standard software solutions or SaaS offerings. It also comprises the development, test and deployment infrastructure, starting with the programming language, IDE and source code repository and containing everything that adds to CI/CD (or their manual equivalents).
- The software – Everything that needs to be coded and configured. Even if you buy an off-the-shelf solution, you usually need to configure it quite extensively to suit your needs up to writing a lot of customization code. If you cannot solve the problem by buying an off-the-shelf solution, you need to implement your own software on top of the provided infrastructure. This is what developers usually are concerned about if they discuss software quality in terms of maintainability, evolvability, modifiability and extensibility.
- IT-related constraints – The same way you have constraints on the problem side, you also have constraints on the solution side. It can be available skills that constrain a solution. It can be outsourcing of parts of the solution development. It can be technological restrictions based on the capabilities of the operations team. It can even be arbitrary restrictions made by an “opinionated” enterprise architect – just to make clear that also on the solution side constraints can be arbitrary.
With this division of the solution domain we can revisit the solution complexity part:
The solution complexity can be decomposed into a platform-related and a software-related part. The platform-related part of the solution complexity is defined by the platform as described above, the tools and technologies used. The software-related part is defined by the software itself, the concepts and implementation used.
The constraints influence the solution complexity indirectly by influencing how the platform is set up or how the software is built and organized. Note that if we find accidental complexity in the platform or the software, we will quite often find some flawed constraint that originally caused it. Thus, even if we limit our immediate observation of complexity to the platform and the software, we must not neglect the IT-related constraints as they often are the actual root cause of unnecessary complexity.
With this distinction we can revisit our original simple image of essential and accidental complexity. To set a baseline, let us first look at an ideal solution without any accidental complexity:
The solution complexity matches the problem complexity. For the sake of simplicity, we assume that the problem definition does not contain any accidental complexity but only reflects the essential problem complexity.
Additionally, the platform support is optimal with respect to the given problem. “Optimal” in that context means that the platform does not add any unnecessary complexity while at the same time supporting the software in a way that its complexity is minimal with respect to the given problem. As we will see in a moment, non-optimal platform support can add a lot of (unneeded) complexity to the solution.
This is the ideal solution without any accidental complexity on the solution side. With that image in mind, we can distinguish three types of accidental complexity.
Overly complex platform
The first source of accidental complexity on the solution side is an overly complex platform:
This is a relatively obvious source of accidental complexity. The platform is (a lot) more complex than the problem required. As a result the solution complexity exceeds the problem complexity, i.e., we added accidental complexity.
We have seen this type of accidental complexity due to overly complex platforms many times in recent years when everything was implemented using microservices, SPAs, distributed data stores and more (including all the additional platform complexity those approaches require) even if the problem could have been solved perfectly using a much simpler approach. I will come back to this in more detail in one of the next posts of this blog series.
Note that even if I assumed in the image that software complexity remains minimal, this is not necessarily the case. Actually, more often than not an overly complex platform also increases the required software complexity. E.g., if you choose an implementation based on microservices instead of a non-distributed approach, you need to deal with all the intricacies of distributed systems which makes the software a lot more complex. 2
Inadequate platform support
Interestingly, not only overly complex platforms, but also inadequate platform support adds to accidental complexity.
If you fail to appropriately support the solution software on the platform side, the software needs to compensate for the lack of support, i.e., it needs to implement the missing parts or find another way around the deficiencies of the platform. Just two examples to illustrate this situation:
I had a client a few years ago where the core development team refused to use any frameworks or libraries besides the default JavaSE/EE stack provided by a plain Tomcat server. The reasons why they made that decision are not relevant in this context. The point is that they had to implement a ton of things on their own that you usually get in production quality with a little Maven snippet and a few lines of code.
Eventually, this led to a software complexity that made them almost immobile: It took them extremely long to implement any change request. They had a huge backlog of pending feature requests from their business department. It was extremely hard, if not impossible, for them to support new technology evolutions like, e.g., mobile devices or machine learning in their solution. This was because they had to build all the missing platform parts in their software – lots of accidental complexity.
Another client insisted on using Cassandra as their sole storage system for a bigger initiative. Again, the reasons for that decision are irrelevant here. Unfortunately, the initiative had some sub-projects that had very strict consistency requirements. Other sub-projects needed to access the data using arbitrary ad hoc access patterns that could not be defined upfront. Both are requirements that a relational database can serve easily but which are extremely hard to implement if you only have Cassandra.
As a result, the software development teams had to implement solutions that were orders of magnitude more complex than they would have been if they had had access to a relational database.
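To make this kind of compensation code a bit more tangible, here is a minimal sketch. It is not taken from the client project. It assumes a hypothetical `orders` data set, uses a plain Python class (`KeyOnlyStore`, a made-up name) as a toy stand-in for a store that can only be read by a predefined key (it does not use or mimic Cassandra's actual API), and uses SQLite from the standard library as a stand-in for a relational database.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (order_id, customer, country, total)
ORDERS = [
    (1, "alice", "DE", 120),
    (2, "bob", "FR", 80),
    (3, "alice", "DE", 40),
    (4, "carol", "DE", 200),
]


class KeyOnlyStore:
    """Toy stand-in for a store that can only be read by a predefined key.

    Any other access path has to be built and kept in sync by the
    application code itself.
    """

    def __init__(self):
        self._by_id = {}
        # Compensation code: a hand-maintained secondary index.
        self._ids_by_country = defaultdict(set)

    def put(self, order):
        order_id, _customer, country, _total = order
        old = self._by_id.get(order_id)
        if old is not None:
            # Keep the index consistent when an order is overwritten.
            self._ids_by_country[old[2]].discard(order_id)
        self._by_id[order_id] = order
        self._ids_by_country[country].add(order_id)

    def total_by_country(self, country):
        return sum(self._by_id[i][3] for i in self._ids_by_country[country])


def relational_total_by_country(country):
    """The same question, asked ad hoc against a relational database."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE orders "
            "(id INTEGER PRIMARY KEY, customer TEXT, country TEXT, total INTEGER)"
        )
        con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", ORDERS)
        row = con.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE country = ?",
            (country,),
        ).fetchone()
        return row[0]
    finally:
        con.close()
```

In the first variant, every new access pattern means another hand-maintained index plus the code that keeps it consistent. In the second variant, a new access pattern is just another query.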
An important point here is that from a company’s perspective additional software complexity often is harder to cope with than additional platform complexity. Added platform complexity means higher operations efforts, while added software complexity means higher efforts along the whole IT value chain (including operations).
Additionally, the biggest bottleneck of companies often is their IT department: how many requests it can serve in a given period of time and how long it takes until the requests are implemented and live. If the same development teams need to support a lot of code that just compensates for a lack of platform support, they become an even bigger bottleneck: a big part of their capacity is bound to maintaining the platform compensation code.
This does not mean that we should blindly put everything in the platform. If we try this, we easily end up with overly complex platforms which are as bad. The key point is that for any solution there is a sweet spot regarding the amount of platform support, and this amount depends on the given problem. If we miss this sweet spot by adding too much or too little platform support, we add accidental complexity to the solution. 3
Overly complex software
The third and last source of accidental complexity on the solution side that I want to discuss is overly complex software:
This is also a common source of complexity. The software is (a lot) more complex than required by the problem. As a result the solution complexity exceeds the problem complexity, i.e., we added accidental complexity.
There are many sources for this type of complexity. One source is “generic solutions”. In that scenario, software design leads usually fail to understand the problem domain properly and then try to compensate for it by providing lots of degrees of freedom in the solution – “just in case”, not knowing if they will ever be needed. The solution becomes a lot more complex than required to address the given problem, harder to understand, to maintain, to extend, to test, to deploy and usually also to operate.
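A small, deliberately exaggerated sketch of such a “generic solution” follows. It assumes a hypothetical requirement ("orders of 100 or more get a 10 % discount"), and all names in it (`discounted_total`, `Rule`, `RuleEngine`) are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


# Direct solution: does exactly what the (hypothetical) requirement asks for.
def discounted_total(total: float) -> float:
    """Orders of 100 or more get a 10 % discount."""
    return total * 0.9 if total >= 100 else total


# "Generic" solution: degrees of freedom nobody asked for.
@dataclass
class Rule:
    condition: Callable[[dict], bool]
    action: Callable[[dict], dict]
    priority: int = 0


@dataclass
class RuleEngine:
    rules: list[Rule] = field(default_factory=list)

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def apply(self, context: dict) -> dict:
        # Rules run in priority order; each may rewrite the whole context.
        for rule in sorted(self.rules, key=lambda r: r.priority, reverse=True):
            if rule.condition(context):
                context = rule.action(context)
        return context


engine = RuleEngine()
engine.register(
    Rule(
        condition=lambda ctx: ctx["total"] >= 100,
        action=lambda ctx: {**ctx, "total": ctx["total"] * 0.9},
    )
)
```

Both variants compute the same result for the one rule that actually exists. The second one additionally forces every reader, tester and maintainer to reason about rule priorities, rule interactions and an untyped context, none of which the problem required.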
Another typical source is that a software design lead wants to use a new (often fashionable) paradigm, be it for the sake of curiosity, be it for having it on the CV. In recent years, we have seen that in a lot of places, e.g., with NoSQL, microservices, CQRS, event-driven, reactive and many more. While they all have value when used in the right places, i.e., for matching types of problems, they add unneeded complexity in other places.
There are also more subtle sources of unneeded software complexity. E.g., I sometimes have discussions regarding test coverage of exploration prototypes. The whole idea of exploration prototypes is to validate a series of ideas with the least effort possible and throw away the prototype afterwards. The value of the prototype lies in the learnings it facilitated, not in the code.
Thus, extensively testing those prototypes usually does not add any value. Actually, by reducing the number of ideas you can validate per unit of time, it even destroys value.
Yet, developers coming from a background of building and maintaining core enterprise software often insist on always testing software as if it were mission-critical production software. They simply do not realize that they just left their default context and thus the quality requirements are different. Situations like these also often lead to accidental software complexity.
These were just a few examples. There are a lot more sources of accidental complexity in software. E.g., basically everything we would call “technical debt” 4 adds to accidental complexity as it makes software harder to understand and thus harder to maintain, to extend, to test, to deploy and to operate. I will come back to this in more detail in one of the next posts of this blog series.
In the previous post, we discussed the difference between essential and accidental complexity. While we cannot avoid the essential complexity of a given problem (at least if we want to solve it properly), accidental complexity makes things unnecessarily harder without adding any value to the solution. Therefore, we need to tackle it relentlessly if we do not want to fall off the cliff regarding the ever-growing complexity in IT.
By taking a closer look at accidental complexity in this post, we have discovered four different types of accidental complexity:
- Due to unnecessary requirements not adding any value to the solution
- Due to overly complex platforms regarding the given problem
- Due to a lack of adequate platform support regarding the given problem
- Due to overly complex software regarding the given problem
Those four types of accidental complexity provide us with a little framework when trying to fight increasing complexity. They help us to identify the places where to look for unnecessary complexity. Consequently, I will use them in the next posts of this series to discuss some ideas on how to simplify IT.
But before discussing mitigation ideas, it makes sense to understand better where the current excessive complexity comes from. What are the drivers?
Typically, it was not a few isolated decisions that led to an overly complex IT landscape, but an evolution over a long time. Additionally, as in many complex systems (and IT is one), all people involved often act with their best intentions and still end up in a place where nobody ever wanted to be 5.
Thus, to better tackle the accidental complexity we currently face, we should first understand how it emerged. Starting with the next post, we will look at the evolution and remediation options from different perspectives. So, stay tuned …
Discussing these reasons would be at least a whole blog post on its own. The reasons range from the creators of the requirements hoping to enthuse the users with a surprising feature to them being solely focused on internal company (or department) needs and therefore neglecting the users' needs – plus several more aspects. ↩︎
If you do not deal with the intricacies of distributed systems in your microservices implementation as we can see in quite some hype follower projects, you will pay the price in production. And do not fall for the illusion that a service mesh will solve them all for you. It does not. I will write about this in detail in some future posts. ↩︎
This does not mean that standard platform stacks do not have any value. Sometimes it also makes sense to use a well-known and tested platform even if it does not perfectly match the needs of a given problem. The upsides of having a proven platform then outweigh the downsides of not having an optimal match regarding the problem. The problem is that companies often try to get away with a single platform for each and every type of problem. This “one-size-fits-all” mindset (not only) regarding platforms usually leads to poor results, i.e., you end up with lots of accidental complexity that outweighs the advantages of having a proven platform stack by orders of magnitude. ↩︎
As written in footnote #1 of my post about legacy systems, I know that there are quite some discussions if the term “technical debt” is a good metaphor or if we rather should look for a more appropriate term. Still, the discussion has not yet settled and as far as I can see a better term is not imminent. Thus, I still stick with the well-known term “technical debt”, even if the metaphor has some issues. ↩︎
I will discuss system theory and some of the surprising effects of systems in some later posts. If you do not want to wait for my digest, you can find an excellent introduction in “Thinking in Systems” by Donella Meadows. ↩︎