Simplify! – Part 9
With the previous post, we finished discussing the complexity drivers and mitigation options “above” the technology level. Starting with this post, we will discuss the technology level and “below”.
In this post, I will start with the technology evolution of recent years and its implications. In the following posts, I will complement the discussion with some typical misconceptions regarding OSS (open source software), architecture narcissism and quality theater on the development level.
Changing markets and digital transformation
Most efficient markets have become post-industrial. Post-industrial markets create a lot of dynamics and uncertainty, which means that companies living in such a market need to:
- Go fast
- Gather feedback fast
- Learn and adapt fast
- Offer available and reliable products and services (basic customer expectation)
In other words: They need to adapt fast to ever-changing needs without compromising quality (as perceived by their customers).
Digital transformation means that IT becomes a more and more indispensable part of the business, a vital ingredient and enabler of all services and products a company offers. The interface to the market and the customers as well as the products and services themselves are supported or even completely powered by IT.
The IT departments need to respond to these drivers in multiple ways. One response is implementing a post-industrial working mode that enables them to move fast without compromising quality.
The technology evolution cycle
Another response to the aforementioned drivers was a wave of new concepts, tools and technologies that make it easier
- to go fast
- to adapt faster
- to use IT as an integral part of the customer interaction.
E.g., the rise of cloud and cloud native, of container technologies, microservices, SPAs (single-page applications) and the whole continuous delivery movement are basically responses to the need to become faster, to adapt faster and to address customer interaction in better ways.
At the same time, all bigger technology evolution cycles in IT tend to follow a repeating schema:
- A new demand arises that the current technology cannot respond to
- New technologies addressing the new demand emerge, often still bleeding edge, offering little or no abstractions
- Companies start picking them up, the technologies become fashionable
- A Cambrian explosion of technologies occurs, all addressing the same demand
- Consolidation kicks in, predominant abstractions emerge, a few technologies prevail
- The remaining technologies and their abstractions mature, becoming easier to use
- The technologies turn into standard products or services, over time developing very powerful and easy to use abstractions
- The technologies become part of default IT infrastructure, eventually becoming invisible behind higher-level abstractions
Meanwhile, the next cycle has usually already started.
If you are into Wardley maps 1:
- Phase 1 is a novel user need
- Phases 2 and 3 roughly correspond to “genesis”
- Phases 4 and 5 roughly correspond to “custom built”
- Phases 6 and 7 roughly correspond to “product”
- Phase 8 roughly corresponds to “commodity”
Nevertheless, be aware that the two models cannot be mapped exactly onto each other.
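The rough correspondence above can be written down as a small lookup table. The following Python sketch is only an illustration: the names `PHASES`, `WARDLEY_STAGE` and `wardley_stage` are made up for this example, the phase labels are shortened paraphrases of the list above, and the mapping is exactly as approximate as the text says.

```python
from typing import Optional

# The eight phases of the technology evolution cycle, in order.
# The labels are shortened paraphrases of the list in the text.
PHASES = {
    1: "new demand arises",
    2: "new technologies emerge",
    3: "technologies become fashionable",
    4: "Cambrian explosion",
    5: "consolidation",
    6: "maturation",
    7: "standard products or services",
    8: "part of default IT infrastructure",
}

# Rough correspondence to Wardley map evolution stages, as given in the text.
# Phase 1 is a novel user need rather than an evolution stage, so it maps to None.
WARDLEY_STAGE = {
    1: None,
    2: "genesis",
    3: "genesis",
    4: "custom built",
    5: "custom built",
    6: "product",
    7: "product",
    8: "commodity",
}


def wardley_stage(phase: int) -> Optional[str]:
    """Return the approximate Wardley stage for a cycle phase (1-8)."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    return WARDLEY_STAGE[phase]


if __name__ == "__main__":
    for number, label in PHASES.items():
        print(number, label, "->", wardley_stage(number))
```

Treat the table as a reading aid, not as a definition of either model.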
Stuck before the consolidation
Regarding the concepts, tools and technologies that help cope with the needs of post-industrial markets and digital transformation, we are currently somewhere between phases 4 and 5. The last years were clearly dominated by the Cambrian explosion, and we are now starting to see some consolidation. Still, we are quite far from phases 7 and 8 for those technologies.
E.g., at the moment, Kubernetes is the predominant container scheduler technology. In its beginnings, Kubernetes had lots of rough edges and was not particularly user-friendly, which gave rise to a whole ecosystem of tools and add-ons that cover its different use cases and make it easier to use.
This results in a lot of intellectual load for operations people who want to use Kubernetes. Either they need to understand all the intricacies and pitfalls that Kubernetes comes with at the low abstraction level it offers, or they need to understand all the other tools from the ecosystem that fill the gaps by offering higher-level abstractions that are easier to handle.
Developers also need to understand a lot of the concepts of Kubernetes and the tooling to be able to develop their services right. You cannot just say, here is a piece of software wrapped in a container and that’s it.
You need to describe, in the language of Kubernetes or of one of the ecosystem tools used, the resources it needs, the dependencies it has, additional assumptions it makes (e.g., about the availability of other pieces of software it uses), expectations regarding the roll-out of new versions of the software, and more. This also means a lot of additional intellectual load for the developers whose software is intended to run on Kubernetes.
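To give an impression of what this description looks like, here is a minimal sketch of a Kubernetes Deployment manifest, built as a plain Python dict and printed as JSON (which `kubectl apply -f` accepts alongside YAML). The field names follow the `apps/v1` Deployment API. Everything else is an assumption made up for illustration: the function name `deployment_manifest`, the service name, image, port, health check path, environment variable and all numeric values.

```python
import json


def deployment_manifest(name: str, image: str) -> dict:
    """Build a Deployment manifest for a single-container service (illustrative values)."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": 8080}],  # hypothetical port
        # Resources the software needs
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
        # A dependency on another piece of software (hypothetical database URL)
        "env": [{"name": "ORDERS_DB_URL", "value": "postgres://orders-db:5432/orders"}],
        # When the scheduler may consider the service ready for traffic
        "readinessProbe": {
            "httpGet": {"path": "/healthz", "port": 8080},
            "initialDelaySeconds": 5,
            "periodSeconds": 10,
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 3,
            "selector": {"matchLabels": labels},
            # Expectations regarding the roll-out of new versions
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    manifest = deployment_manifest("orders", "registry.example.com/orders:1.0.0")
    print(json.dumps(manifest, indent=2))
```

Even this small example presupposes knowledge of label selectors, probes, resource units and roll-out strategies, and a real service typically needs more objects on top of it.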
Yet, in the end, Kubernetes is nothing but a scheduler that manages underlying resources for containerized applications. So, basically it does the same job as a regular OS scheduler, just for containerized applications. I still remember times in the 1980s and 1990s when you needed to understand quite some details about the different OS schedulers, their intricacies, how to configure them, and more to run an application on a given system as desired.
Around 2000, this knowledge became obsolete as OS schedulers became so mature and their abstractions so simple, not revealing any intricate details and configuration options anymore, that you simply started your application on any machine – and it just worked. The OS schedulers eventually entered phases 7 and 8.
We will see the same evolution with Kubernetes in the coming years. Services such as AWS Fargate (while not perfect at the moment) show the direction: You use a higher-level abstraction to use the scheduler. You do not need to know the details of the scheduler anymore. You do not need to describe all the aforementioned details anymore. The scheduler takes care of the details.
Eventually, we will build and run containerized applications the way we build and run normal applications on top of a regular OS scheduler today. We will just build and test our containerized piece of software, we will drop the container into some repository and the scheduler will pull it as needed, without any additional configuration required.
But we are not there yet. We are still stuck between phases 4 and 5 for most concepts, tools and technologies supporting the needs of post-industrial markets and digital transformation. This still means a lot of additional complexity resulting in a high intellectual load.
We still suffer from the tool explosion as well as from the still relatively low-level abstractions requiring us to understand a lot of the internal concepts of the tools and technologies we use.
FOMO as a reinforcing driver
The Cambrian explosion led to a wave of technology FOMO (fear of missing out), i.e., the fear of not keeping up with the technological evolution. We already discussed the negative effects of such a setting at the people level in the second post of this series.
On the technology level, this led to an explosion of tools being used, often without understanding the trade-offs of the new tools or whether they created any additional value at all. In the worst case, existing business models were merely rebuilt in a less reliable way using new tools, without creating any additional value.
E.g., we have seen that quite often in the area of NoSQL databases. NoSQL solutions fill the spots where RDBMS and file systems – the traditional storage options – have their weaknesses: They provide a higher data structure flexibility than RDBMS while still providing acceptable random access patterns 2, or they provide a better read or write scalability while still providing acceptable consistency guarantees, and so on.
Unfortunately, NoSQL databases became very fashionable a few years ago and every new project insisted on using a NoSQL solution instead of an RDBMS for data storage, typically without really understanding the consequences. This of course resulted in a lot of accidental complexity due to database properties not matching the actual project requirements.
Or the projects went for “polyglot persistence” and employed multiple database solutions at once. I have seen several projects that wanted to use Mongo, Cassandra and Neo4j at the same time in a single project. Later they realized that they also needed an RDBMS and added that, too. Yet, in most situations, an RDBMS would have been the obvious and simplest fit for all their data storage and access requirements. Again: accidental complexity increased by at least an order of magnitude without creating any additional value.
This FOMO response pattern is a problem. If you remember the beginning of this series, many IT departments already are completely overstrained with running and maintaining their existing IT landscape, even lacking the capacity to step out of their daily treadmill, not to mention the capacity needed to introduce anything new.
If you then, instead of mitigating the situation, pile up more and more new shiny tools and technologies at a dizzying rate driven by FOMO, you will not solve anything, but just pile up more complexity, making things worse than before.
This does not mean that you should not introduce any new technology. But it should actually add value, i.e., the added complexity needs to be outweighed by a significant simplification and increased business value on the other side.
Blindly copying concepts
Another amplifier is the tendency to blindly copy concepts from the hyperscalers (e.g., Amazon, Netflix, Google, etc.). I already discussed that effect briefly in the context of accidental complexity at the company level.
The typical reasoning is sort of: “The hyperscalers use <X>. The hyperscalers are successful. If we use <X> we will be as successful as the hyperscalers.” (plus of course the “coolness” factor of those solutions).
As a result, many of the hyperscaler solutions were copied blindly, without really understanding them and without adapting them to the copier's own context. My preferred example for that pattern is microservices:
Some of the hyperscalers ran into problems that were completely unique, i.e., nobody else had these problems, and they figured out that microservices could help them address their problems. They also understood that microservices come at a high price and that they needed to do a lot of additional work to not end up in chaos with them. But it was worth the effort for them as the service-based architecture style helped them address their unique problems.
As this architectural style was successful for the hyperscalers, traditional enterprises became eager to pick it up due to the reasoning described before. And after someone started to feed their industrial mindset by telling them that microservices amortize due to reusability, there was no holding back anymore.
Unfortunately, they neglected the big price tag attached to microservices that the hyperscalers were willing to pay to solve their unique problems. But regular enterprises are simply incapable of paying the price as they lack all prerequisites to do so. I will probably discuss this topic in more depth in a future post.
To be clear: This is not a case against microservices. Used in the right context, they do a terrific job. But used for the wrong reasons, they just cause a lot of accidental complexity without creating any additional value.
Increased solution complexity
Drivers like misconceptions regarding OSS or architecture narcissism also reinforce the solution complexity explosion. As I will come back to these drivers in the next posts of this series, I skip them here.
As a consequence, we often end up with traditional business models being rebuilt with new technology without creating additional value. We see lots of DIY (do it yourself) approaches trying to mimic the solutions of the hyperscalers. We see settled business models built on bleeding edge technology. We see a multiplication of concepts, tools and technologies due to hype or FOMO, many of them addressing the same types of problems, sometimes not even being a good fit for the problem at hand, just increasing accidental complexity.
Going back to the little framework from the 4th post of this series, we identified 4 core sources of accidental complexity in IT:
- Unnecessary requirements not adding any value to the solution
- Overly complex platforms regarding the given problem
- Lack of adequate platform support regarding the given problem
- Overly complex software regarding the given problem
The developments discussed in this post (and the next ones) relate to the last three sources: We create overly complex platforms using the new stuff. By using the wrong tools due to hype or FOMO, we sometimes create platforms that do not support the solution adequately. By adding too many new concepts and technologies to our solutions, we create overly complex software.
Additionally, we create variants of sources 2 and 4: By piling up new platforms and solutions, each based on a different set of concepts, tools and technologies, we create a lot of accidental complexity not in a single place but across the IT landscape, needing to understand and support many different solution approaches for the same type of task. We will dive deeper into that type of problem in a later post of this series.
Improving the situation
Again, this leads us to the question: How can we do better?
As so often, I do not think that there is a single simple solution (at least I do not know it). From what I see it is a mixture of several approaches.
It starts with understanding the technology evolution cycles. We do not only experience a single cycle at any given point in time. Usually, at any point in time we experience a good number of overlapping cycles 3. It is crucial to examine the drivers for the cycles and to understand where you currently are. A good litmus test is checking the abstraction level:
- How easy is it for an application developer to pick up the concept/tool/technology?
- How easy is it for an operations administrator to pick up the concept/tool/technology?
If the answer to these two simple questions is “not easy”, then chances are that the concept/tool/technology is not yet really mature. If you understand that you are dealing with a solution at a low maturity level, you should use it only cautiously: it is hard to use, it increases complexity a lot and very likely a better abstraction will emerge within a short time, making the given concept/tool/technology obsolete – just leaving a pile of accidental complexity in your IT landscape.
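The litmus test can be sketched as a tiny helper, purely for illustration. The names `LitmusAnswers` and `maturity_hint` are invented for this example, and the middle case (easy for one group, not easy for the other) is an assumption of the sketch, as the text only covers the case where the answer is “not easy”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LitmusAnswers:
    """Answers to the two litmus test questions for one concept/tool/technology."""
    easy_for_developers: bool
    easy_for_operations: bool


def maturity_hint(answers: LitmusAnswers) -> str:
    """Turn the two answers into a rough hint, not a verdict."""
    not_easy = [answers.easy_for_developers, answers.easy_for_operations].count(False)
    if not_easy == 0:
        return "no warning sign from this test"
    if not_easy == 1:
        # Assumption of this sketch: a mixed result calls for a closer look.
        return "mixed: look closer before adopting"
    return "probably not mature yet: use cautiously"


if __name__ == "__main__":
    print(maturity_hint(LitmusAnswers(easy_for_developers=False, easy_for_operations=False)))
```

The point is not the code but the habit: answer both questions honestly, from both perspectives, before adopting anything.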
If you find yourself in an early technology evolution cycle phase, apply “More haste, less speed”:
Resist the urge to pick up everything immediately. Watch the evolution. Look for the emergence of better abstractions and standards.
Less is more
It is also important to note that more options, i.e., a lower abstraction level, are not necessarily a boon, but often a bane. More options mean more complexity, more things everyone involved needs to understand, increased intellectual load, more places to get things wrong. Unfortunately, humans tend to have a complexity bias: In case of doubt, we tend to prefer complex solutions over simple ones.
As a result, we also tend to prefer lower abstractions over higher abstractions because they appear “more powerful”. Especially in software development, we are almost obsessed with options. We are so afraid of not having all the options because we “could need them one day” that we voluntarily pile up tons of complexity that we usually never need – pure accidental complexity of the worst kind.
In the end, a line of reasoning holds that is counter-intuitive for many people, particularly software developers:
Simpler abstractions are usually more powerful.
If we have two choices both solving the same task, we should always strive for the one offering the higher abstraction level. If no higher level abstraction exists for the solution technology we have chosen, we should check if it is an option to still go with a different, maybe older and more mature technology that provides better abstractions.
This does not mean that you should generally avoid novel technologies. It just means that you should not jump on the bandwagon because others do. The advantages of the new technology (besides having it on the CV) should clearly outweigh its disadvantages and added complexity before you use it. As always in architecture, it is about trade-offs contemplated from multiple points of view, not about hype. Well, at least that is how it should be …
Of course, there are more reasoning techniques that can help you to mitigate accidental complexity, such as:
- Ask “Why” to create focus before acting as this helps to detect activities that just create complexity without adding value
- Think holistically to expose complexity that would remain undetected if you just looked at the solution from a single stakeholder’s perspective
- Apply Occam’s razor to find the simplest solution possible that solves the task
- Use common sense – well, whatever that is …
As these (and some more) general techniques do not only help in the context of technology complexity, I will discuss them in more detail at the end of this blog series.
It is also important to note that you need to balance the recommendations in this post with a counterforce, which is to detect the right point in time to replace existing technologies. This does not only affect technologies that reached EOL (end of life). This also affects complex custom-built solutions that should be replaced with simpler and more mature product or commodity solutions which have emerged since building the original solution.
This means that sometimes it makes sense to replace a technology without creating immediate additional business value because it reduces complexity a lot and thus improves maneuverability of IT – a significant value that unfolds over time. I will discuss this topic in more detail in a later post.
In this post, we have discussed that a good part of the technology explosion we have observed in recent years is due to a new technology evolution cycle (or more precisely: a lot of smaller, overlapping cycles) and that we have not yet arrived in the later consolidation phases of this cycle. Additionally, FOMO and the tendency to blindly copy concepts work as reinforcing drivers.
There is not a simple way to mitigate the complexity problem on the technology level. Understanding the evolution cycle and not falling for the typical FOMO pattern is a first step. It is also important to aim for simpler, higher level abstractions, always having the implications of the planned decisions for the whole IT landscape and the different stakeholder groups in mind.
There would be a lot more to write about this topic, but as this post is way too long already, I will leave it here.
In the next post of this series, I will start to discuss accidental complexity on the architecture level and how to tackle it. But as this requires a discussion of OSS as a prerequisite, its rise, its benefits, some typical misconceptions and its role today, I will insert the OSS discussion first. Stay tuned …
Wardley maps are named after their inventor Simon Wardley. They are a tool for strategic decision making. After being sort of an “insider tip” for quite a while, they started to become widely popular around 2019. A good introduction written by Simon Wardley himself can be found here. ↩︎
This is often called “schema-less”. But the only halfway efficient data access pattern on a truly schema-less database would be full text search. So, usually people mean relaxed schema constraints when they talk about “schema-less”, e.g., that different entries can have varying attributes, or the like. ↩︎
You might argue that I wrote before that we are currently between phases 4 and 5 regarding our current technology explosion. This was a deliberately simplified statement. While the statement is basically true from a high-level perspective, a closer look reveals that this picture is composed of a multitude of more fine-grained cycles. ↩︎