Simplify! - Part 13

Layers of systems never cleaned up (part 1)

September 18, 2020 Uwe Friedrichsen

11 minute read

In the previous post, I discussed several drivers of accidental complexity on the implementation level and what we can do about it.

As announced there, I would like to start discussing a final source of accidental complexity in this post: layers of technology that build up over time in an IT system landscape and never get cleaned up.

Based on my observations there are several drivers that support this source of accidental complexity. Here I will discuss the most popular and influential ones that I have observed.

As discussing drivers and mitigation options in a single post would make it far too long, I once more decided to split the topic into two parts: In this post, I discuss the drivers. The next post discusses mitigation options.

Layers over layers of systems

The situation is the same in almost every company that runs its own IT: The IT system landscape consists of lots of different architectures, tools and technologies that grew over time, that do not fit together, that do not integrate well, and that all have their own learning curves and intricacies.

The reasons why this system chaos emerged in the first place vary. I will discuss some of them in the upcoming sections. But no matter what the actual reasons were, the effects are always the same:

The IT departments drown in the complexity caused by the abundance of concepts, tools and technologies that accumulated over time.
Most of their capacity is tied up in maintaining and running the plethora of existing systems.
Capacities for thinking forward and acting from a more holistic, strategic stance are very scarce (or non-existent).
The diversity of the concepts, tools & technologies that are in use does not create a competitive advantage, rather a disadvantage, i.e., most of it is accidental complexity.
So many concepts, tools & technologies are in use that the knowledge of the people involved is stretched very thin.
Change response times are bad.
Errors accumulate.
Frustration is high.
…

The aforementioned effects usually lead to reactions from other departments that work as reinforcing drivers, leading to even more of this complexity.

Overall, this is a place where no company wants to be, but almost every company is, and it tends to get a bit worse every day. So, what are the drivers?

Missing responsiveness of IT

A first potential driver is unmet responsiveness expectations. The business departments have a need that requires a new IT system or a change to an existing one. The implementation time the IT department offers is too long for the business departments. As a consequence, they push for a quicker solution.

Sometimes they force their IT to add a solution that breaks the existing stack. Sometimes they bypass IT completely, engaging an external solution provider that usually creates a solution totally incompatible with the existing solution landscape. Or something similar.

The result is an increase of IT landscape complexity.

Fear of decommissioning old systems

Often nobody dares to switch an old system off. Everybody knows that the system has reached EOL (end of life), but nobody really knows what would happen if the system were shut off. The people who built the system and really knew it usually left the company a long time ago. The people who maintain the system now usually only have a vague idea of the internals of the system and often do not dare to touch it – up to a point where required changes are only implemented in a facade layer wrapped around the system, while the system itself is left untouched.

As a result, the system is left running, even if everybody knows that this makes the situation worse. They know that just building new systems without decommissioning old systems leads to an ever-increasing IT landscape complexity, and that more and more systems based on very different concepts, tools & technologies need to be run and maintained. But what if they missed a relevant detail when decommissioning the old system? As they cannot answer that question with firm conviction, most people prefer not to shut the old system off.

As a result, the complexity of the IT landscape only grows, but never shrinks.

Missing technology evolution

When discussing accidental complexity on the tool & technology level, I recommended not to pick up every new trend or hype, but to make sure that it really creates additional value with respect to your actual needs before picking it up. While this is important, it is also important not to miss the point in time when you should pick up an evolution.

This creates a counterforce to the one mentioned before, and this counterforce is often neglected:

Only pick up new trends if they create additional value for you, but also do not wait past the time when your existing solutions become a liability.

We need to understand that living in constant technology evolution cycles also means that the boundaries between essential and accidental complexity constantly shift. What was essential technological complexity yesterday can have become accidental complexity today.

A novel, complex technology that was needed yesterday to create a competitive advantage can by now be superseded by a new technology offering better abstractions and lower complexity. In terms of a Wardley map, this is technology moving “rightwards”:

Starting at the far left in “genesis”, a new technology emerges. It offers novel possibilities but comes with poor abstractions and lots of rough edges.
You pick the new technology up and wrap it in mediocre abstractions you created yourself in the “custom built” phase.
At a certain point in time it does not make any sense anymore to maintain the custom-built solution yourself as ready-to-use products with better abstractions are available. You move to “product” – or at least that is what you should do.
Eventually the technology will become “commodity”. You just use it, often without noticing its existence anymore, being abstracted away behind a higher-level abstraction.

Companies often pick up technologies in the “genesis” or “custom built” phase, building their own solutions around them to create a competitive advantage – which is reasonable. But often they have a hard time spotting when to move from “custom built” to “product” (or even “commodity”), i.e., when the competitive advantage of the custom-built solution is gone.

Sticking to the “custom built” solutions any longer costs time, money and capacity that in turn are missing for new, more valuable business opportunities ¹. In short:

If you miss the point in time to switch from “custom built” to “product” or “commodity”, a former competitive advantage turns into a disadvantage.

From what I have observed, there are several drivers for this:

Habit – The companies and people affected have become accustomed to the custom-built solution, including all its edges and pitfalls. People know how to use it and how to navigate around its edges. Therefore they hesitate to move to a different solution.
The sunk cost fallacy – People tend to cling to solutions they invested a lot of time and money into. As a consequence, companies hesitate to retire custom-built solutions because they cost a lot of time and money in the past. The problem is that, due to the fallacy, any economic disadvantage of sticking with the solution is typically ignored and more sensible alternatives are not taken into account. Prospective (future) costs of the existing and alternative (product or commodity) solutions are not calculated and compared. Lost opportunity costs are not taken into account either, i.e., the costs of not being able to respond to other evolutions because it would be too complex and expensive to integrate them into the custom-built solution.
Wrong value evaluation – When assessing whether it makes sense to replace an existing custom-built solution with a product solution, only the new business value the new solution would generate is evaluated. The increased maintenance and operation costs of the existing solution are neglected, as are the costs of personnel capacity that is tied up and not available for other opportunities (a variant of lost opportunity costs).
The OSS movement – When it is time to make the transition from an OSS-based, custom-built solution to a purchased product solution, developers in particular often have a hard time letting go of the OSS solution. I already discussed OSS and some of the misconceptions associated with it in a little blog series.
The too-complex-system trap – Often systems grew so big over time and are so tightly coupled internally as well as with other systems that it becomes nearly impossible to replace them within reasonable time, cost and risk boundaries.
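The comparison that the sunk cost fallacy and the wrong value evaluation skip can be sketched in a few lines. This is only a minimal sketch, not a costing method taken from this post: the class, the function names and all figures are hypothetical. The point it illustrates is that past investment never enters the calculation, while prospective run costs, one-time migration costs and lost opportunity costs do.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One way forward. All figures are hypothetical illustration values."""
    name: str
    one_time_cost: float            # e.g. migration effort; 0 when keeping what exists
    yearly_run_cost: float          # maintenance and operations per year
    yearly_opportunity_cost: float  # value lost per year because capacity is tied up


def prospective_cost(option: Option, years: int) -> float:
    """Total future cost over the horizon. Sunk costs are deliberately absent."""
    yearly = option.yearly_run_cost + option.yearly_opportunity_cost
    return option.one_time_cost + years * yearly


def cheaper_option(a: Option, b: Option, years: int) -> Option:
    """Return the option with the lower prospective cost (a wins on a tie)."""
    return min((a, b), key=lambda o: prospective_cost(o, years))


# Made-up numbers: the custom-built solution is expensive to run,
# the product requires a one-time migration but is cheaper afterwards.
keep_custom = Option("custom built", 0.0, 400_000.0, 250_000.0)
buy_product = Option("product", 900_000.0, 150_000.0, 50_000.0)

for horizon in (1, 3):
    print(horizon, cheaper_option(keep_custom, buy_product, horizon).name)
```

With these made-up numbers, keeping the custom-built solution looks cheaper over a one-year horizon, while the product is cheaper over three years. What was spent on the custom-built solution in the past plays no role in either result.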

There are probably more reasons for missing when to move rightwards on the Wardley map. But no matter what the reason is, if you fail to replace solution approaches from early technology evolution phases that have become outdated in the meantime, you will retain a lot of accidental complexity that does not create any value anymore.

The result is an increased complexity of the IT landscape without any benefit.

The no-time-to-clean-up fallacy

While it is important to manage the complexity of the IT landscape, it usually is not urgent.

In theory, important tasks also get done (see the “Eisenhower method”). Yet, in practice, urgent continuously ousts important, i.e., a constant inflow of urgent tasks stifles the execution of important tasks because the urgent tasks absorb all available capacity all the time.

As a consequence, the needed complexity reduction tasks are always postponed because “we do not have time for that now”. This way, more and more complexity accumulates, which makes you slower, resulting in even less time for accomplishing any tasks. This creates another self-reinforcing loop.
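The self-reinforcing loop can be made tangible with a toy simulation. This is a sketch under loudly stated assumptions, not a validated model: the function name, the rates and the formula for how complexity slows work down are all invented for illustration. Urgent work is always served first, cleanup only gets leftover capacity, and an unserved urgent backlog is not modeled.

```python
from __future__ import annotations


def simulate(periods: int, capacity: float, urgent_inflow: float,
             complexity_per_task: float = 0.02,
             cleanup_effect: float = 0.05) -> list[float]:
    """Return the complexity level after each period (toy model, assumed rates)."""
    complexity = 0.0
    history = []
    for _ in range(periods):
        # Assumption: accumulated complexity slows all work down.
        effective = capacity / (1.0 + complexity)
        # Urgent always ousts important: urgent work is served first.
        urgent_done = min(urgent_inflow, effective)
        # Cleanup (important, not urgent) only gets what is left over.
        cleanup_done = effective - urgent_done
        # Assumption: work done under time pressure adds some complexity,
        # and cleanup removes some of it again.
        complexity += complexity_per_task * urgent_done
        complexity = max(0.0, complexity - cleanup_effect * cleanup_done)
        history.append(complexity)
    return history


overloaded = simulate(periods=5, capacity=10.0, urgent_inflow=10.0)
with_slack = simulate(periods=5, capacity=10.0, urgent_inflow=6.0)
```

In this toy model, complexity rises in every period of the overloaded run, and effective capacity shrinks with it. In the run with slack, the leftover capacity is enough to keep complexity at zero.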

As a result, the accidental complexity of the IT landscape continually grows.

The big clean-up initiative

Sometimes important becomes urgent, meaning all the complexity has accumulated to such a degree that it can no longer be ignored or postponed. This is the hour of the “once and for all” clean-up initiatives: Let us replace the whole IT chaos with a single coherent (and currently fashionable) approach! Very often this happens in conjunction with the arrival of a new CIO, starting their new job with a pithy action.

Of course, this costs a lot of money, but since somebody said “reusability”, quick amortization is guaranteed – at least on paper.

And then you start …

Phew, that is a lot harder and more expensive than expected. How unpleasant! Let us maybe first build some new system to become more familiar with the new technology, something we have needed for quite a while, but have not gotten around to building yet. (New ways of customer interaction or other digitization topics are very popular in that context.)

Yikes, we need to exchange data with the old systems! Well, no problem, we will add a decoupling layer (introducing a shiny new API management solution), so that the immaculate new world does not get tainted by the ugliness of the old world.

And so on.

It takes longer than originally planned, it is more difficult and expensive than expected, the reuse does not work as expected, and nothing of the original system landscape has yet been replaced.

At some point, the initiative runs out of steam – often because the CIO has vanished to the next, more prestigious position. Some people speculate that the CIO was reluctant to be confronted with the results of their initiative and left it to their successor, but we do not know for sure. So, maybe that is just malicious gossip.

What is left is yet another complexity layer piled up on the existing system landscape lasagna: it does not fit in well, contains a lot of gaps and rough edges, does not integrate with anything else, but for sure adds all sorts of new concepts, tools and technologies that need to be run and maintained for the next decades.

And then it’s time for the next big, new initiative to put everything in order once and for all, often just after the new CIO has arrived …

The key problem of such initiatives is that they are unrealistic nonsense. It is not going to happen. In order to lift thousands of person-years onto a new technology, you usually need – exactly! – thousands of person-years, no matter what migration approach you choose.

All the old knowledge encoded in the systems must be exposed, must be brought into line with today’s requirements, must be developed anew, must be adapted to the special features of the new technology, and much more. This usually takes as long as the original development. Stating that observation in a different form:

Principle of technology migration effort sameness

It takes the same effort to migrate an existing system to a new technology as it took to develop the system up to its current state.

Therefore, we need to accept that we will not get rid of the existing systems (sorry if I have just killed a long-cherished dream of many people). Sometimes a system is lifted onto a new platform and technology when the risk becomes unbearable and there are no other alternatives. But that is it. The rest will stay as it is.

If you neglect this, you will just increase accidental complexity massively with each new initiative.
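The principle of technology migration effort sameness turns into simple arithmetic. The following sketch is only an illustration: the function name and the example figures are hypothetical. It also assumes that effort divides evenly across the people involved, which tends to be optimistic.

```python
def migration_years(original_effort_person_years: float, team_size: int) -> float:
    """Calendar years needed to migrate, assuming migration effort equals
    the original development effort (the principle stated in this post)
    and that the effort divides evenly across the team (an optimistic assumption)."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return original_effort_person_years / team_size


# Hypothetical landscape: 2,000 person-years of accumulated development,
# 100 people assigned full-time to the migration.
print(migration_years(2000, 100))  # 20.0
```

Under these assumptions, 100 people would be busy for 20 years, during which they would not be available for anything else.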

Summing up

In this post, we have discussed several relevant drivers that lead to more and more layers of concepts, tools and technologies, building up over time, never getting cleaned up, leading to an ever-increasing complexity of the IT system landscape. There are certainly more drivers, but based on my experience the ones discussed here are the most influential.

In the next post, I will discuss options to mitigate the problems and ways to fight this ever-increasing complexity. Stay tuned …

¹ Note that this consideration does not only apply to whole applications but also to application parts that were created as custom-built solutions. E.g., this could also mean clinging to a custom-built framework as the basis for all self-built applications while an industry-standard framework has meanwhile become mainstream.
