Throwing things away
This post complements the “Simplify!” blog series in a way, as it also points towards simplifying the IT landscape. Still, since it is based on a different train of thought, I decided to make it an independent post.
“Removing code does not create value”
The problem starts with an essential fallacy: the belief that removing code does not create business value.
Whenever decisions are made about which tasks the IT development budget will be spent on, the question is: “Does it create business value?” 1
For bigger projects you often see some business case calculations; for smaller tasks the decision is often based on a prioritization made by the responsible product manager 2. All the calculations and prioritizations take (or at least should take) the expected value of the task into account: How big is the expected return on investment? The higher it is, the more likely the project or task gets implemented.
Up to that point everything is fine. It is a lot more reasonable to invest your money where you expect high returns than to invest it where you expect little or no return.
It becomes problematic if you only take the immediate value of an activity into account: How much more money do I expect to make if I implement the project or task?
I am not talking about the effects of uncertainty, i.e., that you cannot reliably predict the value of a task in a dynamic market environment. That is another problem of this approach, but not the one I want to discuss here.
Here I want to discuss that this approach is too short-sighted and inevitably leads to problems over time. This way of thinking will never allow clean-ups and removing functionality from an existing code base.
Based on the given thought model, removing code cannot create value because you do not create anything new that could generate additional value. Following this reasoning, cleaning up never creates value; it sometimes even destroys value, if you remove code that still creates some revenue, no matter how small.
Secondary complexity costs
This train of thought completely ignores secondary complexity costs.
Let me illustrate this with two little examples from the world outside of IT.
Assume you are a logistics company. You own a truck fleet that transports goods. Applying the train of thought from above to this scenario would mean you never send your trucks to maintenance because it does not create value. If a truck breaks down, you try to get away with a duct-tape fix, as properly repairing the truck does not create any value. If a truck is broken for good, you just leave it in your garage, because disposing of it does not create any value. You just buy new trucks all the time, cluttering your garage with more and more trucks.
You say this behavior does not make any sense if you want to stay in business for the long run? Totally agree. But this is how we typically decide about our IT landscape.
Another example: Assume you are a grocery store. Applying the train of thought from above to this scenario would mean you never clean up your shelves because it does not create value. You do not remove expired or spoiled food. You just keep refilling the shelves. You also do not remove products from your portfolio because it does not create value. You just add more and more products. You do not clean your store because it does not create value. You just put more stuff on the shelves and fill up the aisles until eventually the whole store is completely crammed with food.
You say only a complete idiot would do that? I agree again. But this is how we typically act with respect to our IT landscapes.
Of course, the given examples do not exactly describe the situation in IT. At a certain point they break down, as all analogies do. But they nicely illustrate the point that maintenance and cleaning things up have a value and that neglecting them is a bad idea. This is as true for IT – and software in particular – as it is for the physical world, even if software, as opposed to the physical world, is invisible and thus a lot harder to grasp.
The intricacies of software
Actually, it is even more complicated with software (that is where the analogies of the examples completely break). All the functionalities woven into software have dependencies, usually a lot more than you would like to have.
To (mis-)use the grocery store example one last time: It is as if all the food items in the store were connected with threads, and whenever you try to move an item or add a new one, you need to make sure that this does not create a mess somewhere else. To make things worse, the number of threads grows a lot faster than the number of items; it grows even if you keep the number of items constant. Eventually, all the existing items and their threads make it practically impossible to add any new item.
I will not take the analogy any further. I just wanted to illustrate the concept of dependencies at least a tiny bit. In software development we use and reuse other parts of our software all the time. This is okay. It is part of what makes software so powerful. But if we do not maintain our software and groom the dependencies they will just grow all the time until our software becomes a tightly entangled, extremely complex knot of dependencies that nobody understands anymore.
This costs more and more money with every feature we additionally try to weave into that knot. It feels a bit like a big administrative body where you are only allowed to add new roles and communication paths, but must not remove or reorganize the old ones. The overall mission of the administration changes all the time, and you must make sure that all changes can be executed properly by the organization. But again, you are only allowed to add, not to remove or restructure.
Does that sound like a total nightmare and you never want to be in charge of implementing new functions for this administration? Welcome to software development where only new features have value!
The value of maintenance and cleaning up
As wonky as the analogies are, they should have helped to illustrate the problem of a software development culture in which only new functionalities are considered valuable. As described before, the situation is even more complicated with software than in the physical world, and the effects of neglecting maintenance and cleaning up are even more drastic.
Thus, we need to change our evaluation of “value”. If maintenance and cleaning up have a value – and we have seen that they have – we need to take it into account when we “calculate” the value of a project or task. The existing evaluation schema does not help. We need a schema that also takes the secondary complexity costs into account.
I do not want to derive a complex calculation schema here as it would distract from the core message of this post (maybe I will do it in a later post). 3
Let us just briefly look at the negative consequences of lacking maintenance and clean-up. Not maintaining and cleaning up a software-based solution has two basic consequences:
- The software only grows and never shrinks
- The internal condition of the software continuously deteriorates
Growing software means more and more functionalities that are usually highly intertwined. While in theory you could design solutions that are only as intertwined as the problem domain requires, i.e., that do not introduce any dependencies on the solution side that do not exist on the problem side, in practice this does not happen. In a setting where only implementing new features is considered valuable, careful design is considered wasteful and thus will be suppressed.
If everything were connected to everything, the number of dependencies would grow quadratically with the size of the software. In practice, not everything is connected to everything, but the number of dependencies still tends to grow quadratically with the size of the software.
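The quadratic growth is easy to see in a small sketch (a toy illustration, not a measurement from any real code base): with n items, the number of possible pairwise dependencies is n(n-1)/2.

```python
def max_dependencies(n_items: int) -> int:
    """Maximum number of pairwise dependencies between n_items items."""
    return n_items * (n_items - 1) // 2

# Growing the item count by 10x grows the potential dependencies by ~100x.
for n in (10, 100, 1000):
    print(n, max_dependencies(n))
# 10 -> 45, 100 -> 4950, 1000 -> 499500
```

Even if only a fixed fraction of these potential dependencies is realized, the count still grows quadratically with the number of items.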
A dependency means that you cannot add or change functionality without taking other things into account. The more dependencies you have, the more things you need to take into account if you want to add or change functionality. This makes it increasingly harder to design a solution that behaves as it is intended to do and does not accidentally break existing parts of the solution.
Additionally, the human mind can only ponder a limited number of things at the same time, and this number is quite low. Based on my experience, I would say that on average we are able to ponder about 20-30 weakly connected things or about a dozen highly connected things – of course with the help of some graphical visualizations and the like, not just in the mind. If we need to ponder more things, we need a strictly hierarchical decomposition schema to not lose the overview.
This means that the probability that we get things wrong, i.e., that we create unwanted behavior in our solution, grows exponentially beyond a certain complexity in terms of items and connections between them – and as I have written before, the number of items and connections at which we start to lose the overview is surprisingly low. 4
Of course, over time we have created a big toolbox in software engineering to reduce dependencies and organize functionalities in a hierarchical way. And if applied correctly, the tools in the box can help a lot. Yet, beyond a certain size all those tools do not help anymore. The sheer number of items overpowers our minds.
Additionally, a lot of the dependencies are rooted in the problem domain, i.e., the functional requirements demand dependencies between the different functionalities. These functional dependencies rooted in the problem domain are often really complex, and I have seen many situations where the people posing the requirements were not aware of the complexity they called for.
We have no way to get rid of these dependencies and the resulting complexity. A solution cannot be simpler than the problem it needs to solve (in terms of the requirements posed). We can avoid accidental complexity on the solution side, but we cannot get rid of the complexity of the requirements unless we change the requirements. 5
Overall, we can state that the effort to add or change things grows roughly quadratically with the size of the solution. Also, the probability of getting things right decreases roughly exponentially with growing size and complexity – especially if a lot of the complexity is rooted in the problem domain.
A lack of maintenance additionally means that the internal condition of the solution continuously deteriorates. As a consequence, you can no longer rely on anything you want to use working as expected. Instead, you will often be surprised by unexpected behavior, which adds to the effort and reduces the probability of getting things right.
Again, this leads to a disproportional rise of efforts and reduced correctness probability. Picture the deterioration as a series of defects that creep into your solution – some old code that does not fit the new code anymore, a little inaccuracy that might not work as expected in some edge cases, some quick hacks to shorten the implementation time for the new feature, and so on.
Each of those defects is like a feature: You need to take care of it when you try to add or change a functionality. This way, a growing number of defects has a similar effect as a growing number of functionalities and dependencies in your solution: The effort to add or change things grows roughly quadratically with the degree of deterioration (the number of defects), and the probability of getting things right decreases almost exponentially with it.
Reassessing the value of a task
If we take into account the disproportionate rise of effort and reduction of correctness probability that results from a rising number of functionalities and a rising degree of deterioration, we end up with a different evaluation schema for the value of a task. Roughly speaking, we need to add two terms to our existing value evaluation schema:
- A penalty term that at least quadratically takes the existing and resulting size of the solution into account
- A penalty term that at least quadratically takes the degree of deterioration of the solution into account
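To make the idea of the two penalty terms tangible, here is a purely illustrative sketch. All names, coefficients, and the linear-minus-quadratic shape are my assumptions for the example, not a calculation schema proposed in this post:

```python
def adjusted_task_value(
    immediate_value: float,   # expected immediate business value of the task
    solution_size: float,     # size of the solution after the task is done
    deterioration: float,     # degree of internal deterioration (defect count)
    alpha: float = 0.01,      # hypothetical weight of the size penalty
    beta: float = 0.05,       # hypothetical weight of the deterioration penalty
) -> float:
    """Immediate value minus quadratic penalties for size and deterioration."""
    return immediate_value - alpha * solution_size**2 - beta * deterioration**2

# A feature worth 100 added to an already large, slightly deteriorated
# solution ends up with negative adjusted value:
print(adjusted_task_value(100.0, solution_size=120.0, deterioration=10.0))
# -> 100 - 0.01*14400 - 0.05*100 = -49.0
```

With such a schema, the same feature is worth less the bigger and more deteriorated the solution already is, while a clean-up task that shrinks `solution_size` or `deterioration` gains value accordingly.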
As written before: I do not want to derive a complex calculation schema here. But the basic idea is clear:
Continuously letting a solution grow reduces its value.
Not maintaining a solution properly reduces its value.
As the penalty terms grow disproportionately, beyond a certain functional complexity or degree of deterioration the value of adding or changing functionality becomes negative. At the same time, the value of maintenance and clean-up tasks becomes rather big.
Ideally, this leads to an effective balance between adding features, changing features, removing features (that turned out not to create the expected business value) and keeping the design clean enough. The key word in the previous sentence is “balance”: It is not about dogmatic minimalism and perfectly clean code. It is about finding the right balance.
Adding new promising features to a solution has a value, and keeping the solution manageable also has a value. It is about finding the sweet spot and not drifting into an extreme. The one extreme, which we experience quite often these days, is as useless as the other. The actual value lies between them.
Putting it in practice
In practice you usually will not spend the effort to define an actual value evaluation schema. Instead, if you are aware of the problems of growing size and lack of maintenance you rather define a budget for clean-up and maintenance. From my perspective, this is a perfectly fine approach and quite simple to implement.
The challenge is to find the right ratio between investment budget (for implementing new features) and maintenance budget (for maintaining and cleaning up) to end up close enough to the sweet spot.
Alternatively, you can implement some solution state metrics (tools exist that support you with this and can be integrated into your CI/CD pipelines). Based on the metric values, you dynamically reallocate the budgets. This somewhat resembles Google’s SRE approach, more precisely the concept of error budgets. Again, the challenge then is to find the right metrics and thresholds.
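A minimal sketch of such a dynamic reallocation might look as follows. The metric, thresholds, and shares are all made up for illustration; any real setup would derive them from your own tooling and experience:

```python
def maintenance_share(complexity_score: float,
                      threshold: float = 0.7,
                      base_share: float = 0.2,
                      max_share: float = 0.6) -> float:
    """Return the fraction of the budget to spend on maintenance.

    complexity_score is assumed to be a normalized solution-state metric
    in [0, 1], e.g. aggregated from static-analysis tools in the CI/CD
    pipeline. Below the threshold a fixed base share goes to maintenance;
    above it, the share grows linearly up to max_share.
    """
    if complexity_score <= threshold:
        return base_share
    overshoot = (complexity_score - threshold) / (1.0 - threshold)
    return base_share + overshoot * (max_share - base_share)

print(maintenance_share(0.5))   # healthy solution: base share of 0.2
print(maintenance_share(1.0))   # badly entangled solution: max share of 0.6
```

The analogy to error budgets is the trigger mechanism: once the metric crosses its threshold, feature work automatically yields budget to maintenance until the solution state recovers.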
But more important is the question: What do you do with your maintenance budget?
This means more than fixing “technical debt”, i.e., running a code quality checker and fixing the defects it indicates. Sure, you should do that, too, but it is not the core of the work.
The core of the maintenance work should be removing, aligning and simplifying functionalities. What does that mean?
Not all functionalities you find in a solution are really needed. Typically, solutions contain quite some “dead code”: code that is no longer needed because the requirements have changed over time. Additionally, many functionalities create so little value that the complexity they add exceeds their business value by far. If you can identify such functionalities and “dead code” (usage metrics are your friend here), you should remove them from your code base.
This makes the solution simpler. You typically also remove a lot of dependencies. The code becomes easier to understand, easier to change, easier to extend. It makes you faster. It makes your solution more reliable.
The biggest problem is that most people shy away from removing functionality. It is not only due to the sunk cost fallacy. Many people in IT are also compulsive hoarders of sorts (“Do not throw it away. We might need it one day in the future.”) and shy away from the decision (“What if I make the wrong decision and we really need it one day?”).
It is a lot easier to get the okay to add something to a solution than to remove something. But if we do not want to suffer from the problems of never throwing anything away, we need to bite the bullet.
The other topic is aligning and simplifying functionalities. Often you have functionalities that are implemented in an overly complicated way, that create unneeded dependencies to other functionalities, that do the same thing in a different way, and the like. It is not that the functionality is not needed or wrong, but the way it is defined makes the solution a lot more complex and brittle than it needs to be.
Here it makes sense to sit down with the business owner of the functionality, go back to its original business goals, and discuss if there are simpler ways to achieve them. If we are able to simplify the problem description (without losing sight of the business goals attached to it) or to align it better with other parts of the solution, we can often simplify the overall solution a lot, achieving the same effects as removing unneeded functionality.
Again, it is easier to get people to discuss new functionalities than ways to simplify existing ones, especially as humans tend to have a complexity bias, i.e., they tend to believe that something more complex must be better than something simpler (which is a fallacy). But again, we need to bite the bullet if we want to improve the situation.
It is a widespread practice to judge implementation projects and tasks solely by their immediate expected business value. As a consequence, IT solutions only tend to grow and are poorly maintained. But growing complexity comes at a price, and solution deterioration also comes at a price. As this price grows disproportionately with growing complexity and deterioration, it will soon exceed the expected value of a new feature.
This means we need to clean up and maintain our solutions on a regular basis. Ideally, we change our value evaluation process to take the costs of growing complexity and deterioration into account, though in practice a balanced maintenance budget can also be a good solution.
It is not sufficient to clean up the usual “technical debt”, because much of the complexity is rooted in the original requirements. We need to actively groom the functionality, identify features that are no longer needed or do not create enough value to justify the extra complexity, and remove them. We need to figure out ways to simplify and align functionalities that are defined in complex ways without compromising their original business goals.
These are challenging tasks – not only because the fallacy that removing or simplifying features does not create value is so persistent. But we need to do them if we do not want to pay an ever-growing price for each new feature, a price that by far exceeds the feature’s value.
Additionally, the question “Do we have to do it due to legal or comparable obligations?” is asked. In those cases, the business value question is not posed as not doing it is not an option. ↩︎
This also includes agile approaches. Just replace the terms “task” and “product manager” with the corresponding agile jargon. The underlying meaning does not change by going agile, even if the wording does: There are still things to be done (e.g., user stories) and persons who decide if and in which order those things are implemented (e.g., product owners). And business case calculations are still common in agile approaches (not discussing here if they make sense or not). ↩︎
In a talk I gave in 2011, I started developing a calculation schema for secondary complexity costs. While that schema is probably suitable as a starting point, today I think it needs quite some additional work to become actually useful. Unfortunately, the slide deck is in German (I only switched to English as my slide language later). Still, if you are capable of reading German slides and are interested, you can find the slide deck here. ↩︎
With “exponentially” I mean that beyond a certain size and complexity, the likelihood that we really keep all relevant details of the solution in view drops to zero extremely quickly. So, the probability of missing something important quickly grows to one. ↩︎
Of course, a solution can be simpler than the original problem if you decide to implement only parts of the problem as an IT solution. That is a decision you need to make when you decide about the scope of the solution (which can be upfront or ongoing, depending on your approach). There lies a lot of power in not implementing things. I touched on that topic when I discussed accidental complexity on the requirements side in my Simplify! blog series. Still, if I write that a solution cannot be simpler than the problem it needs to solve, I look at it from a software development point of view. The requirements posed define the problem for me. Whether there is more to the original problem than is written down in the requirements, I do not know. For me, the requirements define the whole problem. From that point of view, my solution cannot be simpler than the requirements I need to implement. That is the core principle of essential complexity. Yet, as written before: There is a huge lever in influencing which requirements are posed (Does it economically make sense to implement all special cases even if they only happen once every 10 years?) and how they are posed (Is there a way to define the requirement such that it does not contradict the existing design without compromising its intention?). Thus, if you can, you should use that lever. ↩︎