Let’s (not) break up the monolith - Part 2
In the previous post, we started with the observation that companies (still) want to break up their monoliths into microservices. If you ask them what they expect from this measure, they typically expect to cure the “big ball of mud” issue with microservices or to improve their time to market with them.
We then discussed that simply changing the runtime artifact style from monolith to microservice will not help with the “big ball of mud” issue because the actual problems lie in the organization, the processes and the people, but not in the technology – especially not in the runtime artifact style.
In this second (and last) post of this little series, we will discuss the “time to market” issue and sum it all up afterwards.
The time to market issue
The other typical reason companies give for wanting to break up their monolith into microservices is that they need to improve their time to market. Often this response is based on pressure they get from their business departments. The complaint that IT departments are too slow can be observed in many companies, especially in bigger enterprises.
The reasoning that leads to microservices as the solution to the problem usually goes along the following lines: “It takes us so long to apply code changes to our monolith because it is so complex and so hard to understand. We also need a long testing phase to make sure we did not break anything inadvertently. If we break up the monolith into microservices, the individual services will be small. The complexity will go away because the services are small and thus easy to understand. That means we can apply changes faster and do not need such a long testing phase anymore, solving the time to market issue this way.”
Of course, the reasoning tends to contain some more details. But at its core, it very often boils down to the reasoning sketched above.
It is not the fault of the runtime artifact style “monolith” that your time to market suffers.
It is your fault, the fault of the people involved!
Looking at the IT value chain
The first thing I would recommend before going for microservices to improve time to market is an honest value chain mapping. Lay out all activities from a business idea until it is live in production and customers can experience it. Then add the average processing times and the average wait times. Again, be honest!
Typically, this results in a chain of a dozen or more steps, including several approval steps and gateways. The business idea – or let us call it a “feature” – will very often wait in a queue in front of the next activity, waiting to be picked up. E.g., the feature waits to be bundled with other features into a project, a release, a program increment, or whatever your software development process calls it. It waits for project approval and budgeting. It waits to be prioritized, to be refined, to be scheduled, to be implemented, to be tested, to be deployed, and so on.
If you add up the wait times, you often realize that the feature spends 80% or more of the overall time waiting for the next activity to start – especially if the overall IT value chain contains activities that are only done periodically. E.g., bigger companies often have several “boards” and “committees” that only meet once in a while to evaluate and (hopefully) sign off all the items that have queued up since the previous meeting.
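To make the value chain mapping a bit more tangible, here is a minimal sketch of how the processing and wait times could be added up. The step names and all numbers are made up for illustration (they are assumptions, not measurements), and `Step`, `VALUE_CHAIN` and `summarize` are hypothetical names. Your own mapping will have different steps and most likely more of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    processing_days: float  # time somebody actively works on the feature
    wait_days: float        # time the feature sits in a queue before this step


# Hypothetical numbers for illustration only, not measurements.
VALUE_CHAIN = [
    Step("bundle into project", 2, 30),
    Step("approval and budgeting", 1, 60),
    Step("prioritize and refine", 3, 25),
    Step("implement", 4, 20),
    Step("test", 5, 15),
    Step("deploy", 1, 34),
]


def summarize(chain):
    """Return the overall lead time and the share of it spent waiting."""
    processing = sum(step.processing_days for step in chain)
    waiting = sum(step.wait_days for step in chain)
    lead_time = processing + waiting
    return lead_time, waiting / lead_time


lead_time, wait_share = summarize(VALUE_CHAIN)
print(f"lead time: {lead_time:.0f} days, waiting: {wait_share:.0%}")
```

With these assumed numbers, the feature is actively worked on for 16 of its 200 days. The interesting part of the exercise is not the code but collecting honest numbers for each step.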
Klaus Leopold visualized this nicely in his book “Rethinking agile”. He showed a typical IT value chain with all the stages and approval gateways an idea needs to go through in many companies before it can be experienced by the customers, including all the wait times. As mentioned before, in most companies the approval committees only meet periodically, i.e., monthly, quarterly or even yearly (e.g., budget planning or project portfolio planning). And somewhere at about 2/3 down the value chain, you find that small box “develop” with a comment “we are so f***ing AGILE, yay!!” attached to it. 1
Developing faster does not help
If you take the results of the value chain mapping and look for the actual implementation time (you remember: the activity you want to speed up by breaking up the monolith into microservices to improve your time to market), you will most likely realize that it is no more than 1% or 2% of the overall time to market, i.e., the time the feature requires from being an idea until it can be experienced by the customers.
Let us assume a lead time for a feature (from idea to production) of 200 days – a number that is not uncommon in enterprise agile environments 2. Let us also assume the average implementation time is 4 days, which equals 2% of the overall time and is not uncommon for a typical business feature. Now let us assume you are able to reduce the implementation time by 50%, which would be quite a good improvement (do not forget that you still need to implement the actual business logic of the feature).
Hooray, you just reduced your lead time from 200 days to 198 days! And to do so, you needed to do a lot of work upfront. You needed to break up your monolith into microservices, which typically costs several million Euros. Doing it did not create any business value at all, and you introduced a runtime architecture that is orders of magnitude more complex than the old one and thus is much harder to operate and maintain at runtime. Even if turning your application into microservices had saved you 10 days, it would still only be a 5% improvement, hardly worth all the effort and added complexity.
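The arithmetic behind this example can be written down in a few lines. This is only a sketch of the calculation with the numbers assumed above (200 days lead time, 4 days implementation time, 50% faster implementation). The function name `lead_time_after_speedup` is made up for illustration.

```python
def lead_time_after_speedup(lead_time_days, implementation_days, speedup_fraction):
    """Lead time if only the implementation step gets faster.

    speedup_fraction is the share of the implementation time that is
    saved, e.g. 0.5 for "50% faster". All other steps stay unchanged.
    """
    saved_days = implementation_days * speedup_fraction
    return lead_time_days - saved_days


old_lead_time = 200  # days from idea to production (assumed above)
new_lead_time = lead_time_after_speedup(old_lead_time, 4, 0.5)
improvement = (old_lead_time - new_lead_time) / old_lead_time

print(f"new lead time: {new_lead_time:.0f} days, improvement: {improvement:.0%}")
```

Halving a step that makes up 2% of the lead time can never improve the lead time by more than 1%, no matter how much you invest in it.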
But the throughput, you might say now. We doubled the throughput. That must be worth something.
Sure, it is worth something. But it pays a different bill. You may be able to implement more features per unit of time. But the lead time of a feature did not change noticeably. If we talk about “time to market”, we talk about the lead time, not the throughput.
Additionally, you probably will not double your throughput because the actual software development almost never is the bottleneck in the IT value chain. Even if it should be the current bottleneck, it is not the only bottleneck in the IT value chain. As soon as you are able to implement more features per period of time, other bottlenecks along the value chain will become visible. If you shorten your implementation time by 50% as sketched in the example above, your overall throughput will most likely only increase by 10% or 20% due to other bottlenecks along the IT value chain.
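A very simplified way to see this: if you model the value chain as a sequence of steps, the overall throughput cannot be higher than the capacity of the slowest step. The sketch below uses made-up capacities in features per month (pure assumptions for illustration, as are the names `throughput`, `before` and `after`). It ignores queueing effects and variability, which usually make things worse, not better.

```python
def throughput(capacities):
    """Overall throughput of a sequential chain: limited by its slowest step."""
    return min(capacities.values())


# Hypothetical capacities in features per month, for illustration only.
before = {"approve": 11, "implement": 10, "test": 12, "deploy": 15}

# Implementation time cut by 50%, i.e., implementation capacity doubled.
after = {**before, "implement": 20}

gain = throughput(after) / throughput(before) - 1
print(f"throughput: {throughput(before)} -> {throughput(after)} ({gain:.0%} gain)")
```

In this assumed setting, doubling the implementation capacity just moves the bottleneck to the approval step, and the overall throughput grows by 10% instead of 100%.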
Hence, only improving development time most likely will not have a significant effect on your time to market.
Reducing wait times helps
To really improve your time to market, you would need to honestly measure the processing times, the wait times and the queue lengths of your features and continuously work on the bottlenecks along the whole IT value chain. But that is a lot more than only changing the runtime artifact style of your software. This would be mostly about organization, processes and people.
If you did all of that, breaking up the monolith into microservices could at a certain stage help you become even faster because, if done right, it would allow the cross-functional teams aligned with market capabilities to deploy their functionality independently 3.
But this only works if you first change your organization and processes and upskill your people to significantly improve your time to market. This is a change that affects not only the IT department but also the business departments, the reporting lines, the distribution of decision authority and responsibility and a lot more. In short: Actually improving your time to market means a big change, not toying around with technology. 4
You also need to get the design of the services right, i.e., how to distribute the functionality adequately across the services. Unfortunately, the forces that drive the design of distributed applications (which microservices are by definition) are very different from the forces that drive the design of in-process modularization (i.e., the module design of a monolith).
While designing in-process applications is relatively well understood, designing distributed applications is not. And if you get your service design wrong, your software development will not be any faster than it was with a monolith. So, chances are that you will not become any faster after switching to microservices. You will only have made operations a lot more complex. 5
Finally, always keep in mind that the monolith also started as a clean, well-designed and structured application. It only deteriorated into a big ball of mud over time for reasons that are not related to the code itself (or the chosen runtime artifact style), but to organization, processes and people. I discussed those reasons before in the “big ball of mud issue” section of this blog series.
Hence, if you do not address any of those issues but only break up the monolith into microservices, the deterioration will kick in very soon and you will be on your road to a distributed ball of mud – which means any temporary gain of delivery speed will be gone soon.
These were the two typical responses I hear when I ask customers why they want to break up their monolith: They expect to solve their big ball of mud issues and their time to market issues.
Sometimes you may get other answers, for example that microservices support reusability and thus help to save money. Or that they are needed for scalability reasons. Or that they are needed to support team autonomy. I discussed all these fallacies and some more in the microservices fallacies blog series I already mentioned at the beginning of the first post. Thus, I will not discuss them here again. If you are interested in understanding why all these apparent arguments for microservices are just fallacies, I recommend reading that blog series.
This little blog series started with the observation that companies (still) want to break up their monoliths into microservices. If you ask them what they expect from this measure, they typically expect to cure the “big ball of mud” issue with microservices or to improve their time to market with them.
I then discussed that simply changing the runtime artifact style from monolith to microservice will achieve neither of the two. The actual problems lie in the organization, the processes and the people, not in the technology – and especially not in the runtime artifact style.
Still, we have a long tradition of tackling organizational, process and people problems with technical solutions because it feels a lot easier to introduce yet another technical “solution” than to address the actual (much harder to solve) problems. And because we have been taking the technical false solution path for so many years, it has started to feel so “natural” that most people stopped questioning this path a long time ago.
But it is just a false solution. Nothing gets solved by taking the technology path in such a setting.
When the dust settles, you will find yourself with the same problems you had before – plus another layer of problems because you added another technology, another layer of complexity to your already way too complex IT landscape (see my “Simplify!” blog series for a comprehensive discussion of the complexity problem in IT).
Does this mean you should avoid microservices by all means?
No, it does not.
As I discussed in the microservices fallacies blog series, you need microservices less often than most people think. Still, if used for the right reasons in the right way in the right context, microservices can be a very useful architectural style.
But – and this is the core message of this whole little blog post series – they are not a cure for problems rooted outside technology. They can support other measures needed at the process, organizational and people level. Without getting those other measures right, microservices are basically pointless – usually even counterproductive.
You can also find the image I mentioned on slide 36 of the presentation “Rethinking agile” by Klaus Leopold. ↩︎
Be careful not to confuse development iteration lengths (typically named “sprints” based on the Scrum terminology) and lead times. Companies often use iteration lengths of 2 weeks or the like. It typically still takes a feature 6-9 months from idea to production because of all the other steps in the IT value chain. The actual implementation duration usually is negligible when it comes to the overall duration of the IT value chain. ↩︎
Note that usually a well-structured monolith would be more than sufficient for most companies to release new features quickly and reliably. The actual benefits of microservices tend to pay off only if you face quite extreme conditions like, e.g., the big tech companies do. They tend to live in a winner-takes-all market and feature lead time is essential for their viability. Most companies do not face such extreme conditions. ↩︎
This is probably one of the core reasons why DevOps, which originally was all about improving the IT value chain lead times, quickly degenerated into a technology topic that mostly revolved around CI/CD and container technology: Most companies were not willing or able to rethink their organization and processes to accelerate the IT value chain, but tool discussions are a welcome compensatory satisfaction if fixing the actual problem is not possible. And a lot of money can also be earned with it … ↩︎
Writing a blog series about service design is still on my list. I will try to remember to add a link here once I have written it. In the meantime, I can recommend the two slide decks “Resilient functional service design”, which discusses the differences between in-process and distributed application design, and “Getting (service) design right”, which shows a way to design distributed service-based applications better (including the trade-offs). ↩︎