The value of speed
The speed of the IT value chain is often a topic in discussions I have with decision makers. The common theme is: The IT value chain is too slow. We need to deliver faster.
But how fast do you need to become, is my question then. I tell them that the best in class, e.g., Amazon, deploy roughly twice per second. Of course, that is the combined deployment rate of all their teams, but even a single team usually deploys many times a day. Combined with appropriate requirements management, this means lead times from idea to production of a few hours, sometimes just minutes.
Thus, IT value chains with intraday lead time (from the idea until the user experiences the result in production) are today’s benchmark – especially if you are afraid that one of these unicorns or a comparably fast challenger could enter your market.
Still, usually I get the response that such short lead times are not needed. Most of the decision makers claim that monthly or (bi)weekly deployments would be fast enough for them. They do not even consider the lead times. They just talk about the deployment frequency. In their minds it seems to be perfectly fine if it takes several months until a business idea is delivered to production. 1
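To make the difference between deployment frequency and lead time concrete, here is a minimal sketch. All dates are made up for illustration: the team deploys every two weeks, yet each idea waits months before it reaches production.

```python
from datetime import date
from statistics import median

# Hypothetical records: (day the idea was raised, day it went live in production).
ideas = [
    (date(2023, 1, 9), date(2023, 6, 5)),
    (date(2023, 2, 13), date(2023, 6, 19)),
    (date(2023, 3, 6), date(2023, 7, 3)),
]

# Hypothetical production deployments, two weeks apart.
deployments = [date(2023, 6, 5), date(2023, 6, 19), date(2023, 7, 3)]

# Deployment interval: days between consecutive deployments.
intervals = [(b - a).days for a, b in zip(deployments, deployments[1:])]

# Lead time: days from raising the idea until it is live.
lead_times = [(live - raised).days for raised, live in ideas]

median_interval = median(intervals)
median_lead_time = median(lead_times)
print(f"median deployment interval: {median_interval} days")
print(f"median lead time: {median_lead_time} days")
```

Measuring only the first number says little about how long the business actually waits for an idea.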
While I understand that not all companies live in highly dynamic post-industrial markets with lots of uncertainty that require very different approaches including very short lead times, I still think intraday lead times would have a huge value for them.
The 2019 Accelerate State of DevOps report lists several advantages, e.g., improved organizational performance and a better working environment (less stress and burnout, better work/life balance). While these alone should make intraday lead times worth aiming for, I think there are even more advantages.
Here are some of them.
Most applications contain a lot of configuration options, which add a lot of complexity to the solution. More code is needed, the code is harder to read and understand, testing and debugging become a lot harder, changes become slower and more expensive, more bugs creep in, and so on. I have seen several solutions where configuration at least doubled the complexity of the solution and thus the effort needed to evolve and maintain it.
Yet, all these configuration options are only there for a single reason: The behavior of the solution needs to be changed faster than the deployment frequency allows. If you, e.g., deploy twice a year but must be able to change the behavior on a monthly basis, you need a configuration option for this type of change.
Now let us make a simple thought experiment: Assume that the IT lead times were below the change lead times needed from a business perspective. E.g., a change can be implemented, tested and deployed in 10 minutes and the shortest change time needed from a business perspective (from the moment I realize a change is needed until it is live) is 30 minutes.
In such a setting we would not need any configuration options in the solution but could change the code instead. After checking in the code the new behavior would be immediately deployed. 2
As a consequence, the code would become a lot simpler. No alternative lines of execution need to be maintained. No complex decision logic to determine the right behavior. No configuration option frontends, files, data stores and the like. Just the currently valid code.
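As an illustration, here is a small sketch of the same business rule written both ways. The shipping fee rule and all option names are hypothetical and only serve to show the difference in shape.

```python
# Variant A: behavior is switched through configuration options.
# All option names and values are made up for this example.
CONFIG = {
    "free_shipping_enabled": True,
    "free_shipping_threshold": 50.0,
    "flat_fee": 4.9,
    "express_surcharge_enabled": False,
    "express_surcharge": 10.0,
}


def shipping_fee_configurable(order_total, express, config=CONFIG):
    """Every option adds a branch that must be read, tested and maintained."""
    fee = config["flat_fee"]
    if config["free_shipping_enabled"] and order_total >= config["free_shipping_threshold"]:
        fee = 0.0
    if config["express_surcharge_enabled"] and express:
        fee += config["express_surcharge"]
    return fee


# Variant B: only the currently valid behavior exists in the code.
# A change of the rule is a code change that is deployed right away.
def shipping_fee_current(order_total):
    return 0.0 if order_total >= 50.0 else 4.9
```

Both variants return the same fees for the configuration shown, but variant A also carries branches for behavior that is currently switched off.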
This would make changes a lot simpler and thus faster and less error-prone. The robustness of the solution also grows. New development team members need less time to understand the solution and become productive. And so on.
Overall, it would save you a lot of money, make you faster and more robust in production.
Even if you got rid of just 80% of your configuration options instead of 100%, the effect would be significant.
If your IT lead times are long, typically several months or longer, there are always huge amounts of business ideas competing for implementation at the same time. The amount is so big that a simple queue for organizing the development of the ideas is not sufficient anymore. Instead a more complex organization is needed.
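The relation between lead time and the number of ideas waiting at the same time can be sketched with Little's law from queueing theory: in a stable system, the average number of items in the system equals the arrival rate multiplied by the average time an item spends in it. The arrival rate and lead times below are assumed values, not measurements.

```python
def ideas_in_flight(arrival_rate_per_day, lead_time_days):
    """Little's law: average items in the system = arrival rate * average time in the system."""
    return arrival_rate_per_day * lead_time_days


arrival_rate = 1.0  # assumption: one new business idea per working day

# Assumption: roughly six months of working days from idea to production.
slow = ideas_in_flight(arrival_rate, lead_time_days=120)

# Assumption: an intraday lead time of half a working day.
fast = ideas_in_flight(arrival_rate, lead_time_days=0.5)

print(f"ideas in flight with long lead times: {slow}")
print(f"ideas in flight with intraday lead times: {fast}")
```

With the same inflow of ideas, the long lead time alone produces a backlog of more than a hundred competing ideas, while the short one leaves hardly anything to organize.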
Typically, all the ideas are bundled and organized in projects. All these projects need to be defined, budgeted (see next section), scheduled, staffed, set up, executed, controlled, often readjusted using steering committees or the like, and more. The sheer cost and effort of setting up a project prohibits projects below a certain size, which often works as a reinforcing loop.
Projects are by definition needed to organize and handle work that the regular organization cannot handle. This also means projects require extra effort on top of the normal organization's work. Almost every company with significant IT work I know has built a big apparatus that does nothing but manage projects. 3
Now assume that you can implement and release new ideas in such a short time that a simple idea queue with a bit of prioritization would be sufficient to organize the whole IT work. Picture it as a big Kanban board (or whatever you prefer) where the cards move so fast from left to right that there never is a huge jam at the left, where everyone with a new business idea can be sure that the idea will be implemented in a very short time. 4
The result would be a much leaner and simpler organization. All the project overhead needed to organize and manage the huge idea backlog could be avoided. The associated effort and cost would simply disappear.
Additionally, you would not need to staff new project teams all the time, wasting lots of effort until the teams actually become productive. Instead, you could have stable product teams as part of your regular organization that are long past their storming phases and are highly productive.
A final nice effect would be that everyone would be more relaxed: The business owners are relaxed because they know their ideas are quickly implemented. The IT managers are relaxed because they are not constantly put under pressure and accused of being too slow. The product team members are relaxed because they work in a highly productive environment they know and trust.
Overall, it would save you a lot of money, make you faster and reduce stress.
Budgeting frenzy is part of the project overhead. The long backlog of ideas caused by long lead times results in continuous shortage management. There are always many more ideas competing for implementation than can be implemented in the time frame considered.
This fight for (often needlessly) limited capacity leads to strange capacity allocation rituals that over time have become ends in themselves. Probably the most annoying of these rituals is the budgeting process. In quite a few companies the budgeting process keeps the highest-paid employees occupied for 15% or more of their time.
Now assume that you do not need all that anymore. You can implement and release ideas so fast that IT is no longer a bottleneck that needs to be managed according to the principles of shortage management. Instead, budget controlling can be simplified tremendously. It is even possible to align it with market stimuli which creates a completely new lever: Instead of just thinking about cost savings in the context of IT, you can use it to increase revenues.
Overall, it would save you a lot of money, maybe even increase revenue, make you faster and reduce stress.
Cutting off manual tasks
Short lead times and a high deploy frequency to production require a very high degree of automation. Lots of manual tasks are simply not possible in such a setting. Besides speeding things up, the high degree of automation also has other desirable effects:
- Fewer errors happen. Repetitive routine tasks are boring, people get distracted, and errors happen. By automating all repetitive routine tasks, this source of errors is eliminated.
- People are happier. Nobody likes boring repetitive tasks, especially in the domain of knowledge work. You can get used to them, and routine may feel comforting to a certain degree, but nobody actually likes dumb repetitive tasks. People stuck with them are less motivated, less productive and do not come up with new ideas that improve the company. By automating boring repetitive tasks you remove this source of dissatisfaction and create the conditions to release the creativity and productivity hidden in the employees.
- More time for valuable work. Instead of being tied up in boring repetitive tasks that need to be done but do not create additional value, people are free to do more valuable work, thereby creating more value (and revenue) with the same number of people.
Overall, it would save you a lot of money, make you more productive and more robust in production.
Shortening lead times and increasing deployment frequency does not only lead to a high degree of automation. It also ruthlessly points out where things have become ends in themselves: bloated requirements validation processes, complex design rules from the ivory tower, expensive quality gates with questionable value, documentation without a reader, and so on.
A focus on lead times brings them all to light and helps streamline them, cutting off the fat and leaving only the valuable parts (if there are any). This leads to a much more streamlined and simplified process along the whole IT value chain that focuses on value and not on (often power-driven) rites. 5
Overall, it would save you a lot of money, reduce stress and make you more productive.
Fixing production problems becoming ordinary work
In traditional IT organizations, project work and “maintenance”, including fixing production problems, are strictly separated. Most of the capacity is allocated to development projects. In particular, the most experienced engineers are always allocated to these projects.
This regularly leads to unfortunate situations if serious production problems occur. The engineers allocated to maintenance, typically not the most experienced engineers (because they are allocated to development projects), are often not able to fix the problem.
As a consequence, an “emergency workforce” (names vary from company to company, but the underlying pattern is the same) needs to be set up, consisting of all experts needed to understand and fix the problem as quickly as possible.
Basically, this “emergency workforce” is a project organization built on top of an existing project and line organization. Typically, the most pivotal people from the projects are temporarily transferred to the workforce, which cripples the development projects and massively impedes their progress while the workforce exists.
“Crippling” means more than just the capacity of one expert missing from the development project while the workforce exists. Usually, these experts are crucial for the progress of the development teams. Thus, removing such an expert from the team for a while typically means that the development team’s progress stalls for that period of time. For a development team of, e.g., 10 persons, removing the most pivotal expert for a while does not reduce the team’s productivity to 90%, but usually to 20-30%.
Additionally, setting up and managing such a workforce means a lot of unplanned additional effort. If this happened once or twice a year, it would be unfortunate, yet acceptable. But I know companies where this happens on a regular basis, often up to twice a month, and it frequently takes 3 or 4 days until the workforce has completed its work and its members can return to their normal work.
Taking the crippling effect on development projects into account this means that the overall development project productivity is reduced to roughly 50% of the normal productivity in such a setting.
With intraday lead times and arbitrary deploy schedules, setting up “emergency workforces”, and thus crippling development projects, is not needed anymore. Production issues and new development ideas can be handled with the same process by the same teams. No extra organization is needed. It is not necessary to cripple whole teams over and over again, unduly slowing down development work.
Instead, the production problem is “just another item” that needs to be done. It gets prioritized as all other items and then implemented. As intraday lead times are technically not a problem and deployments always go to production, the regular development organization is perfectly set up to also handle production and maintenance issues. In the end, for the teams it all blends into the single theme of continuously improving the value of the software they work on.
Overall, it would save you a lot of money, reduce stress and make you more productive.
Less worthless work
Quite a lot of research shows that a high percentage of implemented code is never or rarely used. This does not refer to code such as annual reports that by definition is only needed once per year. It is about code that is part of applications that are used all the time. Many features that were implemented in these applications are rarely or never used by the users. Implementing them was not worth the effort – in Lean terminology they are “waste”.
The problem in traditional slow organizations is that all requirements are “best guesses”. Someone thinks that based on some more or less reliable indicators a new feature X might be desirable for the users. Still, the requesters are usually detached from the users in time and space.
Often they do not have access to the users, and the long lead times make it impossible to collect feedback along the way. The feature needs to be described in its entirety and then put into development – and months or years later the result is presented to the users. To make things worse, the acceptance of the features usually is not measured, which deprives the requesters of their last feedback option.
As a result, big features with at best questionable success chances are requested, many of them resulting in little or no value – and in the worst case they even annoy the users, i.e., they destroy value.
With intraday lead times it becomes easily possible to test features at a small scale before completely implementing them. It is possible to set up a continuous feedback cycle with the actual users of the software. With this feedback it is possible to test variants, to identify and cut dead ends (features without value) early before wasting lots of time, effort and money on them while reinforcing promising features.
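One possible way to test a feature at a small scale is to expose a first, cheap version to a small share of users and measure whether they actually use it. The following is only a sketch under assumptions: the hash-based bucketing, the feature name "quick-reorder" and the 5% share are illustrative choices, not a prescribed method.

```python
import hashlib


def in_test_group(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically assign a user to the test group of a feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent


def usage_rate(exposed_users, users_who_used_it):
    """Share of the exposed users who actually used the feature."""
    if not exposed_users:
        return 0.0
    exposed = set(exposed_users)
    return len(exposed & set(users_who_used_it)) / len(exposed)


# Hypothetical user base; roughly 5% of it gets to see the feature.
users = [f"user-{i}" for i in range(1000)]
exposed = [u for u in users if in_test_group(u, "quick-reorder", percent=5)]
```

If the measured usage rate stays low after a few days, the idea can be dropped before the complete feature is built. If it looks promising, the next small increment follows.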
If played really well, it is possible to create much more value with a smaller IT team, this way outsmarting the challenges of the demographic change.
Overall, it would save you a lot of money, increase revenue and make you faster.
Many companies seek to improve “business agility” (names vary from company to company, but the underlying pattern is the same), i.e., the ability to respond more flexibly to changing internal and external demands. What they do not take into account is that this kind of flexibility requires very short IT lead times.
The ongoing digital transformation means that IT has become inseparable from business. IT and business have become the same. You cannot change any business feature anymore without touching IT. If you still think in strictly separated IT and business departments you have missed an essential consequence of digital transformation – you are not a digital leader but a digital laggard.
This also means that your IT lead times delimit the agility and flexibility you can achieve in your business. The shorter, the better, the more flexible you can (re)act. Intraday lead times enable you to become really flexible and responsive with respect to changing external and internal demands.
Overall, it would make you more flexible and potentially increase revenue.
These were just some of the effects of intraday lead times, resulting in significant cost savings, systems that are more robust in production, increased productivity, less stress, more revenue and better business agility.
All this did not even touch the inherent uncertainty of post-industrial, highly dynamic markets and the essential competitive advantage that short lead times give you in such an environment. Thus, even if you consider your market not that competitive, intraday lead times offer significant advantages.
Overall, going for intraday lead times and deployment cycles leads to:
- Significant cost savings
- Simpler, more effective organizations
- Higher productivity
- Increased robustness of the IT landscape
- Higher employee satisfaction (with all its positive effects)
- Increased company resilience
- Improved business agility
- Higher user/customer satisfaction if played right
- Relief for the omnipresent shortage of IT experts
Thus, I think going for intraday lead times is not only a topic for Internet unicorns and companies living in highly dynamic, post-industrial markets. If thought through, economically it is the only sane choice.
Of course it does not come for free. We need to rethink, drop or adapt lots of the habits we grew fond of (even if we claim to hate them). We have to establish many new, still unfamiliar practices. “Good enough” is not good enough anymore, but continuous improvement becomes the name of the game. And traditional positions of power will lose their relevance, being replaced with new positions of power – probably the biggest impediment.
I hope this post gave you some food for thought – and some arguments the next time someone tells you that monthly releases are good enough.
Based on my observation, only a few people have understood the fundamental difference between IT value chain lead times and deployment frequency. Most “agile” projects I see use Scrum with biweekly sprints. While this means that theoretically a new version of the software could be deployed to production every two weeks, it usually is not. Instead, the first deployment to production typically takes place after the whole backlog is implemented, i.e., at the end of the project (counteracting the whole concept of agility). Additionally, there typically are still the established cycles of generating new business ideas, bundling multiple ideas into project proposals, annual budgeting cycles and approvals, project portfolio planning and project approvals, project staffing (often including negotiations with external service providers), preparation and initialization, all before the implementation starts. After completing the implementation there are still extended QA and pre-production phases before the software goes live in production. This way, even in an “agile” setting the lead times for new ideas can be up to 18 to 24 months, taking the whole concept of agility ad absurdum. But even if the setting is not as rigid as sketched here, the lead times from an idea to production often are still 3-6 months or longer because the ideas have to be validated, prioritized, bundled with other ideas, scheduled in the “release train” (or whatever more or less agile method is used), and so on. The key point is: Deployment frequency and lead times are two completely different topics. While you cannot have a lead time that is shorter than the interval between your deployments, very often lead times are orders of magnitude longer than that interval. Speeding up lead times is a much more comprehensive task than speeding up deployment frequency. ↩︎
This of course also requires that the person who needs the change and the development team have a means to communicate directly. The lack of such a means is another reason for the widespread configuration frenzy: The people who need a change in behavior have no effective means to order it from software development. As a result, the person orders a configuration option that she can use whenever she needs a change of application behavior. Yet, if we really had sped up the value chain to allow for intraday lead times, we would also have had to implement effective means of communication, i.e., we can assume this problem solved for the sake of this post. But be aware that if you actually attempt to reduce lead times, the existing organizational and communication shortcomings tend to be much bigger issues than the technical challenges. Thus, even if we can assume this issue to be solved in the context of this post, in practical implementations these issues will haunt you most. ↩︎
I have seen many companies where the whole IT department (of a significant size) is nothing but a project management apparatus and the actual implementation work is outsourced to external service providers. ↩︎
Actually, a single big Kanban board would not be sufficient for a bigger organization. You need a way for new ideas to find their way to the affected teams. But this challenge can also be solved in quite simple, straightforward ways. So, it becomes a lot simpler, but not trivial. The details are beyond the scope of this post. Maybe I will pick the topic up in a later post. ↩︎
As sad as it is: Most likely the hardest part of the streamlining process will be fighting power-driven rites. Still, most people who are in power are more concerned about their power than about the company. While this is a very normal human trait, it often turns out to be one of the biggest change impediments – no matter which kind of change you try to foster. ↩︎