The limits of standardization
Standardization is a two-edged sword. It has a value, but it also comes at a price.
If you think solely in terms of efficiency, standardization is the holy grail:
- You can leverage economies of scale, i.e., produce a single item at a lower price.
- You may get required resources or licenses at a lower rate because you buy higher volumes (another effect of economies of scale).
- You only have to design your processes and train your people once.
- You can leverage the fact that only one option exists, e.g., by creating automation around it and offering services at a higher abstraction level.
- You can run and monitor more instances of the same with little effort added. At best the effort stays the same, no matter how many instances you run.
- And so on …
The bottom line is: Standardization allows you to do more for less money – which is the primary goal of everyone who tries to maximize efficiency.
To be clear: This is not a bad thing. Increasing efficiency was one of the biggest drivers behind industrial production. A big part of the increase in the standard of living in many countries has its roots in increased efficiency. 1
Inside IT, too, increasing efficiency has value. It helps you build and run your software and deliver your services at lower cost. The money saved can be used by the company to increase its competitiveness and maybe to build some reserves for rough times.
Finally, standardization helps to reduce cognitive load. The people affected only need to know one tool/product and the associated procedures to accomplish a given type of task, not multiple ones.
So, there is nothing wrong with standardization and the resulting increased efficiency.
But as with many things, more is not always better. There is a sweet spot for standardization and if you move beyond it, you may destroy value. The increased efficiency may be eaten up by negative effects not taken into consideration.
Here I will discuss a few of them.
Reduced options can backfire
Standardization means reduced options. Again, there is nothing wrong with that per se. Too many options can be overwhelming, sometimes even paralyzing. Too many options also mean high cognitive load if you need to know all the different ways to accomplish a given task. It can even compromise quality because the people affected may not be able to master all options to accomplish a given task.
Thus, reducing options and standardizing things can be quite sensible, not only from a pure cost perspective.
But if you take it too far, it can heavily backfire.
Let me illustrate that with an admittedly relatively extreme example.
I once worked in a client project where Cassandra was the default data store. This was a reasonable decision because Cassandra was a good fit for many use cases.
The problems only started when the decision was made and rigidly enforced that Cassandra was the only data store allowed – mostly for the reasons I listed in the beginning of the post. Now, the standard was Cassandra and Cassandra only.
The problem was that for some use cases Cassandra – frankly – sucked. I will sketch two of those use cases here.
In one use case, some data needed to be selected and displayed in multiple ways, using a typical UI table element where the previously selected data can be sorted by any column by clicking the respective column header.
As the underlying data selection typically was too big to be sent to the (usually) mobile device, a click on a table header triggered a new selection in the database, and the first N rows were fetched and displayed – basically a no-brainer if your data store is an RDBMS. 2
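With an RDBMS, each re-sort is just a different ORDER BY clause on the same table. A minimal sketch using Python's built-in sqlite3 as a stand-in (table and column names are made up for illustration, not taken from the project):

```python
import sqlite3

# One table, queried with different ORDER BY clauses -- no data
# duplication needed. All names here are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id TEXT, order_date TEXT, total REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [("a", "2021-01-02", 30.0), ("b", "2021-01-01", 50.0)])

def first_n(sort_column, n):
    # sort_column would come from a fixed whitelist of UI columns in practice
    return con.execute(
        f"SELECT id FROM orders ORDER BY {sort_column} LIMIT ?", (n,)
    ).fetchall()

print(first_n("order_date", 1))  # [('b',)]
print(first_n("total", 1))       # [('a',)]
```

Every sortable column is served by the same physical table; the database's query planner (plus optional indexes) does the rest.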
As Cassandra only allowed efficient data access via the primary key (everything else meant a full table scan by default), the only option to implement this feature was to duplicate the data multiple times, each copy organized around a different primary key to support all possible access patterns.
This in turn led to a lot of consistency issues, as Cassandra by default implemented a very relaxed form of eventual consistency. It could be tuned a bit, but in the end there was always a risk that not all copies were in sync at a given point in time, i.e., that you would see different data depending on your search criteria and sort options – an effect the users regularly complained about. 3
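The duplication workaround can be sketched roughly like this – plain Python dicts stand in for Cassandra tables, and all names are illustrative, not taken from the actual project:

```python
# One copy of the data per supported sort order, each organized around a
# different key -- the only way to avoid full table scans in this model.
orders_by_date = {}   # key: (order_date, order_id)
orders_by_total = {}  # key: (total, order_id)

def insert_order(order_id, order_date, total):
    row = {"id": order_id, "date": order_date, "total": total}
    # One logical write fans out to every copy. If one of these writes
    # fails or lags behind, the copies diverge -- the eventual-consistency
    # problem described above.
    orders_by_date[(order_date, order_id)] = row
    orders_by_total[(total, order_id)] = row

def first_n_by_date(n):
    return [orders_by_date[k] for k in sorted(orders_by_date)[:n]]

def first_n_by_total(n):
    return [orders_by_total[k] for k in sorted(orders_by_total)[:n]]

insert_order("a", "2021-01-02", 30.0)
insert_order("b", "2021-01-01", 50.0)
print(first_n_by_date(1)[0]["id"])   # b
print(first_n_by_total(1)[0]["id"])  # a
```

Every new access pattern means another copy and another fan-out write – and another chance for the copies to drift apart.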
Another use case required strong consistency at the level of serializability. It was necessary to derive a master data copy for entities that could be altered concurrently in many locations, even across multiple data centers. If you know the consistency guarantees Cassandra gives you (in terms of the CAP theorem, Cassandra was designed as a pure AP database), you know that Cassandra is the wrong tool for such a job.
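To illustrate the underlying problem: a store that resolves concurrent writes via timestamps ("last write wins", which is how Cassandra reconciles conflicting cell writes) can silently lose updates, which rules out serializability. A toy sketch:

```python
# Two replicas concurrently apply "+1" to the same counter. A
# last-write-wins merge keeps only the write with the newer timestamp,
# so one increment is silently lost -- a serializable store would
# guarantee the result 12.
def lww_merge(a, b):
    # each entry is (value, timestamp); keep the newer write
    return a if a[1] >= b[1] else b

base = (10, 0)
replica1 = (base[0] + 1, 1)  # concurrent update at t=1
replica2 = (base[0] + 1, 2)  # concurrent update at t=2

merged = lww_merge(replica1, replica2)
print(merged[0])  # 11, not 12 -- one update was lost
```

No amount of timestamp tuning fixes this; serializability requires coordination between the concurrent writers, which is exactly what a pure AP design trades away.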
The built-in Paxos option was of no help either, as it did not scale to the required transaction volume 4. As a result, we had to implement a very complicated solution that took several months to build, just to circumvent the limitations of Cassandra for that use case as well as possible.
With an RDBMS the task would have been easy. But as the data store standard for the project was Cassandra and we were not allowed to deviate from it, we basically had to implement our own serialization layer on top of Cassandra.
If you look at the two examples, you see a commonality: Both use cases would have been no-brainers if we had been allowed to use an RDBMS.
As written before, there were other use cases where Cassandra was a really good fit and making Cassandra the default data store was perfectly fine. But the “over-standardization”, i.e., limiting databases to Cassandra only massively backfired.
We wasted a lot of extra time and money dealing with the lack of options. Cognitive load went up a lot, as we had to understand the limitations of the standard in detail and figure out (very complicated) ways around them. It even reduced the quality of service of the resulting solutions.
In the end, the over-standardization ate up the advantages the standardization advocates wanted to leverage. We wasted so much time and effort on the use cases where Cassandra was not a good fit that it more than offset what we saved in the places where it was.
To stick with the example one last time: Using a dozen different databases in your system landscape is most likely unreasonable. Reducing the options to a single database is most likely unreasonable, too. For most companies, the sweet spot between flexibility and cognitive load – without compromising efficiency too much – will probably be 3-5 different storage options.
Bottom line: Do not overdo your standardization or it will backfire. Cognitive load and costs will go up due to a lack of required flexibility for solving your business problems. Efficiency on the other hand will go down – which is the opposite of what you tried to achieve with standardization.
Stick with the standards as long as it makes sense. But have the flexibility to deviate from the standard if it really does not fit.
Excessive rigidity impedes productivity
The second negative effect that can hit you if you overdo standardization is basically a side effect of the prior one. If you over-standardize, you are busier working around the limitations standardization imposes on you than actually creating value.
Take the example from the previous section: Cassandra as the sole standard made us slow. It repeatedly cost us a lot of time, effort and nerves to develop around the limitations we were confronted with.
There is also a second effect that you often can observe in companies that are highly focused on standardization and efficiency. They tend to optimize things so much that they develop a strong inertia towards changing their standards even if they become outdated.
Everything is organized around the standards. All processes and tools focus on the existing standards. The people’s work is organized around the standards. Often the employees are specifically trained according to the current standards and thus do not know anything else.
Therefore changes tend to take a long time. Often the built-in resistance of the organization is very high. Even if it is possible on paper to circumvent the existing standards, it often is not possible in practice. Big obstacles are set up to avoid deviation from the standard.
E.g., I know several companies that use a preamble like this in their process documentation: “This is the standard procedure. We expect it to fit in ~80% of the cases. If you need to deviate from it, you can do so by …”
And then the obstacles are listed:
- You may need to submit an elaborate formal request with justification why you want to deviate from the standard.
- You may need to answer to one or more boards that need to approve your request.
- Some of the boards may only meet every 3 months.
- Some of the boards may feel more like a tribunal where you need to defend your case.
- You may need to report to extra committees and/or create extra documentation to “guarantee traceability” or alike.
In the end, the obstacles are so hard to overcome and cost so much extra effort that in practice – at least officially – you will not deviate from the standard procedure. I have seen quite a few projects that used their own internal procedures and tooling and then set up a “project office” that mainly translated internal progress into the official standard tooling, reporting and procedures.
Of course, this is quite a waste of time, effort, money and nerves and should be avoided. Yet, the much bigger problem becomes visible if that company really needs to change their procedures quickly, e.g., because the market situation changed significantly and the current way of working is not sufficient anymore.
In those situations the over-optimization for standardization and efficiency will massively backfire. The whole organization and its culture is organized around following the existing procedures and penalizing deviation from the currently implemented standard. This makes quick changes of procedures almost impossible even if they are urgently needed.
In the end, you pay for the efficiency benefits of a high degree of standardization with reduced resilience. The ability of the organization to adapt to changed conditions deteriorates, and if quick adaptation of practices is needed, the company will not be able to do so.
Bottom line: Too much and too strict standardization makes you slower in all places where the standards do not really fit, because you need to work around their limitations. Additionally, it reduces the company's resilience, its ability to respond to adverse events.
Standardization increases efficiency, but impedes change and adaption to changed conditions.
Homogeneity increases fragility

Resilience is not only a topic at the company or department level, as discussed before. It is also relevant at the IT landscape level.
Standardizing your IT landscape as much as possible supports efficient processes, easy automation and relatively low cognitive load for the people involved. It can also reduce license costs due to volume discounts.
On the downside, the resilience of the IT landscape suffers. Overly homogeneous environments become fragile, as a small problem in one of the standard components can already have catastrophic effects on the whole system landscape. If, e.g., your switches are all of the same type and they have a bug or vulnerability, it can immediately tear down your whole IT installation at once. Or your OS. Or your firewalls. Or your load balancers. Or … you name it.
If, e.g., you used switches from 3 different vendors and one type failed due to a previously undetected bug, only part of your IT landscape would be affected. You might be able to keep your IT up at a reduced QoS (quality of service) level with the other two types of switches.
The downside of this approach, of course, is that you need to master 3 types of switches, monitor and manage 3 types of switches, automate 3 types of switches, and so on. Still, if you run into a problem with one of them, the effects do not immediately become catastrophic, i.e., affect your whole installation.
Growing homogenization leads to growing fragility: If you put all your eggs in one basket, i.e., use just one type of solution, operations and management become easier, but if the basket breaks, i.e., your solution has a problem, your whole landscape has a problem. If you put your eggs in too many baskets, the risk of catastrophic failure sinks, but operations and management efforts grow disproportionately.
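A back-of-the-envelope calculation illustrates this trade-off. Assume each vendor independently has some small probability of shipping a fleet-wide defect – the numbers are purely illustrative, not real failure data:

```python
# Probability that the WHOLE landscape goes down at once: with independent
# vendors, all of them must hit a catastrophic defect simultaneously.
def total_outage_probability(p_vendor_defect, n_vendors):
    return p_vendor_defect ** n_vendors

p = 0.01  # assumed chance of a catastrophic vendor-wide bug (illustrative)
print(total_outage_probability(p, 1))  # 0.01   -> one vendor: any such bug is total
print(total_outage_probability(p, 3))  # ~1e-06 -> three vendors: far less likely
```

Operations and management effort, in contrast, grows with each additional vendor, which is why the sweet spot tends to be a small number greater than one.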
Again, there is a sweet spot to be met between too much and too little standardization.
Summing up

Standardization is a two-edged sword. It has a value, but it also comes at a price. Standardization allows you to increase efficiency and reduce cognitive load, which are both good things.
On the other hand, standardization reduces options, impedes change and increases fragility – which all lead to reduced resilience.
If you overdo standardization, the negative effects will outweigh the positive ones. As so often, there is a sweet spot between doing too little and doing too much, and the challenge is to find that sweet spot for standardization.
I hope this gave you a few ideas to ponder. In too many situations, we still think in a simple “more is better” fashion and fail to realize that most of the time we also need to take counter-forces into account – that we need to find a sweet spot, a balance between positive and negative effects.
Standardization is just one of the topics where the challenge is to find the sweet spot and not to overdo it …
[Update] August 29, 2021: Made clearer in the initial “Cassandra” example that Cassandra was not the wrong decision per se, but that the limitation to using Cassandra only caused the problems.
There are also some other, partially not so “noble” drivers, like, e.g., shifting production to countries with a lower standard of living and thus exploiting the disparity of living standards. Still, increased efficiency due to standardization allowed for much lower prices of consumer goods leading to an increased standard of living, as people were able to afford more with the same amount of money. ↩︎
“Too big” not only means that there was too much data to fit in the device, but also that it would have taken way too long to transfer the data over a mobile network – sometimes only having EDGE connectivity – in an acceptable amount of time. ↩︎
If you ask yourself why we did not use secondary indices: Secondary indices had the same consistency issues, i.e., the data itself and the corresponding secondary index were not updated atomically, but in an eventually consistent fashion. Additionally, the index entries and the referenced data entity usually were located on different database nodes due to the way Cassandra distributed its data across the nodes. As a consequence, accessing data via secondary indices required roughly twice as many node accesses, i.e., it was significantly slower than using the (re-sorted) table duplication method. ↩︎
While Cassandra itself scales read and write capacity almost linearly with the number of its nodes, its Paxos implementation did not. Additionally, the Paxos implementation only worked reliably inside a single data center. Across data centers, it basically worked “best effort”, which is not sufficient if you need reliable transaction serialization. ↩︎