Quality - The unknown entity - Part 2

Why quality probably is not what you think

Uwe Friedrichsen

10 minute read



This is the second post of a blog series discussing quality.

In the previous post we discussed that quality is subjective and that quality has many dimensions.

In this post we continue the discussion regarding relevant properties of quality. First, we discuss why we need to make quality measurable¹ to create the required shared understanding. Then, we discuss that many people do not understand how their demands affect quality (and how we need to help them). Finally, we discuss that quality attributes can conflict and how to deal with that.

Quality needs to be made measurable for a shared understanding

In the last post we ended with the observation that we need discussions about quality to make the different expectations and needs of the different stakeholder groups explicit.

A relevant part of the discussions is to agree upon acceptance criteria regarding the different quality attributes.

E.g., what is the expected system availability? Is 99.5% okay? Or is 99.9% required? How fast does the application need to respond? Within 300 ms for 99% of all requests? Or is 500 ms okay? Does it need to be 99%, or is it okay if we meet the response time for 95% of all requests? It is important to nail down what a vague attribute like “performance” actually means for the requester: Response time? Throughput? Both? Or maybe something else?

It is harder to come up with agreed-upon metrics for development-related quality attributes like maintainability or changeability. In the end, from a client’s point of view, criteria like average lead time per feature or change defect rate are what is relevant.

As a result, clients often have a hard time drawing the connection between a metric like unit test coverage or average cyclomatic complexity and their actual needs. And to be frank, those metrics do not necessarily affect the aforementioned needs of the clients in the desired way. I have seen places where metrics like test coverage and cyclomatic complexity had become ends in themselves, detached from the purpose they should serve.

Connecting leading and trailing metrics

Yet, there is (or at least should be) a correlation between, e.g., your test coverage and the change defect rate. It is important to create the causal connection between leading metrics like unit test coverage or average cyclomatic complexity and trailing metrics like change defect rate. For most clients, only improvements of the trailing metrics create value. Still, we need the leading metrics as early indicators for the trailing metrics.

Thus, simply measuring, e.g., unit test coverage is not sufficient. You need to measure the leading and the trailing metrics. You also need to observe if changes in the leading metrics affect your trailing metrics – and either drop or change the leading metrics if you cannot observe any (usually delayed) correlation.
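One possible way to look for such a delayed correlation is to compare the leading metric of one period with the trailing metric a few periods later. The following sketch assumes both metrics are recorded once per sprint; the function names and all numbers are hypothetical, and a plain Pearson correlation is only one of several measures you could use.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(leading, trailing, lag):
    """Correlate leading[t] with trailing[t + lag] (lag in periods)."""
    if len(leading) != len(trailing):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(leading) - 2:
        raise ValueError("lag out of range")
    return pearson(leading[:len(leading) - lag], trailing[lag:])


# Hypothetical per-sprint measurements (illustrative data only).
unit_test_coverage = [50, 55, 60, 65, 70, 75, 80, 85]                  # percent
change_defect_rate = [0.30, 0.30, 0.30, 0.28, 0.26, 0.24, 0.22, 0.20]  # share of changes

for lag in range(4):
    r = lagged_correlation(unit_test_coverage, change_defect_rate, lag)
    print(f"lag {lag} sprints: r = {r:.2f}")
```

A strongly negative value at some lag would hint that rising coverage is followed by a falling defect rate. It does not prove causation, and a leading metric that shows no such relationship at any plausible lag is a candidate for being dropped or changed.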

If you fail to create those connections, there is a high probability that your clients will ignore or even actively suppress the leading metrics and the activities to improve them, simply because they do not understand how those activities create value for them.

The discussion on how to measure quality helps to make demands explicit. A generic demand like “high performance” or “good maintainability” is not very useful as every person has her or his own interpretation of what that means. The discussion also helps to uncover the connections between leading and trailing metrics and boosts the acceptance of measures to improve those metrics.

Finally, understanding the connections between “your” leading metrics and the trailing metrics relevant for the clients also helps you implement your quality improvement measures in a way that not only boosts your leading metrics like unit test coverage but also improves the change defect rate that is relevant to the client. This often makes the difference between powerful measures and mere cargo cults.

People often do not understand the consequences of their demands

Making quality measurable also helps to uncover unrealistic demands and to discuss the consequences of those demands. E.g., clients quite often demand 100% or 0% fulfillment for a quality attribute. The problem is that the cost of implementing a quality attribute usually grows exponentially with the degree of fulfillment.

E.g., achieving 90% availability usually is easy. 95% requires a little effort. 99% is a bit harder, but still achievable without huge extra efforts. 99.9% already requires quite some extra activities on the code and infrastructure level. E.g., with 99.9% availability, you have less than 9 hours of downtime per year available. This means that you usually cannot afford planned downtimes anymore if you want to keep a buffer for unplanned downtimes.
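The downtime budget behind these figures is simple arithmetic. The following sketch, with an assumed function name and a year of 365 days, converts an availability target into the allowed downtime per year:

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent):
    """Downtime budget per year, in hours, for a given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


for target in (90, 95, 99, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability -> {hours:.2f} hours of downtime per year")
```

Each additional “9” divides the budget by ten, which makes it easy to show clients how quickly the room for planned and unplanned downtimes disappears.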

As a rule of thumb, each extra “9” raises the required implementation and operations cost by an order of magnitude. 100% would be infinitely expensive.

This means that requirements like “always up”, i.e., 100% availability, need to be discussed. People very often do not really understand the consequences of their demands regarding quality. Software development and operations are complex beasts. Even people familiar with them often do not completely understand how their decisions will affect the result.

Helping people make the right decisions

Therefore, it is important to help clients to understand the consequences of their quality demands in terms of immediate and future costs, efforts and risks. It is not sufficient to just ask clients for their desired figures. It always needs to be a discussion where we support our clients in making the decisions that suit their needs best.

You need to make it explicit for them what their demands mean in terms of effort, money and risk and offer them options to choose from. E.g., if they demand “100% availability”, show them what 99% costs them in development and operations, what 99.9% means and what 99.99% means.

If they understand in their “pricing system” (time, money, risk) what different levels of fulfillment of a quality attribute mean, in my experience they always make a sensible choice. It is our job as IT experts to provide them with the information they need to make that choice.

You need a slightly different discussion if clients demand 0% fulfillment. Typically, 0% is not explicitly demanded. Instead, the client simply ignores the respective quality attribute because they are not aware of the attribute or do not understand its relevance for their goals.

If, e.g., clients deprioritize test automation, they typically have ignored the average change defect rate. This will have very negative effects. Maybe not immediately, but over time either the change defect rate or the average feature lead time will go up tremendously.

Both effects are highly undesired by the client. Again, it is our task to make these connections explicit and help our clients to choose the right values for the required metrics by providing them with options and helping them to understand the consequences of their choices.

Be aware that this is not a guarantee that the client will choose the option you prefer. They may have additional constraints that let them choose a different option (we will discuss some of these constraints later in this post).

Still, from my experience

  • having the discussion,
  • making quality goals explicit,
  • attaching metrics to them and
  • explaining options of choice in their language (time, money and risk vs. the degree of fulfillment of the respective quality)

lead to much better quality choices.

Resolving conflicting quality attributes

You also need to make quality explicit and measurable to resolve conflicting quality goals. E.g., security and usability often conflict with each other. Just think about 2FA (two-factor authentication): While it improves security, usability suffers as authentication becomes more tedious.

Or take availability versus resource utilization: Better fault tolerance improves availability, but usually requires more resources (e.g., for redundancy), which in turn compromises resource utilization because you need more resources for the same task.

This means you cannot maximize the fulfillment of all quality attributes at the same time, but you will need to find a reasonable trade-off for the given context. You also need to find out which of the attributes is considered more important if two or more of them point in different directions regarding the implementation of a requirement.

As a starting point, I often take a list with all quality attributes relevant for the project², explain what they mean and let the client order them from most important to least important. It is important to insist on a strict order.

I do not allow two or more attributes to be ranked as equally relevant. Allowing ties almost inevitably leads to an order where most attributes are “relevant” (70%-80% in position #1) and a few are “not as relevant” (20%-30% in position #2). In particular, the conflicting attributes tend to end up in the same position. Therefore, I always insist on a strict order.

This little exercise works best if all or most stakeholder groups are in the room as it immediately makes distinct quality needs explicit and negotiations start right away.

Usually, the group reaches an agreement after a while (typically around half an hour). Often they are surprised by the order they have found because they discover that the relative importance of the attributes is quite different from what they had thought upfront.

If it is not possible to get all stakeholders together at once, you may need several iterations of the exercise. Still, while being very simple, it is a powerful tool to drive a common understanding of quality. You still need to make the different quality goals measurable as described above. But you already have a good starting point. In particular, you have created a shared understanding of which quality attribute wins if two of them conflict regarding the implementation of a requirement.
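The outcome of the ordering exercise can be captured in a very simple structure. The sketch below is only an illustration: the attribute names and their order are made up, and the helper functions are hypothetical. It shows how a strict order without ties answers the question of which attribute prevails in a conflict.

```python
def build_ranking(ordered_attributes):
    """Map each attribute to its position (1 = most important).
    A list cannot express ties, so only duplicates need to be rejected."""
    if len(set(ordered_attributes)) != len(ordered_attributes):
        raise ValueError("each attribute may appear only once")
    return {name: pos for pos, name in enumerate(ordered_attributes, start=1)}


def prevailing_attribute(ranking, first, second):
    """Return the attribute that wins if the two conflict."""
    return first if ranking[first] < ranking[second] else second


# Hypothetical result of the exercise, most important attribute first.
ranking = build_ranking([
    "security",
    "availability",
    "usability",
    "maintainability",
    "resource utilization",
])

# With this order, a 2FA-style conflict is decided in favor of security.
print(prevailing_attribute(ranking, "usability", "security"))
```

Because every attribute has its own position, there is always exactly one answer, which is the reason for insisting on a strict order.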

Summing up

In this post we discussed why we need to make quality measurable to create the required shared understanding. We also discussed that many people do not understand how their demands affect quality and how we need to help them make the right decisions. And we discussed that we need to deal with the fact that quality attributes can conflict.

In the next post we discuss that quality lives in a context that affects the different quality attributes. Thus, quality cannot be seen in isolation, but always needs to be discussed in the encompassing context.

Stay tuned … ;)

  1. You might argue that you cannot measure quality. And you are right: If you can measure it, by definition it is a quantity, not a quality. Still, to enable expedient discussions about quality, I insist on making quality “measurable”. I am totally aware that we will not measure the quality itself, but only a proxy quantity that (hopefully) approximates the quality. Still, we need the proxy quantity to have an expedient basis for discussion and not to end up in clashing opinions (“This is high quality!”, “No, it is not!”). I just skip explaining the distinction between the quality itself and the proxy quantity we use to (usually only imperfectly) approximate the quality. From my experience, explaining this correct but non-trivial distinction leads to a hell of a lot of confusion among the people involved and can kill the whole exercise before it has started. Therefore, I simplify the statement to “quality must be made measurable”, which works a lot better in most situations, even though it is not correct in the strict sense … ;) ↩︎

  2. I tend to preselect the attributes I put on the list. The reason is twofold: First, it avoids rejection by the client. If you confront clients with a long list of attributes they do not understand, only for them to learn a few minutes later that most of the attributes are irrelevant in their setting, they (understandably) react with irritation. Second, it significantly shortens the time needed to reach an agreement if the list contains only a dozen or fewer attributes. Still, make sure that the list takes all stakeholder groups into account. E.g., do not forget to add attributes relevant for the developers (and make sure that the developers also have a say in the discussion). ↩︎