Software - It’s not what you think it is - Part 3
In the previous post of this blog series we discussed the broken abstraction dilemma, that abstractions help to create concise descriptions but take away degrees of freedom, and that breaking an abstraction usually means increasing the required size of the description by orders of magnitude.
In this post, we will first discuss what the broken abstraction dilemma means for AI solutions before moving on to the greenfield fallacy. Let us get started.
AI and the broken abstraction dilemma
Before discussing the effects of the broken abstraction dilemma on AI solutions, let us briefly recapitulate what the dilemma means. If we want to be able to describe our demands in a short and concise way, we need some kind of implicit or explicit high-level abstraction that enables us to specify things in a compact way. As soon as we break the abstraction, we face two problems:
- We work against the abstraction to get the deviating demands implemented, which can be hard because abstractions are usually not designed with violations in mind. As a result, trying to implement a demand that breaks an existing cohesive abstraction can become arbitrarily hard.
- Moving to a lower abstraction level usually requires one or two orders of magnitude more specification detail. Often, this results in brittle solutions that are hard to maintain and change.
In practice, most of the time we are confronted either with incomplete abstractions or with requirements that do not accept the limits of the abstractions. Both lead to the aforementioned situation: we need to bypass the existing abstraction and implement parts of the solution at a lower abstraction level – with all its consequences.
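To make the dilemma tangible, here is a minimal sketch in Python (all names and rules are invented for illustration): as long as the demands fit a hypothetical per-field validation abstraction, one line per field suffices. A single cross-field demand breaks that abstraction and forces us to write bespoke code outside of it.

```python
import re

# The abstraction: one line per field, as long as demands stay per-field.
RULES = {
    "age":   lambda v: isinstance(v, int) and 0 <= v <= 130,
    "email": lambda v: isinstance(v, str)
             and re.fullmatch(r"[^@]+@[^@]+\.[^@]+", v) is not None,
}

def validate(record):
    # Interpret the declarative rule table.
    return all(rule(record.get(field)) for field, rule in RULES.items())

# Breaking the abstraction: "age must be at least 18, unless the email
# domain is on an allowlist" is a cross-field rule the per-field RULES
# table cannot express. We now work *against* the abstraction with
# bespoke code living outside the rule table:
ALLOWLISTED_DOMAINS = {"partner.example"}  # hypothetical

def validate_with_exception(record):
    if not validate(record):
        return False
    domain = record["email"].rsplit("@", 1)[1]
    if domain not in ALLOWLISTED_DOMAINS and record["age"] < 18:
        return False
    return True
```

Note how the deviating demand does not simply add one more line to the rule table – it escapes the table entirely and has to be specified and maintained at a lower level from then on.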
AI to the rescue?
But now we have AI and everything will be different!
Well, AI solutions can certainly help to bridge the gap between the natural language that requirements are usually written in and a formal description, typically written as code (or as diagrams, which the traditional “business experts create software solutions on their own” approaches used).
But the effects of the broken abstraction dilemma do not magically disappear because the underlying problems persist. If business experts want to express themselves at the level they are used to, i.e., in terms of the usual requirements, the AI implicitly needs to apply a high-level abstraction. It will create code based on some implicit abstraction that lives at the level of business requirements.
As long as the business experts accept what the AI creates, everything is fine. But as soon as they want to have things differently, they need to break the implicit abstraction. This leaves them with two basic options:
- Either they try to change the abstraction implementation (the “code generator”), i.e., they try to make the AI always create different code. But such a change affects everything, and previously created code can no longer be reproduced.
- Or they try to change just the implementation of a specific part of the solution. Still, breaking the abstraction will usually require the experts to specify a whole lot more details, often at least one or two orders of magnitude more. In the worst case, almost all details need to be specified just a bit above the 3GL level, and the highly anticipated AI support quickly turns into a dreadful ordeal.
Overall, it can be said that the challenges of the broken abstraction dilemma persist in the face of AI-based code generation. AI just moves the burden of dealing with them from the software engineer to the business expert.
Creating new code is simple
Up to this point, we have only discussed what happens if the AI creates new code. We have not yet discussed what is going to happen if the AI needs to modify existing code.
Generating code from a concise set of requirements is nothing new. This is basically a solved problem, and surprisingly good solutions without any AI support have existed for quite some years now (just think of the latest generation of low code/no code solutions). The real challenge arises if you try to insert new code into an existing, highly complex web of interacting, often conflicting demands, expressed as code – which is the default scenario of almost all software development.
The greenfield fallacy
This brings us to the greenfield fallacy. In short, the greenfield fallacy is reasoning only about writing new code while neglecting what it means to modify existing code. This misconception may be rooted partially in the continuous comparison of software with physical goods. E.g., if you build a house or a car, you usually build it from scratch – each house and each car. But software is different. We rarely build stuff from scratch. Usually, we modify an existing codebase. Most of the time, we live in the brownfield.
Writing new code on a so-called greenfield is simple, almost trivial compared to evolving a huge codebase that grew over many years and represents the combination of all prior requirements – often interwoven, sometimes even contradictory.
The existing solution does not only comprise all prior requirements combined in a complex schema. It was usually also written by many people, each having their individual style of doing things. The requirements also tend to come from several people with different needs, priorities and points of view, and the order in which each person added which requirement also influences the shape of the solution and how it works.
Therefore, understanding the existing solution and knowing where and how to add the new requirement without breaking existing behavior is probably at least 95% of the work. Writing the code for the new requirement tends to be less than 5% of the work.
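A deliberately tiny, hypothetical sketch (all function names invented) of why the 95% dominates: writing the code for the new requirement is a one-line filter, while the real work is discovering which existing callers would break if the shared helper were changed instead.

```python
# A hypothetical, years-old shared helper: sums ledger entries.
def total(amounts):
    return round(sum(amounts), 2)

# New requirement: "reports should ignore refunds (negative entries)."
# Writing the code is the easy 5%: a one-line filter in a new function.
def report_total(amounts):
    return round(sum(a for a in amounts if a >= 0), 2)

# The hard 95% is discovering that an older feature also uses the
# totals -- and *relies* on refunds being included. Changing the shared
# helper total() instead of adding report_total() would have silently
# broken it:
def outstanding_balance(ledger):
    return total(ledger)  # refunds must reduce the balance here
```

The safe change only looks obvious in a three-function toy; in a real codebase, finding `outstanding_balance` among thousands of call sites is exactly the understanding work described above.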
Let the AI rewrite the solution from scratch every time
We could now consider letting the AI rewrite the code from scratch every time to avoid dealing with the existing code base. After all, an AI solution can create code so quickly that it may appear to be a tempting approach. However, this would require the business expert to completely describe the whole system from scratch, including the new or changed requirements.
All prior conversations with the AI solution that led to the existing solution would need to be preserved, and the business expert would need to understand them all – with all their details and their implicit and explicit dependencies – to figure out where and how to extend and modify them to achieve the desired result.
Up to now, the business experts come up with requirements in isolation, like “I want the solution to do XYZ. Everything else should work as it did before”. The second sentence is usually omitted but implicitly assumed.
Then, the software engineers have to figure out how to add XYZ while keeping everything else as it was. With an AI solution that turns requirements into code, this task would move from the design and implementation level to the requirements level.
The business experts would have to ensure that their requirements do not conflict, that they do not interlock, that they do not interact in unexpected ways, that adding a new requirement does not break an existing one, and so on.
Up to now, the software engineers have needed to sort this out. This is a lot of work – as written before, it is the majority of the design and implementation work, and probably one of the primary reasons why software development appears to be so “slow” from the outside.
With an AI solution, this work would move from software engineers working with code to business experts working with requirements.
We cannot shun the work. It needs to be done because computers cannot deal with ambiguity. They execute statement by statement in a deterministic fashion and if we make them execute the wrong statements, we do not get the desired results. Hence, we need to remove all ambiguity from our demands and turn them into something deterministic that matches our expectations, i.e., that the computer does what we expect it to do. 1
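A small illustration of that ambiguity, using an invented requirement: “give a 10% discount on orders over 100” maps to at least three different deterministic programs, and the computer will happily execute any of them.

```python
# One requirement sentence, three deterministic readings.

def discount_whole_strict(amount):
    # Reading 1: "over" means strictly greater than 100;
    # the discount applies to the whole amount.
    return amount * 0.9 if amount > 100 else amount

def discount_whole_inclusive(amount):
    # Reading 2: "over" includes exactly 100;
    # the discount applies to the whole amount.
    return amount * 0.9 if amount >= 100 else amount

def discount_excess_only(amount):
    # Reading 3: the discount applies only to the part above 100.
    return amount if amount <= 100 else 100 + (amount - 100) * 0.9
```

An order of exactly 100 is charged 100, 90 or 100 depending on the reading, and an order of 150 is charged 135 or 145. Someone has to decide which reading matches the expectations – the computer cannot.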
In the end, the business experts would need to learn how to describe the whole system in the right order (remember: the order of requirements matters regarding the resulting solution) in an unambiguous way using natural language.
As natural language is always ambiguous to a certain degree, this approach is most likely doomed to fail. Either the experts will need to add clarifications all along the way, or they will have to reduce the natural language to a more formalized, unambiguous language.
Well, we tend to call such formalized, unambiguous languages “programming languages”, and such approaches have been covered by 4GLs, MDA, DSLs and low code/no code environments for a long time already.
Additionally, the AI solution would need to be able to digest the whole natural language specification – which can become arbitrarily long – at once. Even the most powerful AI solutions, consuming unbelievable amounts of energy in training and in production compared to existing non-AI solutions, are limited regarding the amount of information they can take into account at once.
So, this appears to be a dead end for any non-trivial existing system – at least in the foreseeable future.
Let the AI reason about the existing code
This leaves the option that the AI solution needs to figure out on its own how to integrate the new requirement into an existing, complex codebase. This means it needs to understand and reason about code that it, a previous version of it, a different AI solution, a human, a code generator, or who- or whatever else created.
It needs to be intellectually capable of extracting all the existing business-level intents, demands and requirements from the existing codebase, resolving any potential conflicts, and deciding where and how best to add the code for the new requirements – without creating any new conflicts, without unintentionally changing any existing behavior, but resulting in a working system that behaves exactly in the expected way – based on a few sentences the business expert uttered to describe the new requirement.
Again, this is a very different story from writing code for the first time, and most likely your favorite AI solution will soon say:
“I’m sorry, Dave. I’m afraid I can’t do that.” 2
Let us not change the system after its creation
And what about simply not changing a system after it has been created, not continuously updating it as we currently tend to do? Just implementing it once and using it without changing it after its first release? Well, we are going to discuss this idea in depth in the next post of this series, where we will discuss the value preservation dilemma. A little spoiler upfront: It does not work with software.
Using the car metaphor in a useful way
Before wrapping up, let us briefly return to the car metaphor because even if the metaphor is broken in many ways, it helps to understand the problem better 3:
- We do not design a new car by uttering a few sentences. It takes hundreds or even thousands of pages to describe a car in a way that an AI could create a blueprint for it that exactly matches your demands. The same is true for software.
- We do not change a car design by uttering a few sentences. We have to check and double-check the change demand against the existing design to figure out how it affects the existing design, whether there are contradictions to be resolved, whether things interlock, whether the existing design or the new demand needs to be adapted, and so on, before we could give it to an AI to create a blueprint. The same is true for software.
Applied to software development, we could phrase this observation as:
Writing code is not the challenge. Uttering a demand is not the challenge.
Translating a demand into something that can be turned into code is.
This activity is called “design” and is the biggest part of the job of (most) software developers. And whenever we need to integrate a new demand into an existing solution – which is the default situation – the job becomes much, much bigger.
In this post, we discussed what the broken abstraction dilemma means for AI solutions. AI solutions can help to bridge the gap between the natural language that requirements are usually written in and a formal specification. However, they still need to rely on some implicit high-level abstractions to allow for concise specifications and to avoid sending the business experts into specification hell – just using natural language instead of a formal specification.
We also discussed the greenfield fallacy, that the actual challenge is not to create some code for a given requirement. The challenge is to integrate it into an existing, highly complex web of interacting, often conflicting demands, expressed as code. In practice, this is the main job of software development and up to now there is little evidence that AI solutions will be able to handle this task in a reliable way in the foreseeable future.
In the next post of this series, we will discuss the next misconception, the value preservation dilemma and its consequences. Stay tuned …
To be totally clear: A computer does not care if the order of statements it executes makes any sense or not. It will always simply execute the statements as they are provided. We care if the stuff the computer does makes sense. Or to be more precise: We care if the stuff the computer does matches our expectations. Thus, in the end it is our expectations we are struggling with, not the computers. They only do as they are told. But we expect computers to do things that match our expectations. Therefore, we need to take on all the hassle of translating our expectations into something the computer can turn into statements that match the expectations – and it does not matter if that something is code written by software engineers or requirements written by business experts. ↩︎
It is funny in an odd way that non-working comparisons with physical goods are used to come up with totally nonsensical software development approaches, but are neglected where they would be useful to detect misconceptions regarding software development. It feels a bit like applying double standards. ↩︎