The importance of comments - Part 2
This is the second post discussing the value of comments in code.
In the first post we discussed the problems of the widespread dogmatic “clean code” interpretation that you must not write any comments. We also discussed that we should not document the “how” as the code already does it.
In this post we will continue the discussion by looking at how documenting the “what” of code can create a lot of value for the readers and users of the code.
In the next and last post of this little series we will discuss some typical counter-arguments regarding the “what”-documentation plus some additional considerations on how to find the right balance. Additionally, we will discuss how to document the “why”. Finally, we will bring it all together.
But first we will discuss the value of documenting the “what” in this post.
Documenting the “What”
My personal rule of thumb is:
Use building block header comments to document the “what”.
(Building blocks being methods, functions, classes, modules, …)
A building block header comment should document “what” the building block (e.g., a class or method) does. It should not document “how” it does it. That part is documented in the code.
Note: For the sake of simplicity, I will focus on methods/functions as building blocks in the remainder of this section. Nevertheless, the considerations can also be applied to other building blocks like classes or modules – of course with a bit of readjustment of what exactly to document there. E.g., the header comment of a class should document what responsibility the class encapsulates (like in CRC cards) – which is BTW a great test to figure out if the class boundaries make sense in the first place.
But back to methods now:
The user of a method does not want to read the code of a method and all the methods it calls to understand what that method does. The user wants to see at a glance what it does, what parameters to put into it, what to get out of it, if it has any side effects, what else is important. The user wants to know about the contract of a method when using it and not having to derive its contract from the implementation.
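To make this a bit more tangible, here is a minimal sketch of what such a header comment could look like in Java (Javadoc). The class `InvoiceArchive` and its methods are hypothetical names invented purely for illustration; they do not stem from any project mentioned in this post. The comment states what the method does, what goes in, what comes out, and which side effect it has, without repeating how the code does it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, invented for illustration only.
public class InvoiceArchive {

    private final List<String> archivedIds = new ArrayList<>();

    /**
     * Marks the invoice with the given ID as archived.
     *
     * <p>Archiving is idempotent: archiving an already archived invoice
     * changes nothing and is not treated as an error.
     *
     * <p>Side effect: the ID is stored in this archive's in-memory state.
     * Nothing is written to disk or sent to any other system.
     *
     * @param invoiceId the ID of the invoice; must not be null or blank
     * @return {@code true} if the invoice was newly archived,
     *         {@code false} if it had been archived before
     * @throws IllegalArgumentException if {@code invoiceId} is null or blank
     */
    public boolean archive(String invoiceId) {
        if (invoiceId == null || invoiceId.trim().isEmpty()) {
            throw new IllegalArgumentException("invoiceId must not be null or blank");
        }
        if (archivedIds.contains(invoiceId)) {
            return false;
        }
        archivedIds.add(invoiceId);
        return true;
    }

    /**
     * Tells whether the invoice with the given ID has been archived.
     *
     * @param invoiceId the ID to look up; may be null
     * @return {@code true} if the ID has been archived, {@code false} otherwise
     *         (including for null)
     */
    public boolean isArchived(String invoiceId) {
        return archivedIds.contains(invoiceId);
    }
}
```

A caller who hovers over `archive` in the IDE learns from the tooltip that calling it twice is safe and that blank IDs are rejected, without opening the implementation.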
This becomes even more important if you want to implement a change: You identify the place where you need to implement the change. Someone else wrote the code. Or maybe you wrote it, but so long ago that you no longer remember what you did back then. Before changing the code, you first want to make sure that you understand what the code does, so that you apply your changes in the right place in the right way.
So you read through the code. There is a method that gets called. You think that you have an idea what it does, but you want to make sure that it actually does what you think it does. You place your mouse pointer over the method name and expect a short contract to pop up as a tooltip, explaining what the method does. You read it, maybe adjust your understanding of what the method does and then move on, reading through the rest of the method. After that you are sure about what the method does and confidently apply the change.
Now imagine that the tooltip is empty. No explanation of the “what” available. This means you have to jump into the called method, read and understand its code, jump into the methods that this method calls in turn, and so on.
A real-life example
I remember a situation when I wanted to fix a bug. It was part of some quite tricky document processing, doing some weird “magic”. I identified the place where I thought the problem was. As I had not written that part of the code, I went through it to make sure I understood it correctly before applying a change.
The code in the method called several other methods that had long clean-code-ish names, which still were not totally unambiguous. Hence, I tried to validate whether the methods really did what I thought they did. I moved my mouse pointer over the methods to see their contracts (Javadoc in that specific situation).
But the tooltips of the called methods were empty – no documentation of the “what” available. After a short sigh, I went into the implementation of the called methods. Three hours later I had gone through several thousand lines of code and several hundred test cases and still was not completely sure what the methods exactly did.
The names of the methods and their implementations did not feel in sync. Also, the calling context and their implementation did not seem to match. Additionally, I had the impression that the implementation of some of the called methods had some – still undetected – bugs. But I was not sure as a brief description of the “what” was missing.
Did I miss something the author had in mind when he wrote the code or was the code actually buggy? I had no clue as there was not a single comment line telling me “Well, this code does X” that would have helped to validate or disprove my impression.
As written, the method names did not help, either. Although they followed the clean code standard of using meaningful names (which is a good thing), the names could not replace a contract that makes explicit the subtleties I stumbled upon. Additionally, they felt out of sync, i.e., they felt like the implementation of the methods did not match their names anymore. 1
Also, the test cases did not help. Of course, they did not cover what I was not sure about. The aspects I was not sure about were not tested (which would be obvious if they were undetected bugs as I suspected). Writing an additional test case would also not have helped, as I could not judge if it was me or the author who missed something. Again, I did not have any indication if my impressions were justified or if I missed something relevant.
So, after the three hours I gave up. Applying a change to a code base I did not understand properly was not an option. Chances were a lot higher that I would have broken more stuff than I would have fixed if I had changed a code base I did not really understand.
In the specific situation, the author was on vacation for 3 more weeks and the other team members I asked also could not tell if the code was okay or buggy.
Overall, three hours of my lifetime wasted and a production bug fix postponed for three weeks, just because the author did not spend a few minutes to comment what these methods do.
It was BTW not the only time a problem like mine occurred in that project. I observed it multiple times with multiple team members affected. Most of the time, it ended with the originator of the code having to fix the bugs in it after other team members had failed to understand the code.
In the end, the time saved vs. time wasted ratio due to not documenting the “what” leaned massively towards time wasted in that project. And I observed similar effects in other projects that did not document the “what” of their code building blocks.
Effects of not documenting the “What”
In general, not documenting the “what” of your code typically results in one of two types of behavior if someone tries to understand a piece of code in order to change it (which is the primary reason developers open other people’s code in their IDEs):
- You try to understand the parts of the code that are called from the place where you intend to apply your change to make sure that you will not accidentally break something due to wrong assumptions about the parts called. As the “what” of the called parts is not documented, in the worst case, you first have to read through and understand half the code base (as it was in my situation) before you can safely apply the change – if possible at all.
- You realize that the “what” of the called parts is not documented and thus you decide to guess by method name or the like what the parts do when applying the change.
In the first case, the time you need to implement the change explodes, because you first laboriously need to reconstruct what the methods do from their implementations (the how).
In the second case, chances are high that you involuntarily introduce new bugs with your change because the methods behave differently than you thought – which then requires fixing the new bugs where you face the same dilemma again.
Both cases suck. In short, not documenting the “what” means:
- Repeatedly wasting lots of time of the readers/users of the code
- Provoking bugs if the code needs to be changed 2
Therefore, I recommend documenting the “what” of code. It saves the people who come back to the code to change it a lot of time and reduces the probability of introducing involuntary bugs.
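As an illustration of the second case, consider the following sketch. The method name `removeEmptyParagraphs` is a hypothetical example of a reasonably meaningful name that still leaves questions open: Does it modify the list passed in or return a new one? Is a paragraph consisting only of whitespace “empty”? What about null entries? The answers chosen here are assumptions made for this example; the point is merely that the header comment states them, so nobody has to guess.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical example class, invented for illustration only.
public class Paragraphs {

    /**
     * Returns a new list that contains all non-empty paragraphs of the given
     * list in their original order.
     *
     * <p>A paragraph counts as empty if it is null or consists of whitespace
     * only. The given list is not modified.
     *
     * @param paragraphs the paragraphs to filter; must not be null,
     *                   but may contain null entries
     * @return a new, mutable list without empty paragraphs; never null
     * @throws NullPointerException if {@code paragraphs} is null
     */
    public static List<String> removeEmptyParagraphs(List<String> paragraphs) {
        Objects.requireNonNull(paragraphs, "paragraphs must not be null");
        List<String> result = new ArrayList<>();
        for (String paragraph : paragraphs) {
            if (paragraph != null && !paragraph.trim().isEmpty()) {
                result.add(paragraph);
            }
        }
        return result;
    }
}
```

Someone guessing from the name alone might assume that the list is changed in place and ignore the return value, which would silently leave the empty paragraphs in the document. With the contract in the tooltip, that misunderstanding is much less likely.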
As a pleasant side effect, documenting the “what” helps you to reflect on your design and on whether the slicing of functionality is sensible. If you try to write down the contract of a method, you realize very quickly if the responsibility of the method makes sense or if your design is flawed. This way, documenting the “what” has a similar effect on design as TDD has (the resulting artifacts are different, though). 3
Of course, this means that your comments documenting the “what” need to be up-to-date. I will discuss this topic and some more in the next post.
In this post we discussed the value of documenting the “what” for the readers and users of the code, and how not documenting the “what” can lead to a lot of wasted time and frustration for them.
In the next and last post we will discuss some typical counter-arguments regarding documenting the “what” (like the inevitable “Do I need to document all my setters and getters now?”).
Additionally, we will discuss a few considerations on how to find the right balance when documenting the “what”. Finally, we will discuss how to document the “why” and sum it all up. Stay tuned … ;)
This is an interesting observation I made several times in larger code bases: Not only comments go out of sync (which is by far the most frequent argument used against comments). Method names (and sometimes variable names) also go out of sync if you do not carefully keep them in sync with their implementation. This happens especially often in contexts where the implementation changes a lot in the beginning before settling. ↩︎
A third option is that you recently wrote the code that you attempt to change now. In that situation you do not realize the problem because you still know what the code does. Still, coming back to the same code 6 months later might confront you with a completely different situation. ↩︎
This is neither an argument for nor against TDD. TDD done right is a good way to reflect on and adjust your design. Documenting the method/function contracts can do the same. I often write the contract for a method before writing the code, which helps me to ponder my design. Other people can do the same thing better using TDD. Or even another approach. The good thing about TDD is that you already have automated tests when you are done pondering your design. With documenting method contracts, you already have the contracts for the later readers/users of your code. Both artifacts have a value and IMO you need both in the end. ↩︎