Naming Things Clearly

Dec 17, 2019

Programming

By: Brandon Quakkelaar

Clean Code by Robert C. Martin is one of my favorite programming texts. It has a chapter called Meaningful Names wherein principles for the clear and clean naming of code elements are explained.

When we write code we should be considering the programmer who follows us. Sometimes that programmer is ourselves, returning to code we have not touched for months or years. The programmer working in previously written code will need to study it to recognize the intention and systems of the original author. Making the code as easy to understand as possible is the author’s responsibility. When writing professional code, resist urges to ‘be clever’ or to sacrifice clarity for brevity.

Use these good naming principles and the programmers who follow us will be grateful.

Use Names That Show Intent

Name things so that we can identify them. Consider the variable declared below.

DateTime z;

We can clearly see there is a variable. We can tell it’s a DateTime. Though, we can’t tell what it’s used for from just the name. A programmer may try to add clarity by adding a few more keystrokes to the name.

DateTime login;

This improves readability somewhat. But it still isn’t clear to the programmer what purpose this variable is intended to serve. The name could be further improved by adding more details.

DateTime lastLoginUtc;

This is a better name because it communicates several things about the intent of the variable. It is intended to hold a UTC date representing the last time there was a login. The original name, z, communicated none of that intent. The program using z would have been harder to decipher.

Here’s a slightly more complex example.

public bool GetResult(Dictionary<string, string> a, string x, string y)
{
  if (!a.ContainsKey(x)) 
  {
    return false;
  }

  return compare(y, a[x]);
}

This isn’t doing anything complex, but it’s not obvious what’s happening. See how much cleaner things get when meaningful names are applied.

public bool ValidatePassword(Dictionary<string, string> credentials, string username, string password)
{
  if (!credentials.ContainsKey(username)) 
  {
    return false;
  }

  return CheckHash(password, credentials[username]);
}

When names show intention we have a much better idea of the job this code performs.

Avoid False Clues

Programmers should not use names that have common meanings which could mislead the reader about what the variable actually represents. For example, ee is a bad variable name. Depending on the reader’s background they may be led to believe it stands for Expected Error, Employee Experience, or just Employee.

Similarly, if a variable refers to a collection of employees, avoid using employeeList unless it’s actually a list. In fact, often it’s preferable to not indicate data type in the variable’s name. A pluralized name, such as employees, communicates that it can hold multiple employee objects.
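As a minimal sketch of that point, the example below contrasts a name that encodes a data type with a plain plural. The Employee and Payroll types and the CountEmployees method are hypothetical, invented only for this illustration.

```csharp
using System.Collections.Generic;

// Hypothetical type used only for this illustration.
public class Employee
{
  public string Name { get; }

  public Employee(string name)
  {
    Name = name;
  }
}

public static class Payroll
{
  // Misleading: a name like "employeeList" would promise a List,
  // even when the value is actually a HashSet:
  //   HashSet<Employee> employeeList = new HashSet<Employee>();

  // The plural name "employees" stays accurate for any collection type.
  public static int CountEmployees(IEnumerable<Employee> employees)
  {
    int count = 0;
    foreach (Employee employee in employees)
    {
      count++;
    }
    return count;
  }
}
```

Because the parameter is named for what it holds rather than how it is stored, the caller can pass a list, a set, or an array without the name becoming a false clue.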

Avoid Noise

When attempting to write descriptive names, be careful to avoid meaningless noise. For example, words like “info” or “detail” can easily become noise. Consider the method name GetUserClaims(). Some may be inclined to name it GetUserClaimsInfo() or GetUserClaimsDetail() even though the extra words don’t add any new information for the reader.

Names Should Fit in Conversation

Abbreviating terms or using acronyms can produce names that are awkward to verbalize in regular conversation. Avoid difficult-to-pronounce names in favor of names that are easily said.

Prefer originalLoginDate over ogLgnDt. Also prefer EnableLongDescription over ENBLNGDESC.

Don’t Use Encodings

Names that use encodings often violate more than one of the previously recommended rules. They introduce an additional learning curve and force programmers to spend unnecessary effort deciphering them.

The most egregious offenses I’ve seen are names for database tables and columns in ERP software. From a real ERP we find this example, CO00101. The name provides next to no clues regarding the table’s purpose. The programmer is expected to know that it refers to the table representing “Document Attachment Master.” Not only does this example use encodings, it also violates the principle that names should show intent. It’s an unfortunate name on multiple levels.

Hungarian Notation is an encoding scheme that has historically been popular. But it is a violation of this avoid-encodings rule. In some pioneering programming languages it was necessary to encode information about the variable within the variable name itself using mnemonics. Today our languages and IDEs are advanced enough to have nearly eliminated any need for Hungarian Notation-style encodings.

Don’t Use Encodings: Exceptions to the Rule

Using prefixes for names of class members is still a commonly encountered encoding technique. Uncle Bob recommends avoiding member prefixes entirely. He writes:

…people quickly learn to ignore the prefix (or suffix) to see the meaningful part of the name. The more we read the code, the less we see the prefixes. Eventually the prefixes become unseen clutter and a marker of older code.

— Clean Code by Robert C. Martin (p. 24)

I haven’t yet decided whether I agree with avoiding member prefixes. Consider this class with member prefixes:

public class User
{
  private string _username;

  public User(string username)
  {
    _username = username;
  }
}

And compare it to a version without the member prefix:

public class User
{
  private string username;

  public User(string username)
  {
    this.username = username;
  }
}

I’m hard pressed to say one is superior to the other. Ultimately, I don’t think this is a hill worth dying on. I would object to a more involved member prefix, such as m_ instead of just _. But I think both versions of the User class are easy to read. However, I would advocate for either this.username or _username when assigning the field rather than username = username. While username = username does compile in languages like C#, the parameter hides the field inside the constructor, so the statement assigns the parameter to itself and never sets the field. It reads as if it does something useful, which violates the “avoid false clues” rule.
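To make the username = username case concrete, here is a small sketch. The class names BrokenUser and ClearUser and the Username property are hypothetical, added only so the difference can be observed. In C#, a constructor parameter hides a field of the same name, so the unqualified assignment targets the parameter and the compiler reports a self-assignment warning.

```csharp
public class BrokenUser
{
  private string username;

  public BrokenUser(string username)
  {
    // Both sides refer to the parameter, which hides the field.
    // The field is never assigned and keeps its default value (null).
    username = username;
  }

  public string Username => username;
}

public class ClearUser
{
  private string username;

  public ClearUser(string username)
  {
    // "this." makes it explicit that the field receives the parameter.
    this.username = username;
  }

  public string Username => username;
}
```

A _username field avoids the collision in the same way, since the field and the parameter no longer share a name.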

Another exception comes up when dealing with interfaces and concrete classes. If we need a configuration service with an interface and a concrete implementation, how should the two names differ? It’s common to use an I prefix to indicate an interface, giving IConfigurationService for the interface and ConfigurationService for the concrete class. But Uncle Bob prefers having the encoding on the concrete class, rather than on the interface. His reasoning is that the users of the configuration service interface have no need to know that they’ve been given an interface. So, he’d name them something like ConfigurationService and ConfigurationServiceImp.
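The two conventions can be sketched side by side. The namespaces, the GetValue method, and the dictionary-backed implementation are assumptions made for this illustration; they only exist so both naming styles can live in one file.

```csharp
using System.Collections.Generic;

// Convention A: the encoding (an "I" prefix) sits on the interface.
namespace PrefixedInterface
{
  public interface IConfigurationService
  {
    string GetValue(string key);
  }

  public class ConfigurationService : IConfigurationService
  {
    private readonly Dictionary<string, string> values;

    public ConfigurationService(Dictionary<string, string> values)
    {
      this.values = values;
    }

    public string GetValue(string key)
    {
      return values[key];
    }
  }
}

// Convention B: the interface gets the clean name and the
// concrete class carries the encoding (an "Imp" suffix).
namespace SuffixedImplementation
{
  public interface ConfigurationService
  {
    string GetValue(string key);
  }

  public class ConfigurationServiceImp : ConfigurationService
  {
    private readonly Dictionary<string, string> values;

    public ConfigurationServiceImp(Dictionary<string, string> values)
    {
      this.values = values;
    }

    public string GetValue(string key)
    {
      return values[key];
    }
  }
}
```

Under convention B, code that depends on the service only ever writes ConfigurationService, and the Imp suffix appears once, at the place where the object is constructed.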

Use Nouns For Class Names

This is a straightforward rule. Classes and objects should be thought of as nouns. So, they should be named that way. They shouldn’t have verbs.

Use Verbs For Method Names

Here’s another straightforward rule. Methods are “doers” so their name should have a verb for what they are doing. GetCredentials, ValidatePassword, and AddUser are all good method names.

Avoid Inside Jokes, References, and General “Cleverness”

As fun as it might be, avoid using “easter eggs” or cultural references in your code. Instead, just be direct and write what you mean. Don’t require the reader to resist a distraction and map a label to a concept while they’re trying to understand your code.

Exceptions to Avoiding Jokes and References

There is a very specific exception to this rule. Cultural references and inside jokes must be encouraged whenever you’re writing in the ArnoldC language.

Pick One Word Per Concept

Some words could be used interchangeably to mean the same thing. In such scenarios settle on one word for each abstract concept. For example, using Configuration and Settings to represent the same concept could be confusing. Likewise, avoid using Fetch alongside Get. Having consistent terminology is very useful for teams working together. I’ve been part of many conversations over the difference between a UserManager class and a UserService class. The important thing is to identify the concept, and settle on a single word to represent it.

Don’t Pun

Don’t have one word mean more than one thing. Uncle Bob highlights that doing so is essentially making a pun. The example he gives is Add. A codebase may have many Add methods in it, and they all create a new value by adding or concatenating two values. A programmer may want to add a value to a collection, and for consistency’s sake use the word Add as the method name, even though the operation being performed is different than every other use of the word. Insert or Append would be better choices for such a method.
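A rough sketch of the distinction, using hypothetical Money and Invoice types invented for this example: Add keeps its meaning of combining two values into a new one, while putting a value into a collection gets its own word.

```csharp
using System.Collections.Generic;

public static class Money
{
  // "Add" here means: combine two values and return a new value.
  public static decimal Add(decimal left, decimal right)
  {
    return left + right;
  }
}

public class Invoice
{
  private readonly List<decimal> lineAmounts = new List<decimal>();

  // Putting a value into a collection is a different operation,
  // so this type's own API uses a different word for it.
  public void AppendLine(decimal amount)
  {
    // List<T>.Add is the framework's name, not a choice made by this code.
    lineAmounts.Add(amount);
  }

  public decimal Total()
  {
    decimal total = 0m;
    foreach (decimal amount in lineAmounts)
    {
      total = Money.Add(total, amount);
    }
    return total;
  }
}
```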

Use Solution Domain Names

It’s okay to use computer science terms, algorithm names, and so on, in our code. Our readers will be other programmers, so it’s alright to expect them to know these things.

Use Problem Domain Names

Don’t invent new terms to represent concepts in the problem domain. Instead use the terms that the problem domain experts use. This will aid in communication, and when someone maintains the code after you, they’ll be able to ask questions using terminology that the problem domain expert will understand.

Add Meaningful Context

Names usually require some amount of context to be meaningful. The context can be provided by well-named classes inside of well-named namespaces. For example, a State variable might not be immediately obvious as part of an address. But if it’s a property on an Address class then the meaning of Address.State becomes much clearer.
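A minimal sketch of the Address example follows. The properties other than State are assumptions added to round out the class.

```csharp
public class Address
{
  public string Street { get; }
  public string City { get; }

  // On its own, "State" could mean a status or a workflow step.
  // As a member of Address it clearly means a region of a country.
  public string State { get; }

  public string PostalCode { get; }

  public Address(string street, string city, string state, string postalCode)
  {
    Street = street;
    City = city;
    State = state;
    PostalCode = postalCode;
  }
}
```

The property name stays short because the enclosing class supplies the context; there is no need for a name like AddressState.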

Don’t Add Gratuitous Context

Most of the rules explored so far tend to lead us toward longer, more descriptive names. But a short name that is clear is always better than a longer name that is equally clear.

Gratuitous Context: Comments

I submit that comments are a code smell that indicates poorly named code. Comments become stale easily because they’re not subject to a compiler nor an interpreter. Following the previous rules will help reduce the need for comments because the code’s intent and readability will be improved enough to eliminate the need for many of them.

A general rule I like is to limit my comments to “why’s”. Comments explaining what is happening or how it’s happening are usually vestigial because the code itself should be communicating the “what” and “how”. However, the code sometimes does a poor job of explaining the “why”. If you identify a section of your code that may be confusing because it’s not obvious why it needs to be there, double check whether it’s possible to refactor to make the code clearer. If “why” is still a question after that, a comment may be appropriate, though it should be a very rare occurrence.
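Here is one way a “why” comment might look in practice. The SessionPolicy class and the 12-hour business rule are hypothetical, made up purely for illustration. The names carry the “what” and “how”, and the single comment records a reason the code cannot express.

```csharp
using System;

public static class SessionPolicy
{
  // Why 12 hours: (hypothetical rule) staff must re-authenticate once per
  // shift, and the longest shift is 12 hours.
  private static readonly TimeSpan MaxSessionAge = TimeSpan.FromHours(12);

  public static bool IsSessionExpired(DateTime lastLoginUtc, DateTime nowUtc)
  {
    return nowUtc - lastLoginUtc > MaxSessionAge;
  }
}
```

A comment such as “returns true if the session is expired” would add nothing here, because the method name already says it.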

Takeaways

Solving problems with code is the programmer’s primary concern. But, a professional programmer also bears the responsibility of writing code that can be understood by others. If he fails at that task his code will rot, and eventually become too cumbersome to improve and use. Strive to infuse these previous guidelines into your code writing process. The results will be programs that are crafted together with a clarity that you, and those who follow you, will appreciate.

Our goal, as authors, is to make our code as easy as possible to understand. We want our code to be a quick skim, not an intense study. We want to use the popular paperback model whereby the author is responsible for making himself clear and not the academic model where it is the scholar’s job to dig the meaning out of the paper.

— Clean Code by Robert C. Martin (p. 27)