The more a person limits himself, the more resourceful he becomes.
– Soren Kierkegaard
This post is part of a series of simple tips for improving the maintainability of code. The premise is that code becomes easier and simpler when constraints are put in place. In this post, we'll discuss the constraint of using empty values instead of `null`.
Non-nullable code
I call it my billion-dollar mistake…At that time, I was designing the first comprehensive type system for references in an object-oriented language. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler.
But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
– Tony Hoare, co-designer of ALGOL W.
Problems with null
`null` is easy to use, but in most cases it leads to more problems. Almost every developer has run into it at some point in their career. If you have used it, chances are pretty high that you've had to fix the dreaded `null`-pointer exception bug.
However, `null` can lead to another problem that affects the longevity of the code: a communication problem. What does `null` mean? Does it mean that the value is undefined? Does it mean that the value is empty? When building code for longevity, optimizing for maintainability and readability is essential.
A side effect of nullable code is that every developer who uses the value has to do extra work. When using a nullable value, a developer must:
- remember to guard against null-pointer exceptions.
- implicitly know what `null` means for that value.
Solution
The simple solution is to not use `null` at all and to use an empty value instead! If `null` is avoided, then getting a `null`-pointer exception becomes impossible!
An empty value is a representation of the empty state of that type. For example, an array without any items is the empty state of an array. Defaulting all arrays or lists to the empty state instead of `null` makes life easier for someone reading your code because it makes the code explicit. An empty list has one obvious meaning, whereas a list set to `null` leaves the reader guessing.
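As a rough illustration of the difference, here is a minimal TypeScript sketch. The `NullableOrder` and `Order` types and the helper functions are hypothetical names invented for this example; the point is only that the empty default removes the guard that the nullable version forces on every caller.

```typescript
// Hypothetical types for illustration only.
interface NullableOrder {
  id: number;
  items: string[] | null; // null might mean "not loaded", "no items", or "failed"
}

interface Order {
  id: number;
  items: string[]; // always an array; [] is the empty state
}

// With a nullable field, every caller has to remember this guard.
function countItemsNullable(order: NullableOrder): number {
  if (order.items === null) {
    return 0;
  }
  return order.items.length;
}

// With an empty default, one code path covers both cases.
function countItems(order: Order): number {
  return order.items.length;
}

// The factory defaults items to the empty array, so null never enters.
function createOrder(id: number, items: string[] = []): Order {
  return { id, items };
}

const emptyOrder = createOrder(1);
const filledOrder = createOrder(2, ["bolt", "nut"]);
```

Here `countItems(emptyOrder)` and `countItems(filledOrder)` go through exactly the same code, with no special case for the order that has nothing in it.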
So how does one represent the empty state for other types?
Use Optionals
The `Maybe` or `Optional` type is a wrapper type that represents the empty state. A `Maybe` has only two cases: a `Some<T>` and a `None<T>`. Wrapping a value in a `Maybe` forces developers to explicitly:
- Unwrap the value by checking for the `Some<T>`
- Declare the value as `None<T>` when it is empty
The value inside a `Some` is guaranteed to be non-null, while a `None` carries no value at all. Making the empty case explicit lets the developer proactively guard against `null`-pointer exceptions.
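One way to sketch a `Maybe` is as a discriminated union in TypeScript. This is an assumption-laden illustration rather than any particular library's API: the `kind` tag, the `some` helper, the `none` constant, and the `findJobName` lookup are all made up for this example, and `None` takes no type parameter here because it holds no value.

```typescript
// A minimal Maybe<T>; all names below are illustrative, not a library API.
type Some<T> = { readonly kind: "some"; readonly value: T };
type None = { readonly kind: "none" };
type Maybe<T> = Some<T> | None;

function some<T>(value: T): Maybe<T> {
  return { kind: "some", value };
}

const none: None = { kind: "none" };

// Hypothetical lookup table of job names keyed by id.
const jobNames: Record<number, string> = { 1: "PrintWorkOrders" };

// Declares the empty case explicitly instead of returning null.
function findJobName(id: number): Maybe<string> {
  const jobName: string | undefined = jobNames[id];
  return jobName === undefined ? none : some(jobName);
}

// The caller must check for the Some case before the value is reachable;
// accessing result.value without the check is a compile-time error.
function describeResult(result: Maybe<string>): string {
  if (result.kind === "some") {
    return `Found job: ${result.value}`;
  }
  return "No job found";
}

const found = describeResult(findJobName(1));
const missing = describeResult(findJobName(2));
```

The design choice worth noting is that the check is enforced by the compiler: there is no way to read `value` from a `Maybe<string>` until the `kind` has been narrowed to `"some"`.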
Use Interfaces
In object-oriented languages, an interface can serve the same purpose. With an interface in place, a type can have both an empty implementation and a non-empty implementation. However, this technique requires a bit more discipline, as the developer needs to remember to:
Program to an interface, not an implementation.
– Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley, 1994, p. 18)
Consider the following:
public interface IJob {}
public record PrintWorkOrders(string Name) : IJob;
public record Noop() : IJob;
If the code is programmed to the interface, then callers must check which implementation they received before acting on it.
public void ProcessWorkOrders(IJob job) {
    // Noop (or any other non-print job) exits here; nothing to process.
    if (job is not PrintWorkOrders printWorkOrders) return;
    // process the job
}
Again, this accomplishes the same goals as the `Maybe` type:
- Users explicitly check for the implementation type
- Empty types are explicit and type safe
Identity of Empty values
Below are some examples of empty values for different types. However, it should be noted that the representation of empty will depend on your use case.
| Type | Empty Value Name | Example Representation |
| --- | --- | --- |
| Array | Empty Array | `[]` |
| Hash Map | Empty Dictionary | `{}` |
| Procedure | No-op | `() => {}` |
| Function | Identity | `x => x` |
| Numbers | Identity | `0` or `1` (depends on operation) |
| Strings | Empty String | `""` |
| Promise | Fulfilled | `Promise.resolve(value)` |
An interesting observation is that in most of these cases, combining the empty value with an existing value yields the existing value unchanged. As an example, using the `Concat` function to concatenate an empty array with an array will yield the input array. Similarly, concatenating an empty string to a string will yield the input string. It could be that an empty value is the identity for a given operation. 🤔
In contrast, using `Concat` between a `null` and an array will yield a `null`-pointer exception.
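The identity observation can be checked with a short TypeScript sketch. The variable and function names are chosen for this example only, and `Array.prototype.concat` stands in for the `Concat` function mentioned above.

```typescript
// Concatenating the empty array yields an array equal to the input.
const parts: string[] = ["bolt", "nut"];
const concatenated: string[] = parts.concat([]);

// Concatenating the empty string yields the same string.
const greeting: string = "hello" + "";

// 0 is the identity for addition, 1 for multiplication. Using each as the
// initial value of a reduce also gives a sensible result for an empty list.
const sum = (values: number[]): number =>
  values.reduce((acc, v) => acc + v, 0);
const product = (values: number[]): number =>
  values.reduce((acc, v) => acc * v, 1);

// Composing with the identity function leaves the result unchanged.
const identity = <T>(x: T): T => x;
const double = (n: number): number => n * 2;
const doubleThenIdentity = (n: number): number => identity(double(n));
```

This is also why the table lists both `0` and `1` for numbers: which value counts as "empty" depends on the operation it is combined with.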
Conclusion
Using an empty value can seem like extra work compared to `null`, because a conditional check is still required in both cases. The main difference is that a `null` check is optional and easy to forget, whereas an empty value check is required. In summary, avoiding `null` helps to:
- prevent `null`-pointer exceptions
- increase code clarity and readability
If you don't create the problem, then there's no problem to solve!