How about doing something paradoxical, like returning something or “nothing” from a function at the same time? What is “nothing” in C++?
Well, void represents the unit type and carries no information. But how can we return void or some other type? And even if we can achieve this, what would be the semantics of such a return type? What is it good for?
The word “or” represents, in a very general sense, some kind of branching.
The same is true for a boolean or any pointer type. The key observation here is that a boolean or pointer value always has one and the same type, whichever branch it stands for.
What we need is something which is either the unit type or some arbitrary other type.
Get the player name!
Let us start with a discussion of how to find an entry in a std::map<int, std::string>, where the key is the shirt number and the value is the player name of our football club.
A simple lookup for the player name of shirt number 8 could be:
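A minimal sketch of such a two-step lookup follows. The function name playerNameTwoLookups, the fallback string and the sample roster are made up for illustration, and the preprocessor branch is only there so that the sketch also builds with a pre-C++20 compiler.

```cpp
#include <map>
#include <string>

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

// Query first, retrieve second: two separate lookups in the map.
std::string playerNameTwoLookups(const std::map<int, std::string>& nameByShirtNo,
                                 int shirtNumber) {
#if __cplusplus >= 202002L
    const bool present = nameByShirtNo.contains(shirtNumber);    // C++20
#else
    const bool present = nameByShirtNo.count(shirtNumber) != 0;  // pre-C++20 fallback
#endif
    if (present) {
        // Second lookup. Without the check above, at() throws on a missing key.
        return nameByShirtNo.at(shirtNumber);
    }
    return "<unknown>";
}
```

Calling playerNameTwoLookups(exampleRoster(), 8) walks the map twice: once for the presence check and once inside at().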
Here are some facts and my review of this kind of approach:
- It won’t compile without a C++20 compiler, since the contains function is only available from C++20 onwards. I use it for illustration purposes only 🙂
- Querying if something is present and then retrieving it involves two lookups. This might be expensive, especially when the container holds a lot of data.
- There is a lot of boilerplate code, like the check if the data is present and the retrieval later.
- It is inherently error-prone: the if check will eventually be forgotten, so the call to nameByShirtNo.at(shirtNumber) will throw, and bad things will happen…
But wait, if at() throws, why not surround it with a try-catch block to get rid of the query altogether?
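As a sketch of that idea, assuming the same kind of map as before (the function name playerNameOrFallback, the fallback string and the sample roster are again only illustrative):

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

// One lookup, but a miss is reported through an exception.
std::string playerNameOrFallback(const std::map<int, std::string>& nameByShirtNo,
                                 int shirtNumber) {
    try {
        return nameByShirtNo.at(shirtNumber);  // throws std::out_of_range on a miss
    } catch (const std::out_of_range&) {
        return "<unknown>";
    }
}
```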
Is this an improvement?
- The runtime cost of a thrown exception is far bigger than that of a simple if check. Even if we assume a zero-cost model for when the exception is not thrown, our costs then depend on how often we query non-existing data. This implies we cannot predict the runtime overhead.
- A long time ago in a galaxy far, far away… I have been told that exceptions are there for exceptional circumstances. It is quite natural that a search now and then finds nothing. The exception mechanism is just wrong for expressing this fact.
- We replaced the if check with an exception, which is kind of a goto statement.
- The boilerplate is still there, with all the keywords needed for exception handling.
Using the STL more seriously
Hold on, you might think by now. What the heck is this about? No one in their right mind will search for a value in a map like this. Why not use the find method of std::map directly?
And you are right! The two former approaches should not be used, but guess what? I’ve seen these over and over again in the wild… and you probably have too. 🙂
Ok, let’s get serious now and do this like experienced STL users do:
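A sketch of the iterator-based lookup, under the same assumptions as before (playerNameViaFind, the fallback string and the roster are illustrative names and data):

```cpp
#include <map>
#include <string>

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

// Query and retrieval in a single lookup via find().
std::string playerNameViaFind(const std::map<int, std::string>& nameByShirtNo,
                              int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) {
        // The iterator refers to a std::pair: first is the key, second the name.
        return it->second;
    }
    return "<unknown>";
}
```

Dropping the comparison against end() still compiles, which is exactly the weakness discussed next.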
Once again, what can we say about this very common approach?
- By calling find, we use the usual way of combining the query and the retrieval of the data in one go. We get back an iterator, and our runtime cost is predictably only one lookup.
- The code is as error-prone as the previous ones! What if we forget to check the iterator? The more of these checks we have, the higher the chances of crashing.
- This code is not very expressive either when it comes to modeling the outcome of a lookup. Why do we always have to bother with low-level stuff like iterators? And what is this second thingy again? Ah yes, this is for accessing the value stored in the returned std::pair.
These are bold claims, since a big portion of code in C++ is written in this vein.
But things change and/or improve and I will argue that in this case we can improve.
With C++20 ahead, the STL makers have a similar vision in mind, especially when you look at the very exciting ranges library. Things will become dramatically more functional, and less iterator-like. So, if you are a functional fanboy like me, check out this masterpiece from Eric Niebler.
Coming back to our example, with C++17 std::optional we can improve on the expressiveness of our player name lookup. For that, we first create a little wrapper which takes our map and the shirt number and returns an optional player name:
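Such a wrapper could look roughly like this. The function name maybePlayerName comes from the text, while the exact signature and the sample roster are assumptions of this sketch.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

// Returns the player name if the shirt number is known, otherwise "nothing".
std::optional<std::string> maybePlayerName(
    const std::map<int, std::string>& nameByShirtNo, int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) {
        return std::optional<std::string>{it->second};  // wrap the found name
    }
    return std::nullopt;  // no entry for this shirt number
}
```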
Since find is implemented in terms of returning an iterator, we have to work with iterators here too.
If the iterator points to a valid entry, we return a std::string wrapped into a std::optional.
If no shirt number was found, we return the special value std::nullopt, which represents “nothing”. This behavior is what we asked for in the beginning: we need a type constructor which can hold any type or nothing.
Client code will now call maybePlayerName instead of find:
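A sketch of such client code, assuming a maybePlayerName wrapper that returns std::optional<std::string> as described above (describePlayer and its messages are hypothetical):

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

std::optional<std::string> maybePlayerName(
    const std::map<int, std::string>& nameByShirtNo, int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) {
        return std::optional<std::string>{it->second};
    }
    return std::nullopt;
}

// Client code: the check is needed, but nothing forces us to write it.
std::string describePlayer(const std::map<int, std::string>& nameByShirtNo,
                           int shirtNumber) {
    const std::optional<std::string> playerName =
        maybePlayerName(nameByShirtNo, shirtNumber);
    if (playerName.has_value()) {
        return "Player: " + *playerName;  // pointer-like, unchecked access
    }
    return "No player with shirt number " + std::to_string(shirtNumber);
}
```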
This search looks similar to the one where we directly worked with iterators. First we receive an optional playerName and then check with has_value() if it contains a valid value. If so, we can extract it with a pointer-like syntax. Our code review for this new approach might be:
- The client calls maybePlayerName, which in turn does one lookup. Again we have no performance hit and a deterministic runtime.
- With the help of std::optional we gain far better expressiveness. We can work directly with an optional value that clearly represents the result of a search.
- This code suffers from the exact same problem as all other attempts so far: the check for the optional value is needed, but not enforced. We can and will forget it in some place, and our much valued code will crash in the face of the customer.
We could stop here and argue that there are way more subtle ways to shoot yourself in the foot with C++, and that this is just the way it is with a powerful language. I will try to make the case that it does not have to be like that.
I am a really lazy programmer. Any boilerplate code bothers me, and I always look for automation and for help from the type system, so that it does not let me do stupid things.
Here, the boilerplate is this nasty, tedious and error-prone check for the existence of the optional value. The good news is that there are solutions out there for C++ which have been used for decades in other languages and frameworks.
The functional approach
I could now start and tell you something about sum types, functors and monads.
But first, this post would never end, and second, I will write a blog post about monads anyway. That way I can embarrass myself the most, see the prominent fallacy.
For this post, though, I will explain the usage of optionals in a less formal and more technical way.
Simon Brand has implemented tl::optional, which removes the need to check optional values by hand, and I have forked it here.
The only difference is that I removed all unsafe operators like *, -> or get(), which can lead to crashes. Simon has kept them for legacy reasons, I guess.
Let us first rewrite the maybePlayerName function such that it now uses tl::optional:
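To keep the following sketch self-contained, it does not include the fork itself. Instead it uses a tiny class template called Maybe as a hypothetical stand-in that, like the fork, offers no unchecked accessors. Maybe and its valueOr member are assumptions of this sketch and not the library's API.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the forked tl::optional: there is no
// operator*, operator-> or other unchecked getter.
template <typename T>
class Maybe {
public:
    Maybe() = default;                              // holds "nothing"
    Maybe(T value) : storage_(std::move(value)) {}  // implicit conversion from T

    // Safe access: the caller has to supply a fallback (assumed helper).
    T valueOr(T fallback) const { return storage_.value_or(std::move(fallback)); }

private:
    std::optional<T> storage_;
};

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

Maybe<std::string> maybePlayerName(const std::map<int, std::string>& nameByShirtNo,
                                   int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) {
        return it->second;  // the string converts implicitly into the optional
    }
    return {};              // "nothing"
}
```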
That is almost the same code as with std::optional, except that on a successful lookup we return the string directly. This works through an implicit conversion and would also be possible with std::optional.
Nothing fancy so far, but here is the client code:
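A sketch of what such client code could look like. Maybe is again a hypothetical stand-in for the forked tl::optional; its map and orElse members only mimic the behavior described in this post. To keep the result observable, the lambdas write into a string instead of logging to the console.

```cpp
#include <map>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the forked tl::optional (no unchecked getters).
template <typename T>
class Maybe {
public:
    Maybe() = default;                              // holds "nothing"
    Maybe(T value) : storage_(std::move(value)) {}  // implicit conversion from T

    // If a value is present, apply f to it and wrap the result in a new Maybe.
    template <typename F>
    auto map(F&& f) const -> Maybe<std::invoke_result_t<F, const T&>> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }

    // If no value is present, call f. Either way, hand back this Maybe unchanged.
    template <typename F>
    Maybe orElse(F&& f) const {
        if (!storage_) std::forward<F>(f)();
        return *this;
    }

private:
    std::optional<T> storage_;
};

// Hypothetical sample data: shirt number -> player name.
std::map<int, std::string> exampleRoster() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

Maybe<std::string> maybePlayerName(const std::map<int, std::string>& nameByShirtNo,
                                   int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) return it->second;
    return {};
}

// Client code: no if, no has_value(), no dereferencing.
std::string lookupAndReport(const std::map<int, std::string>& nameByShirtNo,
                            int shirtNumber) {
    std::string report;
    maybePlayerName(nameByShirtNo, shirtNumber)
        .map([&report](const std::string& name) {
            report = "Found player " + name;
            return static_cast<int>(name.size());
        })
        .orElse([&report] { report = "No player found"; });
    return report;
}
```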
Even without knowing what this code does, one thing we can see is that no error-prone checks are present. Let’s extract two simple functions from the lambdas:
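With named handlers the chain might read as follows. The names logPlayerName, logNotFound and lookupAndLog are invented for this sketch, and Maybe is a hypothetical stand-in for the forked tl::optional, with an assumed valueOr helper so the result can be inspected safely.

```cpp
#include <iostream>
#include <map>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the forked tl::optional (no unchecked getters).
template <typename T>
class Maybe {
public:
    Maybe() = default;
    Maybe(T value) : storage_(std::move(value)) {}

    // If a value is present, apply f to it and wrap the result in a new Maybe.
    template <typename F>
    auto map(F&& f) const -> Maybe<std::invoke_result_t<F, const T&>> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }

    // If no value is present, call f. Either way, hand back this Maybe unchanged.
    template <typename F>
    Maybe orElse(F&& f) const {
        if (!storage_) std::forward<F>(f)();
        return *this;
    }

    // Safe access with a caller-supplied fallback (assumed helper).
    T valueOr(T fallback) const { return storage_.value_or(std::move(fallback)); }

private:
    std::optional<T> storage_;
};

std::map<int, std::string> exampleRoster() {  // hypothetical sample data
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}

Maybe<std::string> maybePlayerName(const std::map<int, std::string>& nameByShirtNo,
                                   int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) return it->second;
    return {};
}

// Handler for the "value present" path: logs and returns the name length.
int logPlayerName(const std::string& name) {
    std::cout << "Found player " << name << '\n';
    return static_cast<int>(name.size());
}

// Handler for the "nothing" path: takes nothing, returns nothing.
void logNotFound() { std::cout << "No player found\n"; }

Maybe<int> lookupAndLog(const std::map<int, std::string>& nameByShirtNo,
                        int shirtNumber) {
    return maybePlayerName(nameByShirtNo, shirtNumber)
        .map(logPlayerName)
        .orElse(logNotFound);
}
```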
First we call maybePlayerName and get a tl::optional back. With map and orElse we cover the two possible outcomes.
If the optional value exists, map picks it up and applies a handler (callback) to it.
Otherwise, the optional contains no value, and we provide another handler with orElse, which takes nothing and just logs that we did not find a player name.
For this “.” notation to work, map and orElse each return an optional again. Some of you might notice that this kind of syntax resembles the builder pattern (although it is about monoids).
With that, we can chain even more operations on an optional type, which will look like piping on the console (to have another view on it).
In the above diagram you see the two paths and the implicit checks for them. Map is a higher-order function which applies a function to the optional value and wraps the result (above an int) into an optional again.
The function orElse is also a higher-order function. Its handler takes void (since there is no value) and in this case also returns void. After that, the original optional, here tl::optional, is returned. This is useful for side effects like logging.
Before we see more applications of tl::optional, let us recap and review once more:
- The performance is the same as with iterators. Nothing new here.
- The expressiveness is at least as high as with std::optional. I would argue that the dot notation even increases the readability compared to having an explicit if check.
- And finally, this approach will never crash because of unchecked access. It is simply impossible to call some kind of getter on tl::optional, since I removed all of them.
What else could we hope for here? Some of you might say, “I can’t read this, readability is bad.”
I think we often confuse readability with familiarity. If someone has never seen a state machine before, chances are high that they will perceive states, transitions and guards as unreadable.
They would argue that some simple if statements would do the same and are more readable.
Someone with a lot of experience with state machines will probably say the exact opposite: that too many deeply nested if statements hurt readability.
We have all had situations where something clicked in our mind, and from then on, once alien constructs became very natural to read and apply.
Composability
The discussion above was more or less only a teaser. The changes to C++ in recent years have been dramatic, and we have to become aware of the functional concepts that are slowly becoming first-class citizens in the language.
As for tl::optional, there is so much more we could explore, but the space of one blog post is just too small. Still, I really want to showcase that the real power of such type constructors lies in their composability. So let’s expand our running example a little.
How do we find out if, for a given shirt number, the respective player is older than 30 (yeah, I am a master of contrived examples)?
For that we introduce another map which maps each player name to the player’s age, together with a function in the same spirit as maybePlayerName. Additionally, we introduce a simple predicate which checks if the player is older than a given threshold. We start this task with std::optional.
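A sketch of the std::optional variant follows. maybePlayerName and maybePlayerAge are named in the text, whereas isOlderThan, reportOlderThanThirty, the messages and the sample data are assumptions made for illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical sample data.
std::map<int, std::string> exampleNames() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}
std::map<std::string, int> exampleAges() {
    return {{"Miller", 34}, {"Garcia", 24}};  // no age recorded for Becker
}

std::optional<std::string> maybePlayerName(
    const std::map<int, std::string>& nameByShirtNo, int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) return it->second;
    return std::nullopt;
}

std::optional<int> maybePlayerAge(const std::map<std::string, int>& ageByName,
                                  const std::string& name) {
    const auto it = ageByName.find(name);
    if (it != ageByName.end()) return it->second;
    return std::nullopt;
}

// Simple predicate with a configurable threshold.
bool isOlderThan(int age, int threshold) { return age > threshold; }

std::string reportOlderThanThirty(const std::map<int, std::string>& nameByShirtNo,
                                  const std::map<std::string, int>& ageByName,
                                  int shirtNumber) {
    const auto name = maybePlayerName(nameByShirtNo, shirtNumber);
    if (name.has_value()) {                    // check 1: needed for *name
        const auto age = maybePlayerAge(ageByName, *name);
        if (age.has_value()) {                 // check 2: needed for *age
            if (isOlderThan(*age, 30)) {       // the actual question
                return *name + " is older than 30";
            }
        }
    }
    return "search failed";
}
```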
This should be pretty straightforward by now, but my point is not the nested if statements, since you can write it without nesting (as an exercise maybe?). The pain stems from the fact that we need the first two checks, since we would crash if they were missing.
And now we compare this with tl::optional, after adapting maybePlayerName and maybePlayerAge to it:
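The composed version could be sketched like this. Maybe is once more a hypothetical stand-in for the forked tl::optional; its flatMap, filter, map and orElse members follow the behavior described in this post, not a verified API. The report string and the sample data are also assumptions.

```cpp
#include <map>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the forked tl::optional (no unchecked getters).
template <typename T>
class Maybe {
public:
    Maybe() = default;
    Maybe(T value) : storage_(std::move(value)) {}

    // Apply f to the value and wrap the result in a new Maybe.
    template <typename F>
    auto map(F&& f) const -> Maybe<std::invoke_result_t<F, const T&>> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }
    // Like map, but f already returns a Maybe, so no extra layer is added.
    template <typename F>
    auto flatMap(F&& f) const -> std::invoke_result_t<F, const T&> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }
    // Keep the value only if the predicate holds, otherwise yield "nothing".
    template <typename P>
    Maybe filter(P&& keep) const {
        if (storage_ && std::forward<P>(keep)(*storage_)) return *this;
        return {};
    }
    // Call f only when there is no value, then return this Maybe unchanged.
    template <typename F>
    Maybe orElse(F&& f) const {
        if (!storage_) std::forward<F>(f)();
        return *this;
    }

private:
    std::optional<T> storage_;
};

// Hypothetical sample data.
std::map<int, std::string> exampleNames() {
    return {{1, "Becker"}, {8, "Miller"}, {10, "Garcia"}};
}
std::map<std::string, int> exampleAges() {
    return {{"Miller", 34}, {"Garcia", 24}};  // no age recorded for Becker
}

Maybe<std::string> maybePlayerName(const std::map<int, std::string>& nameByShirtNo,
                                   int shirtNumber) {
    const auto it = nameByShirtNo.find(shirtNumber);
    if (it != nameByShirtNo.end()) return it->second;
    return {};
}

Maybe<int> maybePlayerAge(const std::map<std::string, int>& ageByName,
                          const std::string& name) {
    const auto it = ageByName.find(name);
    if (it != ageByName.end()) return it->second;
    return {};
}

bool isOlderThanThirty(int age) { return age > 30; }

std::string reportOlderThanThirty(const std::map<int, std::string>& nameByShirtNo,
                                  const std::map<std::string, int>& ageByName,
                                  int shirtNumber) {
    std::string report;
    maybePlayerName(nameByShirtNo, shirtNumber)
        .flatMap([&ageByName](const std::string& name) {
            return maybePlayerAge(ageByName, name);
        })
        .filter(isOlderThanThirty)
        .map([&report](int age) {
            report = "search success: age " + std::to_string(age);
            return age;
        })
        .orElse([&report] { report = "search failed"; });
    return report;
}
```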
I think the declarative nature of this code is quite obvious. All checks are done implicitly, and our customers are happy that we no longer crash because of silly oversights.
First we get the optional player name as usual. Then we call flatMap and provide a handler for the case that the player name is present. FlatMap is almost the same as map, but remember what happens when calling map? Map will always wrap the handler result into a new optional.
If we called map here instead of flatMap, the resulting type would be tl::optional<tl::optional<int>>, which is not what we want. Therefore flatMap exists, which cuts off one layer of optional to yield tl::optional<int>.
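The difference in the resulting types can be checked at compile time. The sketch below again uses a hypothetical Maybe stand-in rather than the fork itself, and a hard-coded, purely illustrative maybePlayerAge handler.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the forked tl::optional (no unchecked getters).
template <typename T>
class Maybe {
public:
    Maybe() = default;
    Maybe(T value) : storage_(std::move(value)) {}

    // Always wraps the handler result into a new Maybe.
    template <typename F>
    auto map(F&& f) const -> Maybe<std::invoke_result_t<F, const T&>> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }
    // Returns the handler's Maybe as it is, without another layer.
    template <typename F>
    auto flatMap(F&& f) const -> std::invoke_result_t<F, const T&> {
        if (storage_) return std::forward<F>(f)(*storage_);
        return {};
    }
    // Safe access with a caller-supplied fallback (assumed helper).
    T valueOr(T fallback) const { return storage_.value_or(std::move(fallback)); }

private:
    std::optional<T> storage_;
};

// Illustrative handler that itself returns an optional result.
Maybe<int> maybePlayerAge(const std::string& name) {
    if (name == "Miller") return 34;
    return {};
}

using ViaMap = decltype(std::declval<Maybe<std::string>>().map(maybePlayerAge));
using ViaFlatMap = decltype(std::declval<Maybe<std::string>>().flatMap(maybePlayerAge));

static_assert(std::is_same_v<ViaMap, Maybe<Maybe<int>>>, "map nests the optionals");
static_assert(std::is_same_v<ViaFlatMap, Maybe<int>>, "flatMap keeps a single layer");
```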
Next, we take the optional age returned by maybePlayerAge and apply the filter function isOlderThanThirty. If the age is above 30, filter returns the tl::optional untouched.
If the player is younger, filter returns tl::nullopt instead, which represents nothing.
And finally, we take the optional age and print search success if the age passed the previous filter; otherwise orElse is called, because one of the previous handlers returned nothing.
I would argue that this kind of composition is a nice alternative to the solution with std::optional. Once we understand some very general and mighty concepts, code like this is no longer a mystery. I would even go further and claim that learning some functional concepts is easier than remembering thousands of OOP design patterns, on which we developers often do not agree anyway.
Conclusion
I think it is crucial for C++ developers to get a bit familiar with functional concepts.
From C++20 onward, the language will never be the same as before, and the sooner we adapt, the better our chances are to cope with the future.
The concept of an optional type constructor like tl::optional is one of the simplest stemming from the functional world, and it is a good starting point to learn this kind of programming.
You can play with some code examples on godbolt, and here is the gist for the code snippets.
References
Optionals in other languages:
- Java
- Kotlin/Arrow: This library is so good, it deserves its own blog post!
- Haskell
- OCaml
The new ranges library in C++20 and why it is functional.
Here is Simon Brand in action, with a funny presentation of his tl::optional 🙂