"Robust Clojure: The best way to handle nil"

Large Clojure codebases can become nasty, just as in any other dynamic language. Fortunately, Clojure isn't as problematic as some other languages because it is partially inspired by ML. True, it doesn't have static typing, but the way Clojure treats nil allows us to get very close to the ML way.

1. Maybe

In Haskell and other ML-ish languages, the Maybe type represents either Nothing or Just <some_value>. This becomes super useful when you need to check if a thing exists and get the value of that thing at the same time.

For example, doing this explicitly in Clojure is cumbersome:

(def data {:a 1 :b 2})

(if (= nil (:a data))
  0 ; default return value
  (:a data))

Checking for nil comes at a cost, of course. You're accessing the map twice, and that's a lot of boilerplate code if you'll be doing this often. In Haskell it's much cleaner:

-- let's pretend this is a function that maybe returns an Int
getSomeData :: Maybe Int

-- call the function and handle the return value
case getSomeData of
  Nothing -> 0 -- default return value
  Just a  -> a

We can handle both the getting of the value and the returning of the value in one fell swoop.

Clojure has a function that does basically the same thing. The get function will return nil if a key isn't present in a map:

(get {:a 1 :b 2} :a) ;=> 1
(get {:a 1 :b 2} :c) ;=> nil

This is similar to our imaginary getSomeData function in the Haskell snippet, except the Just a is implicit, so we don't have to extract the value 1 every time.
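One caveat with the implicit Just: get returns nil both when a key is missing and when the key is present but holds nil. Here is a small sketch of telling the two cases apart with contains? and find (the map sparse is made up for illustration):

```clojure
;; Hypothetical data: :b is present but holds nil, :c is absent.
(def sparse {:a 1 :b nil})

(get sparse :b)       ;=> nil (key present, value is nil)
(get sparse :c)       ;=> nil (key absent)

;; contains? and find look at the key, not the value.
(contains? sparse :b) ;=> true
(contains? sparse :c) ;=> false
(find sparse :b)      ;=> [:b nil]
(find sparse :c)      ;=> nil
```

In most code the distinction doesn't matter, which is why treating nil as Nothing works so well in practice.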

2. Maybe nil?

Practically speaking, nil is a value in Clojure because you can do anything with it that you can do with any other value. This idea of "everything is a value" (commonly expressed as "everything is data") runs deep in the Lisp tradition and gives Lisp languages a lot of power. But it also causes problems. Consider:

(+ 1 (get {:a 1 :b 2} :c))

This will evaluate to (+ 1 nil), which is nonsensical and will throw a NullPointerException on the JVM. You can't increase nothing by 1; if you try to, you just end up with more of nothing!
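To see the failure concretely, here is a minimal sketch, assuming JVM Clojure (add-one-unsafe is a made-up helper, not something from a library):

```clojure
;; Hypothetical helper that forgets the key might be missing.
(defn add-one-unsafe [m k]
  (+ 1 (get m k)))

(add-one-unsafe {:a 1 :b 2} :a) ;=> 2

;; With a missing key the addition receives nil and throws.
(try
  (add-one-unsafe {:a 1 :b 2} :c)
  (catch NullPointerException _
    :npe))                      ;=> :npe
```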

3. The Right Way

The simple fix is to check for nil just like you would check for Nothing. Clojure provides the if-some macro to make this more concise:

(if-some [it (get {:a 1 :b 2} :c)]
  (+ 1 it)
  nil)

This is more or less the same as the following Haskell:

case getSomeData of
  Nothing -> Nothing
  Just it -> Just (1 + it)

If you remember to write all of your Clojure code like this, your codebase will become much more robust to nil-related errors!

To sum up: Always treat nil as if it means Nothing.

If you're an intermediate Clojure programmer, then you're probably already familiar with if-let/if-some and perhaps not impressed. The big idea, however, is the treatment of nil as a type, and not as a value, which is a subtle but important point.
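The two macros are not interchangeable, though. if-let tests for truthiness, while if-some tests only for nil, so they disagree when the value is false. A quick sketch of the difference:

```clojure
;; if-let takes the else branch for both nil and false.
(if-let [x false] [:found x] :absent)  ;=> :absent
(if-let [x nil]   [:found x] :absent)  ;=> :absent

;; if-some only treats nil as absent, so false counts as a real value.
(if-some [x false] [:found x] :absent) ;=> [:found false]
(if-some [x nil]   [:found x] :absent) ;=> :absent
```

If nil is your Nothing, if-some is the closer match, because false is still a legitimate "Just false".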

To avoid these errors once and for all, you need to stop thinking about nil as a value. Yes, that is how Clojure treats nil, but that doesn't mean that you, the programmer, must treat it as a value too. If you come from Java or C, which represent the absence of a value as the null value, then you'll have to update your mental model.

Realize: the concept of absence refers to a type of thing, not a value.

While you are writing code, you should be thinking, "Is the type of this thing always an int, or could it be nil?" When doing numerical or statistical programming, you can probably guarantee that you'll have a number type returning in your algorithms. However, when you start working with networking, or databases, or certain Java libraries, you often lose the guarantee that functions will return concrete values (network is down, database exploded, etc.), and then you must think about nil.
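One way to write that question down in code is a precondition. This is only a sketch of the idea, not the one true approach, and mean is a made-up example function:

```clojure
;; Hypothetical numeric function that states its assumption up front:
;; the input is non-empty and every element is a number (so never nil).
(defn mean [xs]
  {:pre [(seq xs) (every? number? xs)]}
  (/ (reduce + xs) (count xs)))

(mean [1 2 3]) ;=> 2

;; A stray nil is rejected at the door (AssertionError) instead of
;; surfacing later as a NullPointerException somewhere inside the math.
(try
  (mean [1 nil 3])
  (catch AssertionError _
    :rejected))  ;=> :rejected
```

Preconditions are checked only while assertions are enabled (the default), so treat them as documentation that happens to run, not as validation of untrusted input.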

4. Clojure Idioms and Category Theory

In both Haskell and Clojure, manually checking for nil/Nothing becomes tedious very fast, especially when you are chaining lots of functions together. However, both languages have solutions for this: Haskell has category theory, Clojure has idioms.

In Haskell, the "bind" operator is defined basically like this:

(>>=) m g = case m of
  Nothing -> Nothing  -- if m is Nothing, just return Nothing
  Just x  -> g x      -- otherwise, call the function g on the extracted value

Extending the above example, we can call getSomeData and increase it by 1 with the following:

incIfEven :: Int -> Maybe Int
incIfEven n =
  if even n
  then Just (n + 1)
  else Nothing

getSomeData >>= incIfEven

Clojure has a similar idiom. We use some->> to thread the map through the rest of the functions. First extract a key if it exists, then lift the value into a vector, so we can use all of the collection-related functions on it. This allows us to filter and map over it to transform the data as we see fit:

(some->> {:a 2 :b 3} :a vector (filter even?) (map inc) first) ;;=> 3
(some->> {:a 2 :b 3} :b vector (filter even?) (map inc) first) ;;=> nil
(some->> {:a 2 :b 3} :c vector (filter even?) (map inc) first) ;;=> nil

Voilà! You get the compactness of Haskell, without the overhead of category theory :)
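To make the short-circuiting visible, compare plain -> with some-> on a missing key. A minimal sketch, again assuming JVM Clojure (the map m is just sample data):

```clojure
(def m {:a 2 :b 3})

;; Plain -> passes nil straight into inc, which throws.
(try
  (-> m :c inc)
  (catch NullPointerException _
    :npe))          ;=> :npe

;; some-> stops at the first nil and returns nil instead.
(some-> m :c inc)   ;=> nil
(some-> m :a inc)   ;=> 3
```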

I kid! Category theory is great. The "bind" operator (>>=) is very similar to some->> because they both take a value from one monad and "shove" it into the next monad. (If you have no idea what a "monad" is, replace "monad" with "thing" and re-read that sentence.) In Haskell, the monad is the Maybe type; in Clojure, the monad is implicit in the collection interface, which is the unifying abstraction in the language.

5. Clojure's Most Under-Appreciated Function

On IRC, technomancy mentioned he was surprised fnil wasn't in this article. I admit that I completely forgot about fnil, but it's extremely useful.

fnil can be used in our example above like so:

(def safe-inc (fnil inc 0))

(safe-inc (get {:a 1 :b 2} :b)) ;=> 3
(safe-inc (get {:a 1 :b 2} :c)) ;=> 1

In the above snippet, safe-inc is a function just like (+ 1 x) in the earlier example, except if x is nil, then safe-inc will use 0 as a default value instead. More (better) examples are available at ClojureDocs.
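Where I reach for fnil most often is together with update, since updating a key that isn't there hands nil to the function. A small sketch (count-words is a made-up helper for illustration):

```clojure
;; Count occurrences. A key seen for the first time yields nil,
;; which fnil replaces with 0 before inc runs.
(defn count-words [words]
  (reduce (fn [counts w] (update counts w (fnil inc 0)))
          {}
          words))

(count-words ["a" "b" "a"]) ;=> {"a" 2, "b" 1}

;; The same trick works for collections: conj onto [] when the key is new.
(update {} :tags (fnil conj []) :clojure) ;=> {:tags [:clojure]}
```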

fnil isn't talked about much in the Clojure community, but it is a handy function. Use it whenever you aren't sure if a variable is nil but you do know what the default value should be. In fact, the entire problem of nil isn't discussed much at all, but it is a very important issue, one that the Clojure community should be aware of. Hopefully this article will at least make you aware of the problems with nil, and start you down the path of thinking critically about nil on your own.

6. We Still Have Problems

The biggest problem is that this practice of explicit nil handling is a convention: the only thing enforcing it is your habits, and we all know that we mere humans are fallible. Haskell's approach to Nothing is thus superior because the compiler checks your work automatically, which is nice.

A second problem is nil itself, which is a problem with any dynamically-typed language. Unforeseen nils can bubble up the stack and cause a lot of headaches. One solution is to use a monad library (discussed below), but more often than not, in everyday Clojure code, a monad library is unnecessary.

The core problem is one of language design. As I said above, Clojure treats nil as a value, when in reality, the concept of absence refers to a type: intuitively, we say "absence of a value" just like we say "an integer of 5". Clojure, as a Lisp, made the choice to keep types an evaluable construct, so they could be modified at runtime, instead of a construct of compilation like Haskell. By choosing Clojure over Haskell, you are choosing the power of metaprogramming, but with that come the drawbacks of dynamic typing.

The best solution to this that I've found (in dynamically-typed languages) is to follow the single-responsibility principle: each function should just do 1 thing. Then spec that function and catch the possible nil-causing inputs with auto-generated tests (this is an article for another time). If you have other solutions, please email me and I will add your contribution here :)
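As a taste of what the spec side looks like, here is a minimal sketch using clojure.spec.alpha, which ships with Clojure 1.9 and later. The spec names ::age and ::maybe-age are made up for illustration:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs: one that rejects nil, one that admits it explicitly.
(s/def ::age int?)
(s/def ::maybe-age (s/nilable int?))

(s/valid? ::age 41)        ;=> true
(s/valid? ::age nil)       ;=> false
(s/valid? ::maybe-age nil) ;=> true
```

The nice part is that "could this be nil?" becomes something you write down once. Generating test inputs from specs with clojure.spec.test.alpha/check additionally needs the org.clojure/test.check library on the classpath.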

7. Other Solutions and nil-Punning

As described by Skyliner, chaining if-let forms together like this is annoying:

(if-let [x (foo)]
  (if-let [y (bar x)]
    (if-let [z (goo x y)]
      (do (qux x y z)
          (log "it worked")
          true)
      (do (log "goo failed")
          false))
    (do (log "bar failed")
        false))
  (do (log "foo failed")
      false))

It's only mildly less annoying when using cats:

(require '[cats.core :as m])
(require '[cats.monad.either :as either])

@(m/mlet [x (if-let [v (foo)]
              (either/right v)
              (either/left))

          y (if-let [v (bar x)]
              (either/right v)
              (either/left))

          z (if-let [v (goo x y)]
              (either/right v)
              (either/left))]

  (m/return (qux x y z)))

The benefit with cats is you get fine-grained error handling for each left. Read more about cats error handling and the Either type.

If some-> is out of the question, then personally I prefer the pattern matching approach:

(require '[clojure.core.match :refer [match]])

(match (foo) ;; pretend `foo` is a function that returns a map
  nil (log "foo failed")
  {:ms t :user u :data data} (do (log "User: " u " took " t " ms.")
                                 data))

The benefit is mostly the same as with if-let, but you can pattern match on the return value and then jump right into the next function, which I find myself doing quite a lot.

Of course you can always tighten this up by defining your own version of "bind" or some-> in Clojure:

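(defn >>= [m g]
  ;; m is a zero-argument function, like foo in the examples above
  (if-let [x (m)]
    (g x)
    ;; `name` would throw on a function, so stringify m directly
    (do (log (str m " failed"))
        nil)))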

This is a (very) naïve implementation, but you get the idea. Modify to fit your use-case.

On Reddit, tolitius suggested the use of get's optional third argument (which I had forgotten about!) and or:

get has a default value built in:

user=> (get {:a 1 :b 2} :b)
2
user=> (get {:a 1 :b 2} :c 0)
0

hence

user=> (-> (get {:a 1 :b 2} :c 0) inc)
1

In case this is a single op, such as inc, this would work as well:

user=> (-> (or nil 41) inc)
42

user=> (-> :c
           {:a 1 :b 2}
           (or 41)
           inc)
42

i.e. or is really handy for default values
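The two approaches behave differently at the edges, so it's worth knowing which one you want. A sketch with made-up maps: get's default applies only when the key is absent, while or falls back on both nil and false.

```clojure
;; get's default is used only when the key is absent.
(get {} :c 0)        ;=> 0
(get {:c nil} :c 0)  ;=> nil (key present, so the default is ignored)

;; or falls back whenever the value is nil...
(or (get {:c nil} :c) 0) ;=> 0

;; ...but also when it is false, which can clobber a real value.
(or (get {:verbose false} :verbose) true) ;=> true
(get {:verbose false} :verbose true)      ;=> false
```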

Over at Lispcast, Eric Normand argues for the "nil-punning" approach, which is fine. But I think this approach requires a confused notion of what nil/Nothing actually means. According to Eric, nil is a type, a value, a key in a map, a boolean, an empty seq. It seems to me that "nil-punning" is really just "nil-confusion". It is much simpler to understand nil as Nothing, i.e. the absence of a value (which is a type). That said, nil-punning in practice ends up mostly the same as I describe above, so either technique will work.