Skip to main content

Closures

Topics/tags: The joy of code, technical, long

Yesterday, I was musing about new programming languages I might learn and how I might learn them. Along the way, I mentioned that I like to explore closures in the new functional languages I learn. What’s a closure? I think of it as a way to capture the environment of a function and tie it to the function. Here’s what Wikipedia says,

In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function (variables that are used locally, but defined in an enclosing scope) with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure’s copies of their values or references, even when the function is invoked outside their scope.

Here’s an example of a program I might write in Racket that takes advantages of closures [1]. The function makeCounter returns functions that count; each function returned by makeCounter gives the next value in its individual sequence of numbers.

> (define make-counter
    (lambda (init)
      (let ([count init])
        (lambda ()
          (let ([temp count])
            (set! count (+ 1 count))
            temp)))))
> (define c1! (make-counter 1))
> (c1!)
1
> (c1!)
2
> (c1!)
3
> (c1!)
4
> (define c2! (make-counter 1))
> (c2)
1
> (c1!)
5

What does this code snippet reveal? First, it shows that a function can refer to variables in its enclosing scope even after that scope seems to have finished. That is, although count is declared in make-counter, and the call to make-counter has terminated when we call c1!, c1! can still access count. Second, it shows that each function we create with make-counter has its own copy of count.

Closures in Python

I recall having trouble achieving that behavior when I first learned Python. While I know that Python now implements closures, I’m still having trouble figuring out the details of how to do just that. The sample Python code on the Wikipedia page for closures claims to implement closures. Here’s an excerpt:

def f(x):
    def g(y):
        return x + y
    return g  # Return a closure.
a = f(1)
assert a(5) == 6

I had written something like that as I was thinking about closures in Python. Here’s my version, taken from an interactive Python interpreter.

>>> def make_adder(x):
...   def adder(y):
...     return x+y
...   return adder
... 
>>> add3 = make_adder(3)
>>> add5 = make_adder(5)
>>> add3(2)
5
>>> add5(2)
7

Mine differs in that I’ve tried to assign more sensible names to the two function and that I’ve tried to demonstrate that each new function returned by make_adder has an independent version of x.

However, I’m not sure that this is actually a closure. That behavior seems to be achievable with a strategy less powerful than what I think of as closures, since it does not permit mutation of the variable in the enclosed environment. In particular, we need not need to keep track of the x; we can simply translate the x to its value in building the body. Or maybe that’s enough; perhaps I don’t understand closures well enough.

In any case, here’s my first attempt at Python code with mutation, a manual translation of my Racket code to Python.

>>> def counter(init):
...   count = init
...   def fun():
...     temp = count
...     count = count + 1
...     return temp
...   return fun

Unfortunately, this version does not work. Can you tell why? Let’s see.

>>> c1 = counter(1)
>>> c1()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in fun
UnboundLocalError: local variable 'count' referenced before assignment

Some poking around suggests that I could try nonlocal. Let’s see if that works.

$ python
Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def counter(init):
...   count = init
...   def fun():
...     nonlocal count
  File "<stdin>", line 4
    nonlocal count
                 ^
SyntaxError: invalid syntax

Whoops. I guess nonlocal is only available in Python3. My next inclination was to make x a mutable object and see if I can mutate it. My further research about nonlocal and closures in Python2 revealed the following useful snippet from StackOverflow [2].

Inner functions can read nonlocal variables in 2.x, just not rebind them. This is annoying, but you can work around it. Just create a dictionary, and store your data as elements therein. Inner functions are not prohibited from mutating the objects that nonlocal variables refer to.

To use the example from Wikipedia:

def outer():
  d = {'y' : 0}
  def inner():
      d['y'] += 1
      return d['y']
  return inner

f = outer()
print(f(), f(), f()) #prints 1 2 3

And yes, that example works as anticipated. It even works when we create two separate counters.

>>> f = outer()
>>> print(f(), f(), f())
(1, 2, 3)
>>> g = outer()
>>> print(f(), f(), f())
(4, 5, 6)
>>> print(g(), g(), g())
(1, 2, 3)
>>> print(f(), f(), f())
(7, 8, 9)
>>> print(g(), g(), g())
(4, 5, 6)
>>> f = outer()
>>> print(f(), f(), f()) 
(1, 2, 3)
>>> print(g(), g(), g())
(7, 8, 9)

But that’s not a closure. That’s a manual implementation of environments. What’s the difference? First, it’s something that the programmer has to do, rather than the language. Second, it’s not as generalized as closures. With simple nesting, this approach works okay. However, closures work even if we have much deeper nesting of functions. That would require significantly more effort on the part of the programmer if we were to use the manual environment approach [3].

Given what I’ve found, I’d have to say that Python2 does not have closures, at least not what I think of as closures.

Kotlin

In yesterday’s musing, I identified closures as one of the things I play with when I learn a new language. I’m trying to learn Kotlin. Perhaps I should try to figure out closures in Kotlin.

There seem to be two basic ways to write anonymous functions [4] in Kotlin. The documentation suggests that there are at least two ways to write anonymous functions. You can use arrow notation, as in { x: Int -> x + 1 }, or you can use more verbose function notation, as in fun (x: Int): Int { return x + 1 }.

While my experience with Kotlin is limited, I was able to quickly put together the counter example. Here’s what I wrote.

fun makeCounter(): () -> Int {
    var count = 0
    return { ++count }
}

fun main() {
    val c1 = makeCounter()
    val c2 = makeCounter()

    println("${c1()} ${c1()} ${c1()}") // 1 2 3
    println("${c1()} ${c1()} ${c1()}") // 4 5 6
    println("${c2()} ${c2()}")   // 1 2
}

And yes, it works as expected. Kotlin does seem to have closures, or at least some basic form of closures. I must admit that the return { ++count; } feels strange to me. My brain doesn’t read that as a function. But I can understand that it’s more concise than the follow (which also works).

    return fun(): Int { return ++count }

There does seem to be a reasonable compromise, that looks more like a function to me.

    return { -> ++counter }

I’m intrigued to find that IntelliJ tells me to get rid of the redundant arrow.

JavaScript

I wasn’t planning to include JavaScript in this musing. However, the day after I started drafting it, my colleague Janet Davis posted a link to an article on closures in JavaScript. Here’s the code that the author, Olivier De Meulder, posted.

 1: function createCounter() {
 2:   let counter = 0
 3:   const myFunction = function() {
 4:     counter = counter + 1
 5:     return counter
 6:   }
 7:   return myFunction
 8: }
 9: const increment = createCounter()
10: const c1 = increment()
11: const c2 = increment()
12: const c3 = increment()
13: console.log('example increment', c1, c2, c3)

It seems like counters are a fairly common approach to looking at closures. However, I’m surprised that De Meulder doesn’t create multiple counters make sure that the counter variable is not shared. Let’s give that a try.

function createCounter() {
  var c = 0;
  return function() { return ++c; }
}

const c1 = createCounter();
const c2 = createCounter();

console.log(c1(), c1(),  c1()); // 1 2 3
console.log(c2(), c2());        // 1 2
console.log(c1());              // 4
console.log(c2());              // 3

For some reason, the JavaScript IDE I tried wouldn’t allow let, so I substituted var. That seems to be working fine. I also simplified things a bit. We are getting expected behavior.

De Meulder goes on to write,

The closure contains all the variables that are in scope at the time of creation of the function. It is analogous to a backpack. A function definition comes with a little backpack. And in its pack it stores all the variables that were in scope at the time that the function definition was created.

That’s not quite true. Since multiple functions can be declared (and returned) within another function, some of the variables will be shared. Here’s an example with a single shared variable.

function dualCounter() {
  var count = 0;
  return [function() { return ++count; }, function() { return --count; }];
}

var [inc1,dec1] = dualCounter();
var [inc2,dec2] = dualCounter();

console.log(inc1(), inc1(), dec1(), inc1(), inc1()) // 1 2 1 2 3
console.log(dec2(), dec2(), inc2(), inc2()) // -1 -2 -1 0

Now, this may seem to you like they share the same backpack. However, with a little work, we can probably come up with an example in which they share one variable, but not another. Here’s a stupid, and potentially confusing, example.

// Return a pair of factories associated with a single counter.  The
// 2, then 3, then 4, and so on and so forth.
function confuse() {
  var counter = 0;
  function incrementerFactory() {
    var offset = 1;
    return function() { counter += offset++; return counter; }
  }
  function decrementerFactory() {
    var offset = 1;
    return function() { counter -= offset++; return counter; }
  } // decrementerFactory()
  return [incrementerFactory, decrementerFactory];
}

var [if1,df1] = confuse();
var [if2,df2] = confuse();
var inc1 = if1();
var dec1 = df1();
var inc2 = if2();
var dec2 = df2();

console.log(inc1());    // 1 ; The inc offset for the first counter is now 2
console.log(inc1());    // 3 ; The inc offset for the first counter is now 3
console.log(inc1());    // 6 ; The inc offset for the first counter is now 4.
// If the offset variable is shared, decrementing should give us 2.  If it's
// not shared, decrementing should give us 5.
console.log(dec1());    // 5
// We've determined that offset is not shared between inc1() and dec1(),
// but counter is.  What about the other pair?
console.log(inc2());    // 1;
// Nope; counter is not shared.

I don’t think backpack covers this behavior very well. I admit that it’s not a behavior I need in many cases. But it’s still the behavior that the term continuation is supposed to imply.

Other thoughts

I’m not sure if I’ve explained closures to those of you who didn’t understand them already, but I helped myself think about them more. I also discovered that there seem to be a variety of ways that people think about closures and perhaps even the ways that I think about closures. We’ve seen one model that captures only the values of variables. Most examples show what happens when you capture a single enclosing scope. But real closures can handle multiple levels of enclosing scope. And to me, that’s the part that’s coolest part.


Postscript: I realize that some folks would say Closures are a stupid way to achieve the counter behavior that you described in this musing. I might even agree. Nonetheless, the counters provide a simple example of what closures are; there are more complex situations that benefit from more complex use of closures. It’s kind of like recursion. Many of the recursive procedures we first consider could be written iteratively. But consider Quicksort [5,9]. Can you write Quicksort iteratively without a stack? I’m not sure that I can. And, in the end, the implicit stack given by recursion is much easier to manage than an explicit stack.


Postscript: I’d love to hear from one of my colleagues in Programming Languages whether they consider the Python 2 version of closures real closures.


[1] Would I really write code with side-effects in Racket? Sure, when it seems appropriate.

[2] I generally tell my students to be cautious about what they take from StackOverflow. For example, the advice to use ArrayLists in Java appears way too often. But that doesn’t mean that everything on StackOverflow is pointless.

[3] It could also generate much slower code.

[4] Some call these lambdas.

[5] Quicksort is a famous probabilistic divide and conquer sorting routine developed [8] by C.A.R. Tony Hoare. It involves reordering an array or subarray so that small elements are on the left and large elements are on the right, and then recursing [6] on each part.

[6] It appears that recursing is a term that I invented. My colleagues tell me that recurring is correct. I don’t care. I like my term better [7].

[7] I’m also a descriptivist, so I hope that by encouraging my students to use the term, it will enter the lexicon.

[8] Discovered?

[9] There seem to be multiple spellings of Quicksort, including Quick sort, Quick Sort, QuickSort, and even quik sort. I stick with Hoare’s spelling.

[10] There seem to be at least two early articles by Hoare on Quicksort.

C. A. R. Hoare. 1961. Algorithm 64: Quicksort. Commun. ACM 4, 7 (July 1961), 321-. DOI=10.1145/366622.366644 http://doi.acm.org/10.1145/366622.366644

C. A. R. Hoare. 1962. Quicksort. (Comput. J.* 5, 1, 10–16. doi:10.1093/comjnl/5.1.10 http://comjnl.oxfordjournals.org/content/5/1/10.full.pdf


Version 1.0 of 2019-09-11.