Closures
Topics/tags: The joy of code, technical, long
Yesterday, I was musing about new programming languages I might learn and how I might learn them. Along the way, I mentioned that I like to explore closures in the new functional languages I learn. What’s a closure? I think of it as a way to capture the environment of a function and tie it to the function. Here’s what Wikipedia says,
In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function (variables that are used locally, but defined in an enclosing scope) with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure’s copies of their values or references, even when the function is invoked outside their scope.
Here’s an example of a program I might write in Racket that takes advantages
of closures [1]. The function makeCounter returns functions that count
;
each function returned by makeCounter gives the next value in its
individual sequence of numbers.
> (define make-counter
(lambda (init)
(let ([count init])
(lambda ()
(let ([temp count])
(set! count (+ 1 count))
temp)))))
> (define c1! (make-counter 1))
> (c1!)
1
> (c1!)
2
> (c1!)
3
> (c1!)
4
> (define c2! (make-counter 1))
> (c2)
1
> (c1!)
5
What does this code snippet reveal? First, it shows that a function
can refer to variables in its enclosing scope even after that scope
seems to have finished. That is, although count is declared in
make-counter, and the call to make-counter has terminated when we call
c1!, c1! can still access count. Second, it shows that each
function we create with make-counter has its own copy of count.
Closures in Python
I recall having trouble achieving that behavior when I first learned Python. While I know that Python now implements closures, I’m still having trouble figuring out the details of how to do just that. The sample Python code on the Wikipedia page for closures claims to implement closures. Here’s an excerpt:
def f(x):
def g(y):
return x + y
return g # Return a closure.
a = f(1)
assert a(5) == 6
I had written something like that as I was thinking about closures in Python. Here’s my version, taken from an interactive Python interpreter.
>>> def make_adder(x):
... def adder(y):
... return x+y
... return adder
...
>>> add3 = make_adder(3)
>>> add5 = make_adder(5)
>>> add3(2)
5
>>> add5(2)
7
Mine differs in that I’ve tried to assign more sensible names to
the two function and that I’ve tried to demonstrate that each new
function returned by make_adder has an independent version of
x.
However, I’m not sure that this is actually a closure. That behavior
seems to be achievable with a strategy less powerful than what I
think of as closures, since it does not permit mutation of the
variable in the enclosed environment. In particular, we need not
need to keep track of the x; we can simply translate the x to
its value in building the body. Or maybe that’s enough; perhaps I
don’t understand closures well enough.
In any case, here’s my first attempt at Python code with mutation,
a manual translation
of my Racket code to Python.
>>> def counter(init):
... count = init
... def fun():
... temp = count
... count = count + 1
... return temp
... return fun
Unfortunately, this version does not work. Can you tell why? Let’s see.
>>> c1 = counter(1)
>>> c1()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in fun
UnboundLocalError: local variable 'count' referenced before assignment
Some poking around suggests that I could try nonlocal. Let’s see if
that works.
$ python
Python 2.7.8 (v2.7.8:ee879c0ffa11, Jun 29 2014, 21:07:35)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def counter(init):
... count = init
... def fun():
... nonlocal count
File "<stdin>", line 4
nonlocal count
^
SyntaxError: invalid syntax
Whoops. I guess nonlocal is only available in Python3. My next
inclination was to make x a mutable object and see if I can mutate
it. My further research about nonlocal and closures in Python2
revealed the following useful snippet from StackOverflow [2].
Inner functions can read nonlocal variables in 2.x, just not rebind them. This is annoying, but you can work around it. Just create a dictionary, and store your data as elements therein. Inner functions are not prohibited from mutating the objects that nonlocal variables refer to.
To use the example from Wikipedia:
def outer(): d = {'y' : 0} def inner(): d['y'] += 1 return d['y'] return inner f = outer() print(f(), f(), f()) #prints 1 2 3
And yes, that example works as anticipated. It even works when we create two separate counters.
>>> f = outer()
>>> print(f(), f(), f())
(1, 2, 3)
>>> g = outer()
>>> print(f(), f(), f())
(4, 5, 6)
>>> print(g(), g(), g())
(1, 2, 3)
>>> print(f(), f(), f())
(7, 8, 9)
>>> print(g(), g(), g())
(4, 5, 6)
>>> f = outer()
>>> print(f(), f(), f())
(1, 2, 3)
>>> print(g(), g(), g())
(7, 8, 9)
But that’s not a closure. That’s a manual implementation of environments.
What’s the difference? First, it’s something that the programmer has to
do, rather than the language. Second, it’s not as generalized as closures.
With simple nesting, this approach works okay. However, closures work even
if we have much deeper nesting of functions. That would require significantly
more effort on the part of the programmer if we were to use the manual
environment
approach [3].
Given what I’ve found, I’d have to say that Python2 does not have closures, at least not what I think of as closures.
Kotlin
In yesterday’s musing, I identified closures as one of the things I play with when I learn a new language. I’m trying to learn Kotlin. Perhaps I should try to figure out closures in Kotlin.
There seem to be two basic ways to write anonymous functions [4]
in Kotlin. The
documentation
suggests that there are at least two ways to write anonymous
functions. You can use arrow notation, as in { x: Int -> x + 1 }, or you can use more verbose function notation, as in fun (x: Int): Int { return x + 1 }.
While my experience with Kotlin is limited, I was able to quickly put together the counter example. Here’s what I wrote.
fun makeCounter(): () -> Int {
var count = 0
return { ++count }
}
fun main() {
val c1 = makeCounter()
val c2 = makeCounter()
println("${c1()} ${c1()} ${c1()}") // 1 2 3
println("${c1()} ${c1()} ${c1()}") // 4 5 6
println("${c2()} ${c2()}") // 1 2
}
And yes, it works as expected. Kotlin does seem to have closures, or
at least some basic form of closures. I must admit that the
return { ++count; } feels strange to me. My brain doesn’t read
that as a function. But I can understand that it’s more concise
than the follow (which also works).
return fun(): Int { return ++count }
There does seem to be a reasonable compromise, that looks more like a function to me.
return { -> ++counter }
I’m intrigued to find that IntelliJ tells me to get rid of the
redundant arrow
.
JavaScript
I wasn’t planning to include JavaScript in this musing. However, the day after I started drafting it, my colleague Janet Davis posted a link to an article on closures in JavaScript. Here’s the code that the author, Olivier De Meulder, posted.
1: function createCounter() {
2: let counter = 0
3: const myFunction = function() {
4: counter = counter + 1
5: return counter
6: }
7: return myFunction
8: }
9: const increment = createCounter()
10: const c1 = increment()
11: const c2 = increment()
12: const c3 = increment()
13: console.log('example increment', c1, c2, c3)
It seems like counters are a fairly common approach to looking at
closures. However, I’m surprised that De Meulder doesn’t create multiple
counters make sure that the counter variable is not shared. Let’s
give that a try.
function createCounter() {
var c = 0;
return function() { return ++c; }
}
const c1 = createCounter();
const c2 = createCounter();
console.log(c1(), c1(), c1()); // 1 2 3
console.log(c2(), c2()); // 1 2
console.log(c1()); // 4
console.log(c2()); // 3
For some reason, the JavaScript IDE I tried wouldn’t allow let, so I
substituted var. That seems to be working fine. I also simplified
things a bit. We are getting expected behavior.
De Meulder goes on to write,
The closure contains all the variables that are in scope at the time of creation of the function. It is analogous to a backpack. A function definition comes with a little backpack. And in its pack it stores all the variables that were in scope at the time that the function definition was created.
That’s not quite true. Since multiple functions can be declared (and returned) within another function, some of the variables will be shared. Here’s an example with a single shared variable.
function dualCounter() {
var count = 0;
return [function() { return ++count; }, function() { return --count; }];
}
var [inc1,dec1] = dualCounter();
var [inc2,dec2] = dualCounter();
console.log(inc1(), inc1(), dec1(), inc1(), inc1()) // 1 2 1 2 3
console.log(dec2(), dec2(), inc2(), inc2()) // -1 -2 -1 0
Now, this may seem to you like they share the same backpack
. However,
with a little work, we can probably come up with an example in which they
share one variable, but not another. Here’s a stupid, and potentially
confusing, example.
// Return a pair of factories associated with a single counter. The
// 2, then 3, then 4, and so on and so forth.
function confuse() {
var counter = 0;
function incrementerFactory() {
var offset = 1;
return function() { counter += offset++; return counter; }
}
function decrementerFactory() {
var offset = 1;
return function() { counter -= offset++; return counter; }
} // decrementerFactory()
return [incrementerFactory, decrementerFactory];
}
var [if1,df1] = confuse();
var [if2,df2] = confuse();
var inc1 = if1();
var dec1 = df1();
var inc2 = if2();
var dec2 = df2();
console.log(inc1()); // 1 ; The inc offset for the first counter is now 2
console.log(inc1()); // 3 ; The inc offset for the first counter is now 3
console.log(inc1()); // 6 ; The inc offset for the first counter is now 4.
// If the offset variable is shared, decrementing should give us 2. If it's
// not shared, decrementing should give us 5.
console.log(dec1()); // 5
// We've determined that offset is not shared between inc1() and dec1(),
// but counter is. What about the other pair?
console.log(inc2()); // 1;
// Nope; counter is not shared.
I don’t think backpack
covers this behavior very well. I admit that
it’s not a behavior I need in many cases. But it’s still the behavior
that the term continuation
is supposed to imply.
Other thoughts
I’m not sure if I’ve explained closures to those of you who didn’t understand them already, but I helped myself think about them more. I also discovered that there seem to be a variety of ways that people think about closures and perhaps even the ways that I think about closures. We’ve seen one model that captures only the values of variables. Most examples show what happens when you capture a single enclosing scope. But real closures can handle multiple levels of enclosing scope. And to me, that’s the part that’s coolest part.
Postscript: I realize that some folks would say Closures are a
stupid way to achieve the counter behavior that you described
in this musing.
I might even agree. Nonetheless, the counters
provide a simple example of what closures are; there are more complex
situations that benefit from more complex use of closures. It’s
kind of like recursion. Many of the recursive procedures we first
consider could be written iteratively. But consider Quicksort
[5,9]. Can you write Quicksort iteratively without a stack? I’m
not sure that I can. And, in the end, the implicit stack given by
recursion is much easier to manage than an explicit stack.
Postscript: I’d love to hear from one of my colleagues in Programming
Languages whether they consider the Python 2 version of closures
real closures
.
[1] Would I really write code with side-effects in Racket? Sure, when it seems appropriate.
[2] I generally tell my students to be cautious about what they take from StackOverflow. For example, the advice to use ArrayLists in Java appears way too often. But that doesn’t mean that everything on StackOverflow is pointless.
[3] It could also generate much slower code.
[4] Some call these lambdas.
[5] Quicksort is a famous probabilistic divide and conquer
sorting
routine developed [8] by C.A.R. Tony
Hoare. It involves reordering
an array or subarray so that small
elements are on the left and
large
elements are on the right, and then recursing [6] on each part.
[6] It appears that recursing
is a term that I invented. My colleagues
tell me that recurring
is correct. I don’t care. I like my term
better [7].
[7] I’m also a descriptivist, so I hope that by encouraging my students to use the term, it will enter the lexicon.
[8] Discovered?
[9] There seem to be multiple spellings of Quicksort
, including
Quick sort
, Quick Sort
, QuickSort
, and even quik sort
. I
stick with Hoare’s spelling.
[10] There seem to be at least two early articles by Hoare on Quicksort.
C. A. R. Hoare. 1961. Algorithm 64: Quicksort. Commun. ACM 4, 7 (July 1961), 321-. DOI=10.1145/366622.366644 http://doi.acm.org/10.1145/366622.366644
C. A. R. Hoare. 1962. Quicksort. (Comput. J.* 5, 1, 10–16. doi:10.1093/comjnl/5.1.10 http://comjnl.oxfordjournals.org/content/5/1/10.full.pdf
Version 1.0 of 2019-09-11.
