Easy reading

:: weekly update, Clojure, GPGPU

I had to tear myself away from my new video game fix to bring you this update. Parkour and giant mechs are, I think, more exciting than blog posts. I did manage to get some new reading done this week, so here we go!

Weekly Status Update 2: Update Boogaloo

:: weekly update, Boston Software Craftsmanship, GPGPU

I promised I’d start writing new blog posts each week, and I’m a man of my word. However, this week’s programming/research updates were hampered by getting my remaining wisdom teeth removed. As it turns out, feeding oneself an equivalent amount of calories in liquid form from a usually solid diet is time-consuming. I did start at least one paper and go to one meetup, so on with the show!

Inaugural weekly update

:: weekly update, GPGPU

It seems I did that thing I always do with my blog and ramped up with a new platform, got everything situated the way I want it, and then never stuck to writing anything. Well, that’s gonna change today! This is my first foray into doing a weekly update of the stuff I’ve been working on, papers I’ve read, and talks I’ve gone to/watched. This sort of update encapsulates what I had hoped to write about as individual posts, but I clearly have bad writing habits with respect to scheduling time to do them. Maybe I’ll change my evil ways, and these updates will give way to real write-ups. Only time will tell.

On with the show!

A daily journal system in 11 lines of code

:: quick tip, shell script

Trying to manage iterations, user stories, task point estimates, and all the other fun stuff that comes with agile development, I’ve taken to writing a daily journal of what I do at work. It helps me keep track of my progress over the 2–3 weeks of an iteration and serves as a nice place to document things I learned for the day (which happens a lot, thankfully).

To support my journal, I wrote up a program to open up an entry for the current day in my text editor with a single command. It also accepts an optional argument for specifying a different day’s journal entry, useful for when I forget to write the previous day’s entry. I’ve written the program so it understands a pretty rich language; for instance, I can call update-journal 2-days-ago, and it opens the entry two days prior.

Today, I’ve decided that I’m open-sourcing my update-journal program with an MIT license, so feel free to copy the code and use it as you like. In fact, I’m going to post the source right here on my blog.

Ready?

Back in business

:: clojure, ctco, chaser, heyawake, pirates and bulgars, syntax-case, tools.cps, cps

About a year ago, I moved my blog over from Wordpress to Octopress because Dave Herman did it, so it seemed like a good idea. A couple hours of Ruby issues later, I had it working, made my first post about how I’d switched over to Octopress and how great it was, and promptly stopped writing on that blog for almost a year.

This week, I got set up with Greg Hendershott’s Frog, which is similar to Octopress, but written in Racket instead of Ruby. As Greg suggests, Racket is generally easier to deal with than Ruby, and I have to agree. Plus, since it’s written in Racket, I actually have some hope of being able to modify the source if I want it to do something different. I’ve already spent some time scripting some workflows for it, and I’m planning on posting the quick-and-dirty script I wrote to GitHub (with Greg’s blessing). If I get a chance, I may add some of the functionality back into Frog itself.

Anyway, this isn’t supposed to be a “hey check out this new blogging framework I’m using” post. It’s supposed to be an update on what I’ve been up to over the last year that I haven’t been blogging, at least in terms of CS and software engineering. So let’s get to it!

A Crash Course on CTCO

:: clojure, ctco

A couple of weeks ago I gave a talk for the Triangle Clojure Users Group giving an overview of the state of constant space tail calls in Clojure and the techniques employed in the Clojure Tail Call Optimization compiler (hereafter known as Clojure TCO or CTCO) to improve that support. Rather than following my normal routine of writing up slides and then practicing the talk to death, I wrote a set of extensive notes to fully flesh out all the topics I wanted to cover and then gave the talk more off-the-cuff, using the notes and drawings on a whiteboard. Unfortunately there’s no recording of that talk, but I figured I could share my original notes. I’ve cleaned them up a bit, removing the parts that make reference to specifics of the talk, but they’re still unapologetically a bit rough.


Introduction

Rather than discussing the nitty-gritty details of how I’ve put CTCO together, I think it’s better to get an understanding of the history of tail calls and then examine the techniques that CTCO uses, but not necessarily exactly how they’re implemented. The hope is that discussing the techniques individually will help one understand how they can be useful both in and outside of the context of TCO. Particularly, I think there’s a lot to be learned from the CPS transformation, and I hope this can be a stepping stone to understanding continuations for those who weren’t introduced to them by PL giants like I (eternally grateful) was. Roughly, the presentation goes like this:

  • The history and anatomy of a tail call
  • The state of tail calls in Clojure and what CTCO does to improve the situation
  • How to understand the CPS transformation and use continuations in your own everyday programming
  • The (somewhat easier) transformations of thunkification and trampolining
  • How these transformations fit together to provide constant space tail calls
  • The (speed) drawbacks of using CTCO

A History Lesson

To get started, let’s hop into our time machines and go back to the year 1977. Fortran, Pascal, and PL/I rule the land, and functional programming languages (i.e. Lisp) are relegated to the halls of academia. Common knowledge of the time states that structuring programs in terms of procedures makes them more readable and easier to maintain, but people in professional practice claim that they are too slow to use more than sparingly. Instead, programmers structure their code around the dreaded GOTO because it’s “inherently faster.”

Enter one fresh-faced young Guy Steele in the AI lab at MIT. His paper, “Debunking the ‘Expensive Procedure Call’ Myth or, Procedure Call Implementations Considered Harmful or, Lambda: the Ultimate GOTO” (truly a man after my own heart in terms of ridiculous names), helps to show the world that procedure calls are not inherently slow, just poorly implemented by compiler writers of the time. Steele covers a lot of ground in the paper; in it, he claims that:

  1. Procedure calls are essentially just glorified GOTOs.
  2. Lisp compilers (compiling a language generally held as “really slow”) could produce faster numeric computation code than venerable Fortran compilers when they used efficient implementations for procedure calls.
  3. One can express complex state diagrams in terms of sets of simple procedure calls.

Of highest interest to the current discussion, Steele codifies the notion of a tail call, which he defines as the last procedure call made by an expression. Let’s take a look at the example that he uses in the paper:

(define (bar x y)
  (f (g x) (h y)))

As he walks through the example, he notes that a naive compiler would generate code that creates a stack frame for the call to (g x), saving its value, then a stack frame for (h y), saving its value, and finally a stack frame for the call to f with the results from the previous two function calls. (Note that in languages like Scheme the order of evaluation for arguments to procedure calls is unspecified, so g and h may be called in the reverse order in those languages). Why is the code generated by our theoretical compiler “naive?” As Steele points out, on returning from the call to f, there’s nothing left for bar to do, therefore a sufficiently smart compiler will simply reuse the stack frame for bar for the call to f. That is, it would place the arguments in the appropriate registers and jump to f. Thus the last procedure call in bar, the “tail call,” is like a GOTO with arguments.

After showing compiler writers the error of their ways, Steele shows the power of writing programs in terms of tail-recursive procedures and how much cleaner it is than using GOTO. This is particularly apparent to anyone who has implemented a state machine or a parser.

Steele framed the paper as an admonition to language implementors to shape up in efficiently handling procedure calls, and it would seem that they listened. Following his paper, Guy Steele released the Scheme programming language with Gerry Sussman, and as part of the language’s design, it required constant space tail calls. Sometime later, ML came on the scene as a functional programming language that also required constant space tail calls in its standard.

It’s probably worth taking the time at this point to note a slight distinction in terminology that’s often ignored when discussing this topic. When talking about languages like Scheme and Standard ML which require constant space tail calls in their standard, it is termed “proper handling of tail calls,” as any other handling would violate the language’s semantics. In the case of languages which do not require constant space tail calls, the term is “tail call optimization.” As we should all know by now, Clojure (being implemented on the JVM) doesn’t require constant space tail calls, and so we’ll primarily use the term “tail call optimization.”

Constant Space Tail Calls (Back in the Present)

Constant space tail calls seem like an awesome idea, so surely all language implementations guarantee them, right? As it turns out, no. Some languages, like Java, go ahead and create a stack frame for every procedure call. It has something to do with security policies and stack traces and the like. Sorry, I’m no JVM expert.

As many Clojurians (Clojurers?) probably know, the situation isn’t completely bleak for JVM languages like our favorite JVM-bound Lisp dialect. We have one construct, the recur form, which leverages the underlying implementation of while loops to provide constant space tail calls for self recursion. That is, recursion back to the top of a function or a loop construct.
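
Just to make that concrete, here’s a tiny sketch (my example, not from the talk) of recur in action; the self-call becomes a jump back to the top of the function, so the stack never grows:

(defn count-down
  [n]
  (if (zero? n)
      :done                ;; base case: just return a value
      (recur (dec n))))    ;; tail call back to the top of the fn; no new stack frame

(count-down 1000000)
;; => :done, with no StackOverflowError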

clojure.core also provides a “trampoline” function which can be used, along with some manual transformations, to achieve constant space tail calls for an arbitrarily large set of mutually recursive functions. I’m not a big fan of the “manual transformations” part, but it’s certainly a workable solution. As to what those transformations are, we’ll actually be talking about them later.
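
To get a feel for the manual approach, here’s a small sketch (my own example, not from the notes) using the classic mutually recursive even/odd pair; each tail call is wrapped by hand in a zero-argument fn, and clojure.core/trampoline keeps invoking the results until it gets back something that isn’t a function:

(declare my-odd?)

(defn my-even?
  [n]
  (if (zero? n)
      true
      #(my-odd? (dec n))))    ;; return a thunk rather than making the call

(defn my-odd?
  [n]
  (if (zero? n)
      false
      #(my-even? (dec n))))

(trampoline my-even? 1000000)
;; => true, in constant stack space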

Enter CTCO

So Clojure provides not one, but two ways to get constant space tail calls on the JVM. That’s pretty good, right?

I think not.

Well, if I’m gonna complain, then I should have a better idea, right? I like to think that I do. Enter Clojure TCO. It’s a little compiler I whipped up that preprocesses a Clojure expression, parsing it into Clojure 1.3’s records and performing a set of transformations (implemented in terms of Clojure 1.3’s protocols) that are tried and true for garnering constant space tail calls from systems which don’t provide them. Don’t worry, we’ll get to the specifics in just a second.

So what’s the basis for this process? I certainly didn’t pull these transformations out of thin air. Rather, they come from decades-old work by Dan Friedman, which he titled “How to Write Interesting Recursive Programs in a Spartan Host.” In that work, he starts from Scheme code written in its natural, recursive style and goes step-by-step, applying the requisite, straightforward transformations until the Scheme code becomes C code. Dan makes the process seem so easy that he’s made it an assignment in his Intro to Programming Languages course for many years (it was one of my favorites, incidentally).

Enough with the teasing; how does one (not so) magically pull tail call optimization out of arbitrary code? In three simple steps:

  1. Continuation passing style (CPS)
  2. Thunkification
  3. Trampolining

Uh-oh. I said “continuation.” You may think to yourself, “this has got to be complicated.” Not at all; not even if you don’t know how to do the CPS transformation. So let’s start there! (You’d think I set that segue up for myself)

Step 1: Continuation Passing Style

So what’s the key to learning how to do the CPS transformation? Well, first let’s start with a function definition. And since we’re all FP folks, here, it’s gotta be factorial:

(defn fact
  [n]
  (if (zero? n)
      1
      (* n (fact (dec n)))))

As we dive into applying the transformation, a key idea is that there are two types of expressions from the point of view of the CPS transform: trivial expressions and serious ones. For simplicity’s sake, trivial expressions include atomic values (e.g. numbers, symbols, strings, etc.), functions (i.e. fn expressions), and calls to simple procedures that we know return quickly (i.e. +, first, conj). Serious expressions include any procedure call that may recur or call off to other recursive procedures; in general, they’re the complicated ones. Once we’ve gotten our heads around all that, the rest flows pretty simply from those rules.

The first thing we do to apply the CPS transform to our function definition is to add a new argument to the function. For absolutely no historical reason whatsoever, we’ll call it k (excuse my sarcasm):

;; 1: Add a k argument
;; NOTE: this code won't work (yet)
(defn fact
  [n k]
  (if (zero? n)
      1
      (* n (fact (dec n)))))

This argument is our “continuation.” Seriously, ignore the five-dollar name; just think of it as our k. You can also think of it as a function that at any given point takes a value and performs the part of the computation that is waiting to be done. Don’t worry if that doesn’t make perfect sense yet; the example should make it a little clearer.

Now our procedure takes a new k argument, so the next thing is to deal with our if expression. Luckily for us, the test part of it is a call that we know will return quickly. If it didn’t, we’d have to do something a little more complex, but we’ll ignore that for now. Instead, we just have to worry about the true and false branches of the if, and we can consider them separately.

First up we have the true branch, in which we return 1. This is clearly a simple value, so what do we do with it? Whenever we have a simple value, we apply our k to it. After all, it’s a function that takes a value and performs the part of the computation that’s waiting to happen. Now our definition looks like this:

;; 1: Add a k argument
;; (1.5: Ignore 'if' because the test is trivial)
;; 2: Apply k to value in the 'true' branch of the 'if'
;; NOTE: this code won't work (yet)
(defn fact
  [n k]
  (if (zero? n)
      (k 1)
      (* n (fact (dec n)))))

Next we have to deal with the false branch of the if, and boy, is it a doozy. We’ve got a multiplication with an argument that’s a recursive call to fact.

This brings us to our next rule of applying the CPS transformation. When we encounter non-tail calls (like the one to (fact (dec n))), we (recursively) pull them to the front and create a new k that encapsulates the part of the computation that’s waiting to be done. Ok, sorry, that sounds really complicated. It’ll be a lot easier if we just look to our example.

When we look at the expression (* n (fact (dec n))), we should probably notice that the part that makes this serious is the call to (fact (dec n)), since it’s otherwise just a multiplication, which we know is trivial (i.e. returns quickly). So what do we do?

The astute reader will note that since we added a k argument to our definition of fact, this recursive call doesn’t make sense anymore; it’s missing an argument. That’s the new k we have to create. Well, we know that we can think of a continuation as a function that takes a value, so it starts looking something like this:

(fn [v] ---)

But what goes into the ---? If you’ll remember, our k represents the part of the computation that’s waiting to happen. We also know that in the original definition of fact, there’s a (* n ---) waiting to happen, so we can probably put that in there:

(fn [v] (* n ---))

So what goes into this ---? It seems reasonable to put the value fed into the k in there, right? That kinda makes sense given how we apply our k to 1 in the base case in the true branch of the if:

(fn [v] (* n v))

Another way to think about it is that the value that we feed into the k is the value we have when the computation is done.

This isn’t quite our new k yet, though, since we didn’t use the old one. Remember, at any given point, our k represents the part of the computation that’s waiting to happen. Here we can simply apply our old k to the result of (* n v) inside our new k:

(fn [v] (k (* n v)))

Thus we’ve created a new k which takes the value computed by the recursive call to fact, performs the part of the computation that was waiting to happen in our original, serious non-tail recursive call, and finally applies the rest of the computation that’s waiting to happen represented by the k that got passed in.

And so we end up with this new definition for factorial:

;; 1: Add a k argument
;; (1.5: Ignore 'if' because the test is trivial)
;; 2: Apply k to value in the 'true' branch of the 'if'
;; 3.1: Pull the serious, non-tail call to fact to the front
;; 3.2: Create a new k which performs the part of the non-tail call
;;      waiting to happen and applies our original k
;; NOTE that this code *will* work!
(defn fact
  [n k]
  (if (zero? n)
      (k 1)
      (fact (dec n) (fn [v] (k (* n v))))))

An important feature to note about this definition, as well as any function in CPS, is that all calls are tail calls. So if we wanted to, the “false” branch of the if could swap fact for recur without changing the meaning of the definition (and it would then get constant space tail calls for the self recursion). This is one of the important features of CPS for the purposes of CTCO which may or may not be apparent. We’ll get into that in a moment.
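
Just to illustrate (this is a quick sketch of that swap, not something CTCO emits), the recur version would look like this:

(defn fact
  [n k]
  (if (zero? n)
      (k 1)
      (recur (dec n) (fn [v] (k (* n v))))))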

Should we want to call this like we did our original definition, we can just pass the number argument and the identity function, (fn [x] x), or “empty continuation.” This should make sense since our k is supposed to be the part of the computation that’s waiting to happen. At the beginning of a computation, there is no part that’s waiting to happen, and so k is just a function that takes a value and returns it, unchanged.
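
For example, calling our CPS fact with the empty continuation:

(fact 5 (fn [x] x))
;; => 120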

What Are Continuations?

With that, the requisite knowledge has been dropped to move on to my next point, which I think is the most important observation that I’m presenting. First, recall our original definition for fact:

(defn fact
  [n]
  (if (zero? n)
      1
      (* n (fact (dec n)))))

Let’s consider the stack for a call to (fact 5):

(fact 5) == (* 5 (fact 4))
         -> (* 5 (* 4 (fact 3)))
         -> (* 5 (* 4 (* 3 (fact 2))))
         -> (* 5 (* 4 (* 3 (* 2 (fact 1)))))
         -> (* 5 (* 4 (* 3 (* 2 (* 1 (fact 0))))))
         -> (* 5 (* 4 (* 3 (* 2 (* 1 1)))))
         => 120

Now consider the stack for a call to the CPS version of fact:

(fact 5 identity) == (fact 4 (fn [v] (identity (* 5 v))))
                  -> (fact 3 (fn [v] (* 5 (* 4 v))))
                  -> (fact 2 (fn [v] (* 5 (* 4 (* 3 v)))))
                  -> (fact 1 (fn [v] (* 5 (* 4 (* 3 (* 2 v))))))
                  -> (fact 0 (fn [v] (* 5 (* 4 (* 3 (* 2 (* 1 v)))))))
                  -> ((fn [v] (* 5 (* 4 (* 3 (* 2 (* 1 v)))))) 1)
                  -> (* 5 (* 4 (* 3 (* 2 (* 1 1)))))
                  => 120

(Note that this may not be exactly how the continuation is represented internally, but it is equivalent and makes the point I’m driving home much clearer)

Notice anything familiar about the continuation, er, that is, the ‘k’ argument in the call to the CPS fact and the stack in the original version? Looks pretty similar, huh?

This leads me to my biggest claim of the whole discussion:

CONTINUATIONS ≡ STACKS

Continuations are simply a version of the stack that we’ve represented using functions, in place of the one that the JVM runtime usually takes care of for us automatically. So for the rest of our discussion, any time I use the word “continuation,” you should be saying to yourself “stack.”

To make this even more painfully apparent, let’s consider a slightly different definition for fact which takes an additional accumulator argument:

(defn fact-acc
  [n a]
  (if (zero? n)
      a
      (fact-acc (dec n) (* n a))))

Those paying attention (or who have seen this before) will note that this definition is tail recursive, since there is no computation waiting for the recursive call to fact-acc to finish before completing. We can call this function much like the original definition as long as we pass an initial accumulator value of 1. Thus we get:

(fact-acc 5 1) == 120

If we go through the CPS algorithm we discussed before, this mostly falls out pretty easily. First we add our k argument:

(defn fact-acc
  [n a k]
  (if (zero? n)
      a
      (fact-acc (dec n) (* n a))))

We know that the if has a simple test, so we don’t have to do anything special with it. In the “true” branch, the a is a simple value, so we simply apply our k to it:

(defn fact-acc
  [n a k]
  (if (zero? n)
      (k a)
      (fact-acc (dec n) (* n a))))

Which brings us to the “false” branch. In our original definition, there was a non-tail call, which we pulled out, encapsulating the rest of the computation in the new k that we created. Here, however, there’s no non-tail call; both the (dec n) and the (* n a) are simple, fast-returning expressions, meaning that when the call to fact-acc returns, there’s no computation waiting for it. So what does that mean for k? Well, we simply pass it along:

(defn fact-acc
  [n a k]
  (if (zero? n)
      (k a)
      (fact-acc (dec n) (* n a) k)))

So let’s take a look at our call stack for a call to this function, assuming initial values of 1 for the accumulator and the identity function for the initial, empty continuation:

(fact-acc 5 1 identity) == (fact-acc 4 5 identity)
                        -> (fact-acc 3 20 identity)
                        -> (fact-acc 2 60 identity)
                        -> (fact-acc 1 120 identity)
                        -> (fact-acc 0 120 identity)
                        -> (identity 120)
                        => 120

Note that throughout the whole computation, the continuation never changed. This makes sense because there was never a non-tail computation that caused another one to have to wait. This also illustrates that in general, no stack frames need be built for tail calls, supporting the history lesson from the beginning of the discussion.

So with that very long discussion of CPS behind us, what does this mean for CTCO? Of particular importance to CTCO are two of the characteristics of programs in CPS that have already been mentioned:

  1. All procedure calls are tail calls.
  2. The continuation provides an explicit representation of the stack at any point in the computation.

At this point, one might think that CTCO need do no more than apply CPS and replace self-recursive calls with recur, given the first property of CPS and our example using factorial. However, this won’t help for groups of mutually recursive procedures. To handle that, we must apply a couple more transformations. No worries, though, they’re a lot easier to get a handle on than CPS.

Steps 2 and 3: Thunkification and Trampolining

Our next transformation is simply called “thunkification,” which we apply to an expression that has already undergone the CPS transformation. The main idea of thunkification is to wrap any expression that would grow the stack in a function of no arguments, known as a thunk. This simply makes it so that any expression that would grow the stack instead returns a value, namely a function that when invoked performs the next step of the computation. And that’s all there is to it!
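
To make that concrete, here’s a rough sketch (mine, not CTCO’s actual output) of our CPS fact after thunkification; every call that would grow the stack, including the ones buried inside the continuations, now gets wrapped in a zero-argument fn:

(defn fact-thunked
  [n k]
  (if (zero? n)
      (fn [] (k 1))    ;; wrap the tail call in a thunk
      (fn [] (fact-thunked (dec n)
                           (fn [v] (fn [] (k (* n v))))))))

Calling (fact-thunked 5 identity) now just hands back a thunk rather than an answer; something has to keep invoking those thunks until a real value falls out, which is exactly the trampoline’s job.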

Of course, thunkification isn’t very useful unless it’s coupled with the final transformation, “trampolining.” You know how thunkification essentially broke the computation into its individual steps, each one returning a thunk that, when invoked, performed the next step of the computation? Trampolining simply runs each step of that computation until there’s some indication that it has finished. Plus it does so using a constant space looping construct, in this case recur.

Remember how our CPS transformation ensured that we had a representation of the stack manifest as an argument? The cool part about that is that it allows us to clear out our JVM stack on each “bounce” on the trampoline, but retain the context it would have stored instead in our continuation.

So what does this trampoline look like?

(defn tramp
  [thunk]
  (if <done?> thunk (recur (thunk))))

Not too bad, huh? Of course you’ll note the cryptic <done?> predicate. It can be implemented in any one of a number of ways.

This is a common enough trick that clojure.core includes a procedure called trampoline that does this, assuming that the input function has been thunkified by hand. However, the built-in trampoline simply keeps invoking the argument value until it’s no longer a procedure (i.e. <done?> = (not (procedure? thunk))). This is kind of annoying should one want to return a procedure from the trampoline. As you may well know, the Clojure homepage advises that if you need to do this, you should put the return procedure into a data structure that is destructured after the trampoline returns. Again, this is one of the things that bugs me about constant space tail call support in standard Clojure.
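
Here’s a tiny sketch of that workaround (the names are made up purely for illustration): wrap the fn you want to return in a vector so the trampoline stops bouncing, then destructure it on the way out:

(defn make-adder
  [n]
  [(fn [x] (+ x n))])    ;; return [f] rather than f so trampoline won't invoke it

(let [[add5] (trampoline make-adder 5)]
  (add5 10))
;; => 15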

And yet again, when I complain about the built-in support, I like to think I’ve got a better solution on hand. Instead of asking if the value passed to the trampoline is a procedure, CTCO uses a (boolean) flag local to the computation to determine whether the trampoline needs to continue.

When I was a little more green around the ears as a Clojure programmer, this was a ref to a boolean. As Rich Hickey advised after I made my first release of CTCO, “you should definitely be using atoms instead of refs.” As David Nolen put it, “an experienced Clojure programmer would probably spit his/her coffee if he/she knew that you were using a ref.” So I lived and learned and changed the ref into an atom (thanks, Rich). Being one of the Clojurians who spit his coffee when he read through the initial release’s source code, Alan Dipert kindly changed the flag into a piece of metadata attached to each thunk, which it remains to this day.
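
Here’s a minimal sketch of that idea (an illustration under my own assumptions, not CTCO’s actual source): every intermediate thunk carries a bit of metadata marking it as a bounce, and the trampoline keeps going only while that flag is present, which means even a bare fn can come back as a final value:

(defn tramp
  [value]
  (if (:ctco-thunk (meta value))    ;; hypothetical flag; CTCO's real key differs
      (recur (value))
      value))

;; an intermediate step would be built something like:
;; (with-meta (fn [] (k 1)) {:ctco-thunk true})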

The Drawbacks

So CTCO sounds pretty great, right? You can write simple, tail-recursive procedures all day and never worry about stack overflows by running them through CTCO. But it’s not all roses.

There are drawbacks to CTCO, namely that it necessarily introduces computational overhead. First, we’re passing around our k argument everywhere, which is an extra cost. We’re also creating new procedures to represent continuations wherever there are non-tail calls, which takes time and requires additional heap space to allocate. We’re also introducing overhead by creating thunks to delimit the bounces on the trampoline. Finally, we have to execute our computations one step at a time on the trampoline. It’s intended to be a small, tight, tail-recursive loop, but it’s still slower than running the code directly, especially since it has to check whether the argument is a thunk on each bounce.

Great pains have been taken with CTCO to mitigate these costs. The CPS algorithm is intended to introduce as few continuations as possible. There’s been a little bit of work in trying to minimize the number of thunks, though that’s still an open problem that I’m hoping to address in the near future.

On the topic of allocating new continuations for non-tail calls, it’s worth making an aside to point out an important trade-off: while CPS converts non-tail calls to tail calls, allowing them to be performed with constant stack space, that stack space is essentially converted to heap space for the new continuation we have to allocate. In that way, applying CTCO to a primarily non-tail recursive program will prevent it from overflowing the stack, but it will likely trade that stack overflow for a heap overflow.

Conclusions

So what is the primary contribution of CTCO? My view is that it gives the programmer peace of mind. I think it’s unfortunate that Clojure puts as much onus as it does on the programmer to ensure that tail calls are constant space. Using a system like CTCO allows one to write programs in a tail recursive manner and know that constant space tail calls are guaranteed. This is really great if you want to write something like a parser or a state machine that makes tons of tail-recursive calls to other functions. If you’ve not written something like this yourself, just check out ctco.parse to see how simply a parser can be written in tail-recursive fashion.

And don’t let the drawbacks scare you off from CTCO. The point is not to use it on every expression of every program that you write. That just makes your program slower for the sake of using some cool transformations.

Rather, CTCO is intended for those tail-recursive programs that won’t finish because of stack overflow errors, where Clojure would require you to apply these transformations anyway. The hope is that you wouldn’t be able to do a better job than CTCO if you transformed the code yourself.


As always, you can check out the code on GitHub. Feel free to fork it and let me know if you’ve got improvements you’d like to make; I accept pull requests. For more information on the specific CPS algorithm that CTCO uses, you can read Olivier Danvy’s “A First-Order One-Pass CPS Transformation.” You can also read about the transformations that CTCO uses as outlined for Dan Friedman’s intro to PL class in “Using ParentheC to Transform Scheme to C or How to Write Interesting Recursive Programs in a Spartan Host (Program Counter).”

A little bit of CPS, a few thunks, and a trampoline

::

As promised, I’m going to post some more information about the Clojure TCO project that I’ve been working on with Dan Friedman. For those who missed it, my good friend Eric Holk posted the video from a talk I gave two weeks ago about it. Just for a little context, I gave this at our informal weekly PL Wonks colloquium series. We keep it pretty light, so there were a number of stops to discuss certain points and quite a few inside jokes.

I apologize to any Clojurers bored by the basics discussed in the video; the IU PL group is primarily made up of Schemers and Haskellers. While they’re no strangers to Lispy languages, few had used or knew much about Clojure.

When the video went up, a number of people asked for slides, so I have made them available here. Also, here’s the video to go with it (via YouTube):

Some notes worth mentioning about the video:

  • At about 4:06, I say that there’s a “cross-compiler from Clojure to ClojureScript.” I of course misspoke; I meant to say that it compiled Clojure to JavaScript.

  • At around 11:30 there’s a fairly long discussion comparing Clojure’s use of “recur” to Scala’s automatic self-recursion tail-call optimization. I say in the video that I’m not sure why it’s not automatic in Clojure. Clojurers from IRC have said that it’s to avoid “magic” in the compiler, in which one would write code they thought was tail-recursive but that then blows the stack. This actually leads into a possible loop optimization suggested by a fellow IU compiler hacker, but I’ll write more on that later.

  • I mention a couple of IU undergraduate courses by catalog number: C311 is the programming languages course taught by Dan Friedman and P423 is the compilers course originally taught by Kent Dybvig.

  • The video says that the trampoline’s flag is a ref, which was true at the time. I have been informed that this is a faux pas, and the current version uses an atom instead. It may come as no surprise that this was a substantial performance improvement.

  • The demo portion was cut off primarily because I had technical difficulties (as always happens with live code demos).

There is of course tons and tons to say about the work that Dan and I have done. I’ve been chatting with David Nolen over the past few days about the possibilities of reusing the CPS transformer for asynchronous client-side scripting code with ClojureScript, as well as adding full call/cc and shift/reset to Clojure.

I’ve also got enough ideas to keep work going on CTCO itself for the foreseeable future (especially with finishing my M.S. next week and starting work at a real job in the next month). So if you’re an interested Clojurer, please take a look at the code on GitHub, fork it, play with it, add to it, and have fun! Also feel free to ask me questions; I try to be prompt about answering email and often appear on the Clojure IRC as cjfrisz.

I’m a bit busy with classes finishing up and other research deadlines, but I hope to keep writing some updates about the inner workings of CTCO and talk about where I would like to take the project from here. Until then, enjoy!

Dusting this old blog off

::

Well it has certainly been a long while since I wrote anything here. I’ve been a very busy boy, and I’ve got some work to show for it. Dan Friedman and I have been working on adding tail-call optimization to the Clojure language. A video went up a few days ago from an informal talk I gave to the IU PL group a couple of weeks ago about it, and I’ve posted some code to back it up.

Now that it’s starting to get out there, I’m going to try to do some writing about my work. Check back for the slides from the video and some write-ups on the state of the project.