Coroutines in Mu: Playing with Continuations in a Statement-Oriented Language

Based shamelessly on Simon Tatham's great “Coroutines in C”. To follow along, you'll first need to go (re)read that essay. I've retained much of the original prose (with permission) to help readers compare the two.

Introduction

Structuring a large program is always a difficult job. One of the particular problems that often comes up is this: if you have a piece of code producing data, and another piece of code consuming it, which should be the caller and which should be the callee?

Following Simon's example, here is a piece of run-length decompression code in Mu, and a corresponding parser emitting “word” (identifier) and “punctuation” tokens (explanation below):

  ## unpack characters from stream 'src'
{ # while src has data c:char, eof?:bool read src break-if eof? repeat-tag?:bool equal c, 0xFF { # if c signals a run break-unless repeat-tag? len:num read src c:char read src i:num copy 0 { # while (--len) emit(c); done?:bool equal i, len break-if done? emit c i add i, 1 loop } } { # else (c is a single character) break-if repeat-tag? emit c } loop } emit EOF
complete working sources: Version 1 
  ## parse tokens into stream 'dest'
{ # while getchar returns data c:char getchar eof?:bool equal c, EOF break-if eof? { # if c is an alphabet alpha?:bool isalpha c break-unless alpha? word:buffer new buffer { word append word, c c getchar eof? equal c, EOF break-if eof? alpha? isalpha c loop-if alpha? } # output a WORD token out:token merge WORD, word dest write dest, out return-if eof? } # now we must have punctuation out:token merge PUNCT, c dest write dest, out loop }
 Version 2

[Quick tutorial: Mu is like Assembly. Lines contain a single operation and multiple operands. The operation is either the first word following the ‘’, or the first word if no ‘’ exists. Operands may be inputs or outputs, the latter coming before the ‘’. Operands optionally state their type after a colon. The ‘break’ operation jumps to the enclosing ‘}’, while ‘loop’ jumps to the previous enclosing ‘{’. Jumps have conditional variants with suffixes ‘-if’ and ‘-unless’ which check their first operand. Comments in blue following ‘#’ are ignored. For more details, see Mu's Readme.]

[We'll quietly improve on how the original communicates with its environment. Instead of the decompressor reading input using ‘getchar’ and the parser writing output using ‘add_to_token’ and ‘got_token’, we'll ‘read’ from and ‘write’ to ‘src’ and ‘dest’ streams, respectively. It seems clearer to reserve ‘getchar’ for communicating between decompressor and parser.]

[The original had a minor bug in the parser: it didn't check for ‘EOF’ while accumulating a ‘WORD’. This is fixed above.]

[These examples aren't quite runnable Mu code. Click on the links below them to get the complete versions, along with instructions for running them.]

Both these code fragments are relatively easy to understand. One produces a character at a time by calling ‘emit’; the other consumes a character at a time by calling ‘getchar’. If only the calls to ‘emit’ and the calls to ‘getchar’ could be made to feed data to each other, it would be simple to connect the two fragments together so that the output from the decompressor went straight to the parser.

In many modern operating systems, you could do this using pipes between two processes or two threads. ‘emit’ in the decompressor writes to a pipe, and ‘getchar’ in the parser reads from the other end of the same pipe. Simple and robust, but also heavyweight and not portable. Typically you don't want to have to divide your program into threads for a task this simple.

[Mu supports concurrency using Go-style channels, which you could use to run decompressor and parser in separate “routines”. But we'll ignore that option as well. Coroutines allow us to be oblivious to a thread scheduler.]

Rewriting

The conventional answer is to rewrite the producer or consumer as a function that can be called. Here's what that may look like for each side above:

def decompressor src, repchar, replen [
  { # if (replen > 0)
    in-rep?:bool  greater-than replen, 0
    break-unless in-rep?
    replen  subtract replen, 1
    return
  }
  # if (c == EOF) return EOF;
  c:char, eof?:bool  read src
  return-if eof?, EOF
  # if (c != 0xFF) return c;
  repeat-tag?:bool  equal c, 0xFF
  return-unless repeat-tag?, repchar=c, replen=0
  # else if (c == 0xFF)
  replen  read src
  repchar  read src
  replen  subtract replen, 1
]



































complete working sources: Version 2 
def parser c, dest, state [
  word:buffer  get state, buf:offset
  { # handle EOF
    eof?:bool  equal c, EOF
    break-unless eof?
    # emit final word if necessary
    {
      break-unless word
      out:token  merge WORD, word
      dest  write dest, out
    }
    close dest
    return
  }
  alpha?:bool  isalpha c
  in-word?:bool  get state, IN_WORD
  { # case IN_WORD
    break-unless in-word?
    # if (isalpha(c))
    break-unless alpha?
    # continue pending word
    word  append word, c
    state  merge IN_WORD, word
    return
  }
  { # case IN_WORD
    break-unless in-word?
    # if (!isalpha(c))
    break-if alpha?
    # finish pending word
    out:token  merge WORD, word
    dest  write dest, out
    # fall through; haven't processed 'c' yet
  }
  { # process 'c'
    break-if alpha?
    # emit this punctuation
    out  merge PUNCT, c
    dest  write dest, out
    # reset state
    clear word
    state  merge START, word
    return
  }
  { # case START
    break-if in-word?
    break-unless alpha?
    # start new word
    word  append word, c
    state  merge IN_WORD, word
    return
  }
]
 Version 1

[Mu has neither globals nor C's static variables, so we thread them as input/output parameters to our functions and let the caller manage them. We mark these variables in red.]

Of course you don't have to rewrite both of them; just one will do. If you rewrite the decompressor in the form shown, so that it returns one character every time it's called, then the original parser code can replace calls to ‘getchar’ with calls to ‘decompressor’, and the program will be happy. [Version 2] Conversely, if you rewrite the parser in the form shown, so that it is called once for every input character, then the original decompression code can call ‘parser’ instead of ‘emit’ with no problems. [Version 1] You would only want to rewrite both functions as callees if you were a glutton for punishment. [This is much more obvious in these Mu fragments than in Simon Tatham's original C.]

And that's the point, really. Both these rewritten functions are thoroughly ugly compared to their originals. Both of the processes taking place here are easier to read when written as a caller, not as a callee. Try to deduce the grammar recognised by the parser, or the compressed data format understood by the decompressor, just by reading the code, and you will find that both the originals are clear and both the rewritten forms are less clear. It would be much nicer if we didn't have to turn either piece of code inside out.

Knuth's coroutines

In The Art of Computer Programming, Donald Knuth presents a solution to this sort of problem. His answer is to throw away the stack concept completely. Stop thinking of one process as the caller and the other as the callee, and start thinking of them as cooperating equals.

In practical terms: replace the traditional "call" primitive with a slightly different one. The new "call" will save the return value somewhere other than on the stack, and will then jump to a location specified in another saved return value. So each time the decompressor emits another character, it saves its program counter and jumps to the last known location within the parser - and each time the parser needs another character, it saves its own program counter and jumps to the location saved by the decompressor. Control shuttles back and forth between the two routines exactly as often as necessary.

This is very nice in theory, but in practice no mainstream language supports the coroutine call primitive. Languages like C depend utterly on their stack-based structure, so whenever control passes from any function to any other, one must be the caller and the other must be the callee.


[Here this post strikes out away from Simon Tatham's original.]

Implementing coroutines in Mu

In Mu, local variables aren't allocated on the stack. Instead, the stack is just a barebones call stack, usually containing just two addresses: the return address, and a pointer to a heap-allocated stack frame which contains arguments and local variables. Stack frames of local variables usually get deallocated when a function returns, but may outlive their associated function. (This approach is well-studied under the name “spaghetti stacks”.) Two mechanisms so far have benefited from this additional flexibility: closures and (our star today) continuations.

Continuations are copies of a stack that allow one to pause and resume a computation. Mu provides the ability to create delimited continuations using two primitives: ‘call-with-continuation-mark’ and ‘return-continuation-until-mark’. The ‘call-with-continuation-mark’ operator is like ‘call’ (which calls a dynamic function pointer), but first bookmarks (marks) the current point on the call stack. The ‘return-continuation-until-mark’ operator copies the current call stack until the mark, unwinds the call stack until the mark, and returns this stack segment or continuation as the result of the corresponding ‘call-with-continuation-mark’. Intervening stack frames don't return; the computation they represent is effectively paused. To resume it, just ‘call’ the continuation. That will cause the segment of stack copied earlier to be pushed to the top of the stack, resuming the computation from where it was paused. (Example program, worth playing with for a few minutes if you're new to continuations.)

Delimited continuations are a powerful mechanism, well-studied by others. Some useful properties:

The second property suggests how we may build coroutines out of continuations. We start with a regular caller and callee. Every call to the callee returns a continuation in addition to its other results. The continuation represents progress in the callee, and calling it resumes the callee where it last left off.

Evaluation

Here's what the fragments at the top of this page look like when using this approach (abbreviating ‘call-with-continuation-mark’ as ‘call/cm’, and drawing a box around ‘callee-related’ variables):

def decompressor src, dest [
  emit:continuation  call/cm parser, dest   
  {
    c:char, eof?:bool  read src
    break-if eof?
    repeat-tag?:bool  equal c, 0xFF
    {
      break-unless repeat-tag?
      len:num  read src
      c:char  read src
      i:num  copy 0
      {
        done?:bool  equal i, len
        break-if done?
        emit  call emit, c
        i  add i, 1
        loop
      }
    }
    {
      break-if repeat-tag?
      emit  call emit, c
    }
    loop
  }
  call emit, EOF
]
complete working sources: Version 3a 
def parser src, dest [
  getchar:continuation  call/cm decompressor, src
  {
    getchar, c:char  call getchar
    eof?:bool  equal c, EOF
    break-if eof?
    {
      alpha?:bool  isalpha c
      break-unless alpha?
      word:buffer  new buffer
      {
        word  append word, c
        getchar, c  call getchar
        eof?  equal c, EOF
        break-if eof?
        alpha?  isalpha c
        loop-if alpha?
      }
      out:token  merge WORD, word
      dest  write dest, out
      return-if eof?
    }
    out:token  merge PUNCT, c
    dest  write dest, out
    loop
  }
]
 Version 3b

The major change is that ‘emit’ and ‘getchar’ are now values rather than functions. These values encapsulate the state of the callee, and you have to thread them through the caller. In return, you get reentrant coroutines; you could track multiple executions of them in parallel, stopping and resuming them as you like.

The criss-crossing ‘callee’ versions are now identical to our initial fragments. We just need to trivially define ‘emit’ and ‘getchar’ (this time ‘caller-related’ variables):

def decompressor src [
  {
    c:char, eof?:bool  read src
    break-if eof?
    repeat-tag?:bool  equal c, 0xFF
    {
      break-unless repeat-tag?
      len:num  read src
      c:char  read src
      i:num  copy 0
      {
        done?:bool  equal i, len
        break-if done?
        emit c
        i  add i, 1
        loop
      }
    }
    {
      break-if repeat-tag?
      emit c
    }
    loop
  }
  emit EOF
]
def parser dest [
  {
    c:char  getchar
    eof?:bool  equal c, EOF
    break-if eof?
    {
      alpha?:bool  isalpha c
      break-unless alpha?
      word:buffer  new buffer
      {
        word  append word, c
        c  getchar
        eof?  equal c, EOF
        break-if eof?
        alpha?  isalpha c
        loop-if alpha?
      }
      out:token  merge WORD, word
      dest  write dest, out
      return-if eof?
    }
    out:token  merge PUNCT, c
    dest  write dest, out
    loop
  }
]
def emit c:char [
  return-continuation-until-mark c
]

complete working sources: Version 3b 
def getchar [
  c:char  return-continuation-until-mark
  return c
]
 Version 3a

Basically, you can either implement ‘emit’ as a continuation and ‘getchar’ as a thin wrapper around ‘return-continuation-until-mark’, or vice versa. Any use of coroutines should be able to follow this symmetric template.

One drawback of coroutines (and continuations more generally) in Mu is that function calls no longer look like their definitions. In spite of that caveat, the process of writing this post strengthened Simon Tatham's original case for me: it's worth giving programmers the option of coroutines in a language, without forcing them to drop down to raw Assembly.


Nov 27, 2017
Portions of this document © 2000 Simon Tatham.
This document is OpenContent.
You may copy and use the text under the terms of the OpenContent Licence.
Please send comments and criticism to Kartik Agaram.