Mar 31, 2018
List comprehensions in Anarki Lisp

or,

Example #4392 of building recursive functions

(Problem posed by musk_fan.)

Goal: we'd like to add Python-style list comprehensions to Anarki (Arc Lisp) (using a syntax inspired by Common Lisp's iterate macro):

arc> (collect i (for i from 1 to 10))
(1 2 3 4 5 6 7 8 9 10)

As a first step, let's try to transform this to:

(accum acc
  (for i from 1 to 10
    (acc i)))                           ...[1]

This isn't quite the right Arc syntax to get our result, but it's relatively straightforward to convert it to:

(accum acc
  (for i 1 (<= i 10) ++.i
    (acc i)))                           ...[2]

So we'll call 1 our initial goal. Let's elaborate on it with a few more examples.

arc> (collect i (for i from 1 to 10)
                (if odd.i))
(1 3 5 7 9)
 
arc> (collect (list i j)
              (for i from 1 to 3)
              (for j from 1 to 3)
              (if (< i j)))
((1 2) (1 3) (2 3))

We'd like the above calls to be rewritten to the following:

(accum acc
  (for i from 1 to 10
    (if odd.i
      (acc i))))
 
(accum acc
  (for i from 1 to 3
    (for j from 1 to 3
      (if (< i j)
        (acc i)))))

The pattern seems to be to 'suck' each successive guard through the ')' at the end of the previous one. If we can do that, it's relatively easy to slap on the accum acc on top and the acc i below. So let's build such a helper function. We'll call it ingest:

arc> (ingest)
nil
arc> (ingest '(1 2 3))
(1 2 3)
arc> (ingest '(1 2 3)
             '(4 5 6))
(1 2 3 (4 5 6))
arc> (ingest '(for i from 1 to 3)
             '(if odd.i))
(for i from 1 to 3 (if odd.i))

How to accomplish this? Let's proceed inductively.

arc> (ingest)
nil

This one's easy. If we get no lists we return nil. So the basic skeleton of the program will be:

(def ingest lists
  (when lists
    ...))

When lists is nil we never go into the ..., so we're done.

A slightly more complex example:

arc> (ingest '(1 2 3))
(1 2 3)

This one's again easy. If we get a single list we return it. Done.

(def ingest lists
  (when lists
    (ret result car.lists
      (when cdr.lists
        ...))))

Again, with a single list cdr.lists will be nil, so we'll never enter the ..., and result will be returned as is.

Now the hard case: two or more arguments.

arc> (ingest '(1 2 3) '(4 5 6))
(1 2 3 (4 5 6))

What do we put into the ...? We've already set ourselves up to return the first list. So we'll modify it before we return it. We'll append the new list to it.

(def ingest lists
  (when lists
    (ret result car.lists
      (when cdr.lists
        (nappend result lists.1)))))

This works for two arguments, but not more. However, with experience I know that I've done the hard work, and just need to find a way to reuse it. In other words, I need to replace the reference to the second argument lists.1 with a recursive call somehow.

The trick is to notice that lists.1 can be replaced with (ingest lists.1), because ingest on a single argument just returns it. So that'll be the starting point for our recursive call.

(def ingest lists
  (when lists
    (ret result car.lists
      (when cdr.lists
        (nappend result (ingest lists.1))))))

This still doesn't handle more than 2 arguments. But let's take a flyer and try passing in the remaining arguments anyway:

(def ingest lists
  (when lists
    (ret result car.lists
      (when cdr.lists
        (nappend result (apply ingest cdr.lists))))))

(We need to use apply because cdr.lists has lists “packed” into a list, whereas ingest expects its arguments “unpacked”.)

This works!

arc> (ingest)
nil
arc> (ingest '(1 2 3))
(1 2 3)
arc> (ingest '(1 2 3) '(4 5 6))
(1 2 3 (4 5 6))
arc> (ingest '(1 2 3) '(4 5 6) '(7 8 9))
(1 2 3 (4 5 6 (7 8 9)))
arc> (ingest '(for i from 1 to 3) '(if odd.i))
(for i from 1 to 3 (if odd.i))

(It may seem a little scary to just try an argument and hope for the best. But this is how I come up with recursive functions: at some point I assume I already have the function, and just use it like any other function. I "fake it till I make it". And then I "trust but verify" by testing it a lot and trying to break it. Usually when my attempt fails it's because I didn't yet have a clear declarative description in my head of what the function does with its arguments. For example, when I first built ingest, I initially forgot to use apply. When I discover such a disconnect between how I defined the function and how I want to call it, I may sometimes end up changing the header to make it more convenient to call. It's not very systematic, but it's all I have. What can I tell you, I monkey with code until I get it working. The dirty secret is out at last.)


Ok, now that we have ingest working, let's start putting collect together. As the last example shows, the first thing we need to do is go from:

(for i from 1 to 3
  (if odd.i))

to:

(accum acc
  (for i from 1 to 3
    (if odd.i
      (acc i))))

Here's an initial draft of collect that does this:

(mac collect (var . guards)
  `(accum acc
     ,(apply ingest
        (join guards (list `(acc ,var))))))
 
arc> (macex1 '(collect i (for i from 1 to 10)))
(accum acc (for i from 1 to 10 (acc i)))

The key is to move var to the end of guards before we call ingest. The rest is just massaging the output.

This seems a good place to point out that we don't always have to collect the variable being iterated over.

arc> (collect (list i j)
              (for i from 1 to 3)
              (for j from 1 to 3)
              (if (< i j)))
((1 2) (1 3) (2 3))
 
arc> (collect (* i 2)
              (for i from 1 to 10))
(2 4 6 8 10 12 14 16 18 20)

This seems like a useful feature. And really there's nothing new we need to do to support it. We'll just rename the first argument to highlight this possibility:

(mac collect (expr . guards)
  `(accum acc
     ,(apply ingest
             (join guards (list `(acc ,expr))))))

We're getting pretty close. But our macro doesn't run just yet. This is because while I'd like to say ‘for i from 1 to 10’, that isn't actually valid Arc code. Instead, it needs to be the more C-like:

(for i 1 (<= i 10) ++.i
  ...)

In fixing this I recall the example of Common Lisp's iterate above, and include room for other such mangling:

(mac collect (expr . guards)
  `(accum acc
      ,(apply ingest
              (join (map collect-transform guards)
                    (list `(acc ,expr))))))
 
(def collect-transform (guard)
  (if (is 'for guard.0)
        ; guard is of the form ‘(for _ from _ to _)’
        `(for ,guard.1 ,guard.3 (<= ,guard.1 ,guard.5)
              (++ ,guard.1))
      ; you can insert other syntax here if you want
      'else
        guard))

That's my final version: you can grab all the code, along with instructions to run it, at https://gitlab.com/snippets/1707258.

Making the big picture easy to see, in software and in society at large.
selected
Code
Prose (shorter; favorites)