Jul 6, 2014: commit 0, tree-based interpreter for a statement-oriented language
Jul 19, 2017: commit 3930, start of SubX machine code
Sep 20, 2018: started building SubX in SubX
Jul 24, 2019: SubX in SubX done, commit 5461
Oct 2, 2019: started designing the Mu memory-safe language
Oct 29: started http://akkartik.github.io/mu/html/apps/mu.subx.html
https://raw.githubusercontent.com/akkartik/mu/master/mu_instructions
It's not clean. Mu isn't a clean, well-designed language, because it's designed to map 1:1 onto x86, and x86 is not a clean, well-designed instruction set.
But this sort of 1-page summary of a compiler is something I've always wished I had. Something that doesn't tell you what to type out and then pretend you understand compilers.
Basic language is done! Here's factorial. (Compare with SubX.)
Still todo:
- user-defined types
- type checking and memory-safety
In other words, I'm about a third of the way there 😂 More detailed todo list.
(More details on the Mu project. Repo)
I should probably highlight register names. Here's an updated screenshot.
(Yes, in Mu you manually allocate registers. Mu will eventually check your allocation.)
Still no type-checking or memory-safety, but we now have local variables.
http://akkartik.name/post/mu-2019-2
https://github.com/akkartik/mu
It would be really nice to be able to avoid null pointers by construction. But providing opt-in null pointers would require option types.
Option types can be seen as a special case of sum types (tagged unions) but without needing an explicit definition for each unique type. I like Ceylon's generalization, which lets one use types like int|bool.
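To make the idea of opt-in null pointers as an anonymous union concrete, here is a small sketch in TypeScript rather than Mu, since TypeScript happens to have inline unions similar to Ceylon's. The function names are hypothetical, and it assumes the compiler's strict null checking is turned on.

```typescript
// Hypothetical helper: the return type is an anonymous union declared inline,
// with no separate option-type definition.
function indexOf(xs: number[], target: number): number | null {
  for (let i = 0; i < xs.length; i++) {
    if (xs[i] === target) return i;
  }
  return null; // "no result" is part of the type, so callers must handle it
}

// The caller has to rule out null before using the result as a number.
function indexOrDefault(xs: number[], target: number, fallback: number): number {
  const found = indexOf(xs, target);
  return found === null ? fallback : found;
}
```

A function that returns plain number can never hand back null here; null only shows up where the signature opts in.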
One interesting idea here is that anonymous unions are to sum types as tuples are to product types. The only wrinkle: it seems natural to refer to the variants of an anonymous union by type (you can't have int|int), but tuples by position ((int, int) is a common use case).
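The by-type versus by-position asymmetry can be seen in TypeScript, used here only as a stand-in since Mu has neither feature yet. The function names are made up for illustration. In TypeScript, number | number simply collapses to number, which matches the "you can't have int|int" observation.

```typescript
// Variants of an anonymous union are picked out by their type...
function describe(v: number | boolean): string {
  if (typeof v === "number") return "int " + v;
  return "bool " + v;
}

// ...while the fields of a tuple are picked out by position,
// so two fields of the same type cause no ambiguity.
function sum(pair: [number, number]): number {
  return pair[0] + pair[1];
}
```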
I'm also thinking about Rich Hickey's criticism of Haskell, that it should be possible to pass in an int to a function that expects an int|bool. That requires checking types based on their structure rather than their names.
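A rough sketch of the difference, again in TypeScript as a stand-in: the first function takes a structural union and accepts a bare number, while the second mimics a tagged sum with named constructors, where the caller has to wrap the value first. All names here are hypothetical.

```typescript
// Structural (untagged) union: a plain number is already a valid argument.
function showUntagged(v: number | boolean): string {
  return typeof v === "number" ? "int" : "bool";
}

// Tagged sum, in the style of a data declaration with named constructors:
// the caller must wrap the value in a variant before passing it.
type IntOrBool =
  | { tag: "int"; value: number }
  | { tag: "bool"; value: boolean };

function showTagged(v: IntOrBool): string {
  return v.tag;
}

const answer: number = 42;
const direct = showUntagged(answer);                       // no wrapping needed
const wrapped = showTagged({ tag: "int", value: answer }); // explicit wrapping
```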
But I'm reluctant to permit passing in a type point3D to a function expecting a point2D just because the member names are a superset. Perhaps structure should only be checked for anonymous types.
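TypeScript checks named types structurally too, so it shows exactly the case being worried about. The sketch below also includes a "brand" field, a common TypeScript workaround that makes a named type behave nominally; that part is an assumption added for illustration, not a proposal from these posts.

```typescript
type Point2D = { x: number; y: number };
type Point3D = { x: number; y: number; z: number };

function lengthSquared2D(p: Point2D): number {
  return p.x * p.x + p.y * p.y;
}

const p3: Point3D = { x: 3, y: 4, z: 12 };
// Accepted by structural typing: z is silently ignored.
const silentlyAccepted = lengthSquared2D(p3);

// Hypothetical workaround: a brand field that only real 2D points carry.
type BrandedPoint2D = { x: number; y: number; readonly brand: "Point2D" };

function lengthSquaredBranded(p: BrandedPoint2D): number {
  return p.x * p.x + p.y * p.y;
}

// lengthSquaredBranded(p3) is rejected by the type checker: p3 has no brand.
const p2: BrandedPoint2D = { x: 3, y: 4, brand: "Point2D" };
const checked = lengthSquaredBranded(p2);
```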
Should we be able to pass in anonymous types anywhere the language expects a type? In members of user-defined types? Any constraints seem surprising.
By now we're well into the territory of features that I'm not sure will see much adoption, all because I wanted to provide clean concepts without surprising limitations.
I'm curious to hear where others would draw the line. How much of this seems reasonable, and how much excessive architecture astronaut-ism?
- The Right Way is for product types to be nominative and sum types to be structural.
- Maybe we need tags for product types as well? Then unify types on the names not of types but of their constituent tags, whether sums or products.
e.g. Foo and Bar can be automatically coerced in:
type Foo = A int * B boolean
type Bar = A int * B boolean
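One way to read the tag-based suggestion, sketched in TypeScript with record fields standing in for the tags A and B. The mapping from the notation above to record types is my assumption, and takesBar is a made-up function.

```typescript
// Two separately named types whose constituent tags (A, B) and types match.
type Foo = { A: number; B: boolean };
type Bar = { A: number; B: boolean };

function takesBar(b: Bar): string {
  return b.A + ":" + b.B;
}

const foo: Foo = { A: 1, B: true };
// Accepted: the type names differ, but unification happens on the tags.
const fooAsBar = takesBar(foo);

// A type with the same shape but different tags, e.g. { C: number; D: boolean },
// would be rejected by the type checker if passed to takesBar.
```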
Still no type-checking or memory-safety, but we can now write any program that uses only int variables.
There's still no 'var' keyword, so we can't define local variables yet. But that's not insurmountable; just pass in extra arguments for any space you want on the stack 😀
result <- factorial n 0 0 0
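The trick of passing extra arguments as scratch space translates to most languages where parameters are ordinary mutable locals. Here is a sketch in TypeScript; what the three extra arguments hold is a guess on my part, and the parameter names are hypothetical.

```typescript
// The caller passes 0 for each extra parameter; the function then treats
// those parameters as local variables living in its stack frame.
function factorial(n: number, result: number, i: number, spare: number): number {
  result = 1; // scratch slot 1: running product
  i = 2;      // scratch slot 2: loop counter
  spare = n;  // scratch slot 3: unused beyond this copy, shown only for symmetry
  while (i <= spare) {
    result = result * i;
    i = i + 1;
  }
  return result;
}

const fiveFactorial = factorial(5, 0, 0, 0);
```

The obvious cost is that every caller has to know how many locals the callee needs.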
Progress has been slow over the holiday season because I've been working on a paper about Mu for https://2020.programming-conference.org/home/salon-2020
But functions can now return outputs.
fn foo a: int -> result/eax: int {
  result <- copy a
  increment result
}
Sources for the memory-safe language, now at 5kLoC.
Caveats: no checking yet, only int types supported.