Nov 30, 2019
Update on the Mu computer's memory-safe language
In the last 2 weeks I went from being able to translate:
fn foo {
}
to:
fn foo n: int {
increment n
}
That required 2k LoC, so it seems fair to say that things are still in the "black triangles" phase. And there are still gaping holes. Variable declarations don't really exist yet. (So far I can only parse them.)
(Project page)
* *
Nov 19, 2019
After banging my head on a problem at work all day, the answer came to me in a flash of insight on the way home. I spent all day repeatedly running experiments on my program, inserting complex sequences of breakpoints, emitting large traces, gradually refining and automating a whole complex workflow so it could be more easily repeated after making changes to my program. I had more ideas for things to try later in the night, but the insight short-circuited them.
One voice in my head (the one often active when interacting in this forum) whispers that if only I had better tools the process could have been shortened.
Another voice in my head whispers that I'm stupid for taking so long to figure out something some putative body else would find obvious. ("If deleting no-op nodes in a dependency graph causes nodes to fire before they're ready, that means some edges are being spuriously cut.") Or maybe I'm rusty, because I don't work anymore with graphs 12 years after finishing grad school.
But the dominant voice in my head is just elation, the flush of insight, of having tamed a small portion of the wilderness around me and inside my own head. And it wouldn't have happened without struggling for a while with the wilderness, no matter what tools I had. A big portion of today was spent trying to visualize graphs and finding them too large for my tools to handle. So I had to resort to progressively more and more precise tools. Text-mode scalpels over graphical assistants. And that process of going beyond what my regular tools can handle is a key characteristic of going out into the wilderness. When tools fail, the only thing left is to try something, see what happens, and think. No improvement in tools can substitute for the experience of having gone beyond your tools, over and over again.
There's a famous saying that insights come to the prepared mind. It's easy to read and watch Bret Victor and imagine that we are in the insight delivery business. But we're really in the mind preparing business.
* *
Nov 17, 2019
Update on the Mu computer's memory-safe language
Not much to report this week. Last week I implemented the instruction "increment x" when x is on the stack. This week I did "x <- increment" when x is a register.
(In Mu, registers written to are return values, while memory locations written to are just passed in by reference.)
The good news: I now have a core data structure in place for code-generating different instructions, and this includes static dispatch.
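To make "static dispatch" concrete, here is a minimal sketch in Python rather than Mu. It assumes the translator picks an emitter at translation time, keyed on the instruction name and on where the operand lives. The table, the function names and the Intel-style output strings are all hypothetical illustrations, not the actual Mu data structure or its output.

```python
# Hypothetical sketch: choose a code emitter at translation time based on
# the instruction name and the operand's storage class.

def inc_stack(operand: str) -> str:
    # Operand lives in memory on the stack, e.g. at "ebp+8".
    return f"inc dword ptr [{operand}]"

def inc_register(operand: str) -> str:
    # Operand lives in a register, e.g. "eax".
    return f"inc {operand}"

# (instruction name, storage class) -> emitter. Assumed layout.
EMITTERS = {
    ("increment", "stack"): inc_stack,
    ("increment", "register"): inc_register,
}

def translate(op: str, operand: str, storage: str) -> str:
    try:
        emit = EMITTERS[(op, storage)]
    except KeyError:
        raise ValueError(f"no variant of {op!r} for {storage!r} operands") from None
    return emit(operand)

print(translate("increment", "ebp+8", "stack"))
print(translate("increment", "eax", "register"))
```

Adding a new instruction variant then means adding one row to the table, and an unsupported combination fails at translation time instead of producing bad code.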
http://akkartik.name/post/mu-2019-2
* *
Nov 10, 2019
Update on the Mu computer's memory-safe language
First statements now translating!
http://akkartik.github.io/mu/html/linux/mu.subx.html
- fn foo {} => function prologue and epilogue
- increment x, when x is on the stack.
- foo x, when x is on the stack and foo is a user-defined function.
Seems like small progress, but.. http://rampantgames.com/blog/?p=7745
I feel like I'm finally starting to get closure on https://news.ycombinator.com/item?id=13608810#13610366
(Project page)
* *
Oct 30, 2019
Update on the Mu computer's memory-safe language
I wrote about it last week, but already that post is growing out of date. Here's an initial sketch.
So far all that works is function definitions with no arguments or body. They emit just the prologue and epilogue code sequences.
But I have a sketch of the next few steps of the design in there.
The design has changed in 2 ways.
#1: no more uninitialized variables. There's just no safe way to avoid initialization when structs can contain pointers.
This adds overhead to local variables in two ways: performance, and translator complexity for length-prefixed arrays. We can't just zero out everything; we also have to write array lengths in. Since structs may contain arrays, initializing a variable may involve writing lengths to multiple locations.
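A rough sketch of why one variable can need several length writes, in Python rather than Mu. It assumes 4-byte ints, a 4-byte length prefix in front of each array, lengths stored as element counts, and fields laid out contiguously with no padding. The type classes and function names are hypothetical.

```python
from dataclasses import dataclass

WORD = 4  # assumed size in bytes of an int and of a length prefix

@dataclass(frozen=True)
class Int:
    pass

@dataclass(frozen=True)
class Array:
    elem: object
    count: int

@dataclass(frozen=True)
class Struct:
    fields: tuple

def size_of(t) -> int:
    if isinstance(t, Int):
        return WORD
    if isinstance(t, Array):
        return WORD + t.count * size_of(t.elem)  # prefix + payload
    if isinstance(t, Struct):
        return sum(size_of(f) for f in t.fields)
    raise TypeError(t)

def length_writes(t, base: int = 0) -> list:
    """Return (offset, length) pairs for every length prefix inside t."""
    if isinstance(t, Int):
        return []
    if isinstance(t, Array):
        writes = [(base, t.count)]
        offset = base + WORD
        for _ in range(t.count):  # each element may contain arrays too
            writes += length_writes(t.elem, offset)
            offset += size_of(t.elem)
        return writes
    if isinstance(t, Struct):
        writes, offset = [], base
        for f in t.fields:
            writes += length_writes(f, offset)
            offset += size_of(f)
        return writes
    raise TypeError(t)

# A struct with an int followed by two arrays needs two length writes.
example = Struct((Int(), Array(Int(), 3), Array(Int(), 2)))
print(size_of(example), length_writes(example))
```

Everything outside the returned offsets can be zeroed; the translator would then emit one store per pair.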
#2: unwinding the stack when exiting a scope. I can't just increment the stack pointer, I need to also restore any necessary registers along the way.
Another complication is that I need Java's labeled breaks. Since break does double-duty for both conditionals and loops, it's too onerous to only be able to break out of one scope at a time. But labeled breaks increase the frequency with which I need to unwind the stack. More overhead.
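The unwinding described above could look roughly like the following sketch, in Python rather than Mu. It assumes every local occupies one 4-byte stack slot, and that a local bound to a register keeps that register's previous value in its slot, so leaving the scope means popping the slot back into the register. The Scope class, the label syntax and the Intel-style instruction strings are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Scope:
    label: str
    # (variable name, register or None), in declaration order
    locals: list = field(default_factory=list)

def unwind_to(scopes: list, label: str) -> list:
    """Emit cleanup for every scope from the innermost out to `label`,
    then a jump to that scope's exit. `scopes` is outermost-first."""
    out = []
    for scope in reversed(scopes):
        for _name, register in reversed(scope.locals):
            if register is not None:
                out.append(f"pop {register}")  # restore saved register
            else:
                out.append("add esp, 4")       # drop a stack slot
        if scope.label == label:
            out.append(f"jmp {label}:break")
            return out
    raise KeyError(label)

scopes = [
    Scope("outer", [("x", None), ("y", "ecx")]),
    Scope("inner", [("z", "eax")]),
]
print(unwind_to(scopes, "inner"))
print(unwind_to(scopes, "outer"))
```

A plain break only walks one scope. A labeled break walks every scope up to the target, which is where the extra overhead comes from.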
The common theme here seems to be that the translator needs to maintain 'symbolic' representations of contiguous memory regions, tracking either values or register bindings for offsets.
I'm not sure how I feel about these choices. Perhaps I should give up on this whole design and switch to something more traditional. A memory-safe C-like language. So I'd love to hear feedback and suggestions.
Project link
* *
Oct 25, 2019
Starting to wrestle with the problem of safe, efficient array initialization.
Here's Rust
* *
Oct 21, 2019
Lots of interesting discussion and feedback about my "level 1" language over the last few days. Now that it's starting to settle down I'm starting to work on my "level 2" language: type-safe, memory-safe, manually register-allocated, maps mostly 1:1 to machine code.
Since I'm building it out of machine code, memory management is a perennial concern. My parsing in level 1 has mostly used static arrays. But now I think I'm going to switch to dynamic linked lists and trees. Leak some memory.
* *
Oct 15, 2019
Update on the Mu computer
I just wrote up a summary of the state of Mu, in two parts.
Part 1 summarizes the past year as a sequence of major design decisions:
http://akkartik.name/post/mu-2019-1
Part 2 is a sketch of what I plan to build next, again structured as a sequence of design decisions:
http://akkartik.name/post/mu-2019-2
(The flow from design constraints to decisions is inspired by Christopher Alexander.)
Any and all feedback appreciated. I'd like it to be clear to any programmer.
* *
Oct 2, 2019
I'm thinking about https://zge.us.to/dirconf.html
What if cat-ing a directory rendered its contents as a structured file?
First reaction: get rid of directories altogether. But it seems useful to firewall off different kinds of content from each other.
Still, the file system could support treating files as dirs.
It seems useful to have consistent lexical conventions spanning paths and code: '#' for comments; '.' for lookup; '/' for metadata. E.g. to look up gitconfig:
cat ~.conf.git.core.pager
Metadata is a new idea here; I use it extensively in my Mu project. In this context, one possible use for it is extensions. Rewriting the above example:
cat ~.conf.git/yaml.core.pager
Swapping the usual meanings of '/' and '.' in Unix seems maximally confusing. I'm choosing here to preserve the meaning of '.' in source code. But that may be the wrong choice.
Anyways, back to metadata. It permits multiple extensions. Say for a MS Word doc:
thesis/doc/docx/doc-v2007
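One way to read the convention above is as a tiny grammar: '.' separates lookup steps, and within a step '/' attaches metadata tags to a name. Here is a minimal Python sketch of a parser for it. The function name and the output shape are assumptions for illustration, and names that themselves contain '.' or '/' are not handled.

```python
def parse_path(path: str) -> list:
    """Split a path into (name, [metadata, ...]) steps.

    '.' separates lookup steps; '/' attaches metadata to a step.
    """
    steps = []
    for segment in path.split("."):
        name, *metadata = segment.split("/")
        steps.append((name, metadata))
    return steps

print(parse_path("~.conf.git/yaml.core.pager"))
print(parse_path("thesis/doc/docx/doc-v2007"))
```

In the first path only the "git" step carries metadata ("yaml"). The second is a single step named "thesis" with three metadata tags.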
* *
Sep 30, 2019
I'm poking at https://github.com/ozkl/soso trying to figure out where it first switches from Ring 0 to Ring 3. I want to rip out all of that protection stuff and just run everything in Ring 0. Just as an exercise for starters, but also eventually because I have.. notions.
After various attempts to grep, the current plan: I'm going to just try to write to some protected address at various points in the kernel, and binary-search my way to the solution. Let's see how this goes.
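The search itself is an ordinary bisection. A sketch in Python, assuming the candidate points in the kernel can be put in execution order and that a protected write succeeds at every point before the switch and faults at every point after it. The probe callback here is a stand-in; in practice each probe is a rebuild and reboot with the write moved.

```python
def first_faulting_point(num_points: int, faults_at) -> int:
    """Return the index of the first probe point where the write faults,
    or num_points if it never does.

    Assumes faults_at(i) is False before the switch and True after it.
    """
    lo, hi = 0, num_points  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if faults_at(mid):
            hi = mid        # switch happened at or before mid
        else:
            lo = mid + 1    # still in Ring 0 at mid
    return lo

# Stand-in probe: pretend the switch happens just before point 7 of 10.
probes = []
def fake_probe(i: int) -> bool:
    probes.append(i)
    return i >= 7

print(first_faulting_point(10, fake_probe), probes)
```

With 10 candidate points this takes at most 4 probes, which matters when each one costs a reboot.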
https://github.com/akkartik/mu#readme
On the language side I've been thrashing a fair bit:
- Between the OS side vs the language side.
- Between building an interpreted, dynamically-typed language in machine code vs a compiled, statically-typed memory-safe language that I can build the interpreted language out of.
- Between building a Lisp interpreter vs a better shell (in the spirit of Oil shell).
- Between making local vars in the compiled language function-scoped vs block-scoped.
* *