An R7RS-small implementation in Rust.
If anything useful comes out of this, however unlikely, it will be 100% thanks to these people/resources:
There is a small REPL you can simply access by running the default binary:
    $ cargo run
    user> ((lambda (x) (/ x 4)) 28)
    7
    user> ((lambda (x) (/ x 4)) 28.1)
    7.025
The old plan is left below to let everyone see my analysis struggle. Perfect is the enemy of good, and Pest definitely gives me good enough results and error recovery to work with. So instead I'll write my own AST type and a conversion function that walks through Pest pairs to rebuild a proper AST from there. This way the "Pest leaking" issue will disappear.
The parser is handwritten.
The first version was based on Pest, but I had multiple issues:
At that point, I had the choice between handwriting the parser and using a parser combinator library like nom or chumsky. I chose to use a parser combinator library.
At first I thought that handwriting the parser was the way to go because:
After spending a few minutes writing the scanner by hand, it became obvious that
the formal syntax and grammar in the reference document translate much more
easily to a parser combinator library than to a handwritten parser where I redo
all the rules myself. For example, token is supposed to try boolean before
matching #u8(, but at the same time there are rules in the middle, like string,
that do not start with #. So that means I need to split the logic that deals
with # when it's encountered in a stream. Another small issue is that I had to
wrap everything in a unicode_segmentation::Graphemes iterator to deal properly
with Unicode; that might still be an issue with parser combinator libraries,
though.
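To make the # problem concrete, here is a minimal sketch of a hand-written dispatch for tokens that start with #. It is not the project's scanner: HashToken and scan_hash are hypothetical names, only a subset of the R7RS-small # tokens is covered, and delimiter checks and named characters such as #\space are left out on purpose.

```rust
/// A subset of the R7RS-small tokens that start with `#` (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum HashToken {
    Boolean(bool),
    VectorOpen,      // #(
    BytevectorOpen,  // #u8(
    Character(char), // #\a (named characters are not handled here)
}

/// Tries to scan one `#`-prefixed token at the start of `input`.
/// Returns the token and the number of bytes consumed, or `None` if the
/// input does not start with a token this sketch knows about.
fn scan_hash(input: &str) -> Option<(HashToken, usize)> {
    let rest = input.strip_prefix('#')?;

    // Fixed spellings, longest first, so `#true` is not read as `#t` + `rue`.
    let fixed = [
        ("true", HashToken::Boolean(true)),
        ("false", HashToken::Boolean(false)),
        ("t", HashToken::Boolean(true)),
        ("f", HashToken::Boolean(false)),
        ("u8(", HashToken::BytevectorOpen),
        ("(", HashToken::VectorOpen),
    ];
    for (spelling, token) in fixed {
        if rest.starts_with(spelling) {
            // 1 byte for `#` plus the spelling itself.
            return Some((token, 1 + spelling.len()));
        }
    }

    // `#\` followed by any single character.
    let mut chars = rest.strip_prefix('\\')?.chars();
    let c = chars.next()?;
    Some((HashToken::Character(c), 2 + c.len_utf8()))
}
```

Every # case ends up in one function, while string and the other non-# rules have to live elsewhere. That split is what the grammar does not show directly.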
Granted, I lose the immediate payoff of having crazy good recovery and error handling, but that can be added in a later step.
For the time being, I plan on using
rust-gc to obtain garbage collection
on my values. At first I thought I could just wrap the runtime Rust values in
Gc<GcCell<>> and be done with it, but this doesn't really support the
continuations I want to implement very well; continuations have been the main
roadblock that triggered the rewrite.
The current plan for memory management is to host the runtime stack on the garbage-collected heap, so that closures (using Upvalues like Lua and Lox in Crafting Interpreters) and continuations (by actually managing StackFrames manually on the garbage-collected heap) can be more easily done, even if performance is trash.
That would mean everything would still be Gc<GcCell<T>>, except that T would
cover not only runtime values like before, but also stack and tracing-related
things.
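The idea of hosting stack frames on the heap can be sketched roughly as follows. This is only an illustration under loud assumptions: Rc<RefCell<T>> stands in for rust-gc's Gc<GcCell<T>> so the example builds with the standard library alone (unlike a real garbage collector, it cannot reclaim cycles), and StackFrame, Continuation, push_frame, lookup and capture are hypothetical names, not the project's types.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// ASSUMPTION: stand-in for Gc<GcCell<T>>; Rc cannot collect reference cycles.
type Heap<T> = Rc<RefCell<T>>;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
}

/// One activation record living on the heap (hypothetical layout).
struct StackFrame {
    locals: HashMap<String, Value>,
    parent: Option<Heap<StackFrame>>,
}

/// Allocates a new frame whose caller is `parent`.
fn push_frame(parent: Option<Heap<StackFrame>>) -> Heap<StackFrame> {
    Rc::new(RefCell::new(StackFrame {
        locals: HashMap::new(),
        parent,
    }))
}

/// Looks a name up by walking the chain of frames towards the root.
fn lookup(frame: &Heap<StackFrame>, name: &str) -> Option<Value> {
    let mut current = Some(Rc::clone(frame));
    while let Some(f) = current {
        let found = f.borrow().locals.get(name).cloned();
        if found.is_some() {
            return found;
        }
        let next = f.borrow().parent.clone();
        current = next;
    }
    None
}

/// A captured continuation is just another pointer into the frame chain,
/// which keeps those frames alive after the callee has returned.
struct Continuation {
    frame: Heap<StackFrame>,
}

fn capture(frame: &Heap<StackFrame>) -> Continuation {
    Continuation {
        frame: Rc::clone(frame),
    }
}
```

Because frames are ordinary heap objects, capturing a continuation costs one pointer copy, and the frames stay reachable for as long as the continuation does. The price is an allocation per call, which matches the performance caveat above.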
This rewrite is almost from scratch, so the roadmap will move slowly.
For each extra bit of compliance we go through, usually the parser (which started from the standard spec) gets enhanced to give better lexing information so that the core can make adjustments easily.
This will probably need to be done with a trampoline or something
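For reference, a trampoline in Rust can be as small as the sketch below. Each step returns either a final value or a thunk for the next step, and a plain loop drives the thunks, so the host stack does not grow with the number of steps. Bounce, trampoline and sum_to are hypothetical names chosen for the example, not part of this project.

```rust
/// Result of one step: a final value, or a thunk that computes the next step.
enum Bounce<T> {
    Done(T),
    More(Box<dyn FnOnce() -> Bounce<T>>),
}

/// Runs the steps in a loop instead of through nested calls.
fn trampoline<T>(mut step: Bounce<T>) -> T {
    loop {
        step = match step {
            Bounce::Done(value) => return value,
            Bounce::More(thunk) => thunk(),
        };
    }
}

/// Tail-recursive sum of 1..=n written in bouncing style: instead of
/// calling itself directly, it hands the next call back to the trampoline.
fn sum_to(n: u64, acc: u64) -> Bounce<u64> {
    if n == 0 {
        Bounce::Done(acc)
    } else {
        Bounce::More(Box::new(move || sum_to(n - 1, acc + n)))
    }
}
```

Calling trampoline(sum_to(100_000, 0)) goes through 100,000 bounces with a constant Rust stack depth. The cost is one boxed closure per step.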
We want to have native types that are easy to manipulate. The proof of concept/motivating example will be adding a Rope-based structure to deal with long strings.
R7 will start as an interpreter only