The new gg format is based on a BNF file which can be found in the `gg`
directory. The code for decoding `.gg` files has been refactored to
mirror that file. The result is more resilient parsing, better errors,
and a greater ability to extend the format in the future.
The new decoder is notable in that it does not use a separate lexer;
lexing and parsing are done in a single step.
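To make that concrete, here is a minimal sketch (hypothetical names and
a hypothetical name production, not the decoder's actual API): each BNF
production becomes a method that reads runes straight off a
`bufio.Reader`, so no token stream is materialized between lexing and
parsing.

```go
package gg

import (
	"bufio"
	"errors"
	"io"
	"strings"
	"unicode"
)

// decoder reads runes directly off the input; there is no
// separate token stream between lexing and parsing.
type decoder struct {
	r *bufio.Reader
}

// skipSpace advances past any whitespace between syntax
// elements, leaving the first non-space rune unread.
func (d *decoder) skipSpace() error {
	for {
		c, _, err := d.r.ReadRune()
		if err != nil {
			return err
		}
		if !unicode.IsSpace(c) {
			return d.r.UnreadRune()
		}
	}
}

// decodeName mirrors a hypothetical name production: it
// consumes letters until the first non-letter, which stays
// unread for whatever production comes next.
func (d *decoder) decodeName() (string, error) {
	var b strings.Builder
	for {
		c, _, err := d.r.ReadRune()
		if errors.Is(err, io.EOF) {
			break
		} else if err != nil {
			return "", err
		}
		if !unicode.IsLetter(c) {
			if err := d.r.UnreadRune(); err != nil {
				return "", err
			}
			break
		}
		b.WriteRune(c)
	}
	return b.String(), nil
}
```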
The format syntax itself has also been modified. Rather than using
semicolons everywhere, commas are now used as separators in tuples.
Additionally, the trailing comma/semicolon is no longer required.
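As a sketch of how a parser can support that (continuing the
hypothetical decoder above, and assuming paren-delimited tuples of bare
names purely for illustration), the trick is to check for the closing
delimiter before each element, which makes the trailing separator
optional:

```go
// parseTuple mirrors a hypothetical tuple production under the
// new syntax: comma-separated elements with the trailing comma
// allowed but not required. Assumes the opening '(' has already
// been consumed.
func (d *decoder) parseTuple() ([]string, error) {
	var elems []string
	for {
		if err := d.skipSpace(); err != nil {
			return nil, err
		}

		// A ')' here ends the tuple whether or not the previous
		// element had a trailing comma, so "(a,b)", "(a,b,)",
		// and "()" are all accepted.
		c, _, err := d.r.ReadRune()
		if err != nil {
			return nil, err
		} else if c == ')' {
			return elems, nil
		} else if err := d.r.UnreadRune(); err != nil {
			return nil, err
		}

		elem, err := d.decodeName()
		if err != nil {
			return nil, err
		} else if elem == "" {
			return nil, errors.New("expected tuple element")
		}
		elems = append(elems, elem)

		if err := d.skipSpace(); err != nil {
			return nil, err
		}

		// After an element only a ',' separator or the closing
		// ')' may follow.
		c, _, err = d.r.ReadRune()
		if err != nil {
			return nil, err
		} else if c == ')' {
			return elems, nil
		} else if c != ',' {
			return nil, errors.New("expected ',' or ')' in tuple")
		}
	}
}
```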
These changes required OpenEdges to be passed around as pointers, which
in turn required auditing how value copying was being done everywhere
and simplifying it in a few places. I think I covered all the bases.
The new internals of Graph allow the graph's actual structure to be
reflected within the graph itself. For example, if the OpenEdges of two
ValueIns are equivalent, then the same OpenEdge pointer is shared
between the two internally. This applies recursively: if two OpenEdge
tuples share an inner OpenEdge, they share the same pointer for it.
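The sharing can be pictured as hash-consing. Here is a sketch using a
stand-in OpenEdge type (the real type and API differ); because child
edges are interned first, their canonical pointers double as a
structural key for the parent:

```go
package gg

import "fmt"

// OpenEdge is a stand-in for the real type: a value plus the
// edges feeding into it (several of them form a tuple).
type OpenEdge struct {
	val string
	ins []*OpenEdge
}

// interner hands out one canonical *OpenEdge per distinct
// structure, so equivalent edges share a pointer.
type interner struct {
	seen map[string]*OpenEdge
}

// key fingerprints an edge's structure. Child edges are
// already canonical pointers, so their addresses identify
// their structure recursively.
func key(val string, ins []*OpenEdge) string {
	k := fmt.Sprintf("%q", val)
	for _, in := range ins {
		k += fmt.Sprintf(",%p", in)
	}
	return k
}

// edge returns the canonical pointer for the given structure,
// allocating only when no equivalent edge has been seen yet.
func (it *interner) edge(val string, ins []*OpenEdge) *OpenEdge {
	k := key(val, ins)
	if e, ok := it.seen[k]; ok {
		return e
	}
	e := &OpenEdge{val: val, ins: ins}
	it.seen[k] = e
	return e
}
```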
This change is a preliminary requirement for creating a graph mapping
operation. Without OpenEdge deduplication the map operation would end up
operating on the same OpenEdge multiple times, which would be incorrect.
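With deduplication in place, a sketch of such a map operation (over the
stand-in OpenEdge above) can memoize on the pointer itself, so each
shared edge is transformed exactly once and the sharing carries over
into the result:

```go
// mapEdge applies fn to every value reachable from e. The memo
// is keyed by pointer, so a deduplicated OpenEdge is visited
// exactly once and stays shared in the output.
func mapEdge(
	memo map[*OpenEdge]*OpenEdge,
	e *OpenEdge,
	fn func(string) string,
) *OpenEdge {
	if mapped, ok := memo[e]; ok {
		return mapped
	}
	ins := make([]*OpenEdge, len(e.ins))
	for i, in := range e.ins {
		ins[i] = mapEdge(memo, in, fn)
	}
	mapped := &OpenEdge{val: fn(e.val), ins: ins}
	memo[e] = mapped
	return mapped
}
```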
The base graph implementation has been moved into its own package,
`graph`, and been made fully generic, i.e. the value on each vertex/edge
is a parameterized type. This will allow us to use the graph for both
syntax parsing (gg) and runtime evaluation (vm), with each use-case
being able to use slightly different Value types.
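The rough shape is something like the following (a hypothetical
simplification; the real package's API differs):

```go
// Package graph sketches a graph whose vertex/edge values are
// a type parameter rather than a concrete type.
package graph

// Vertex carries a caller-defined value of type V.
type Vertex[V any] struct {
	Value V
}

// Edge connects two vertices and carries a value of its own.
type Edge[V any] struct {
	From, To *Vertex[V]
	Value    V
}

// Graph is generic over V: gg might instantiate it with a
// syntax-level value type and vm with a runtime one, e.g.
// Graph[SyntaxValue] vs Graph[RuntimeValue] (names made up).
type Graph[V any] struct {
	Edges []Edge[V]
}
```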
The decoder basically works, though there are some quirks in the design
I'll need to marinate on. For example, you can't have a tuple as an
edge value. This is probably fine?
Stringification of Graphs was added to aid in debugging the decoder;
the format it outputs is not the final one. Most likely the (future)
encoder will be used for that purpose.
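As an example of the debugging aid, a stringer over the hypothetical
generic graph above might look like this (with `fmt` and `strings`
imported); the actual debug output differs:

```go
// String renders the graph in a throwaway debug format; as
// noted above, this is not the final serialization.
func (g Graph[V]) String() string {
	var sb strings.Builder
	for _, e := range g.Edges {
		fmt.Fprintf(&sb, "%v -[%v]-> %v\n",
			e.From.Value, e.Value, e.To.Value)
	}
	return sb.String()
}
```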
The decoder is not implemented in the nicest way: it fully reads in the
LexerTokens first, and then processes them. This made wrapping my head
around the problem a lot easier because it left fewer failure cases,
but it's not the most efficient approach.
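In other words, the current shape is roughly this (token type and names
made up):

```go
// tokenParser buffers every lexed token up front; a single
// index is then the only mutable parse state, which keeps the
// failure cases simple at the cost of holding the whole input.
type tokenParser struct {
	tokens []string
	pos    int
}

// next consumes one token. With the input fully materialized,
// running out of tokens is the only read failure left.
func (p *tokenParser) next() (tok string, ok bool) {
	if p.pos >= len(p.tokens) {
		return "", false
	}
	tok = p.tokens[p.pos]
	p.pos++
	return tok, true
}
```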
Now that v0 is done, it's pretty plain to see that the decoder could work
by only reading in the next N tokens that it needs at a time. But that
will be left for a future version.