parent
ae2f854bca
commit
13737cf85d
@ -0,0 +1,222 @@ |
|||||||
|
--- |
||||||
|
title: >- |
||||||
|
Ginger: A Small VM Update |
||||||
|
description: >- |
||||||
|
It works gooder now. |
||||||
|
tags: tech |
||||||
|
series: ginger |
||||||
|
--- |
||||||
|
|
||||||
|
During some recent traveling I had to be pulled away from cryptic-net work for a |
||||||
|
while. Instead I managed to spend a few free hours, and the odd international |
||||||
|
plane ride, to fix the ginger vm. |
||||||
|
|
||||||
|
The problem, as it stood, was that it only functioned "correctly" in a very |
||||||
|
accidental sense. I knew from the moment that I published it that it would get |
||||||
|
mostly rewritten immediately. |
||||||
|
|
||||||
|
And so here we are, with a rewritten vm and some new realizations. |
||||||
|
|
||||||
|
## Operation |
||||||
|
|
||||||
|
The `Operation` type was previously defined like so: |
||||||
|
|
||||||
|
``` |
||||||
|
type Operation interface { |
||||||
|
Perform([]Thunk, Operation) (Thunk, error) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
I'm not going to explain it, because it's both confusing and wrong. |
||||||
|
|
||||||
|
One thing that is helpful in a refactor, especially in a strongly typed |
||||||
|
language, is to tag certain interfaces as being axiomatic, and conforming the |
||||||
|
rest of your changes around those. If those interfaces are simple enough to |
||||||
|
apply broadly _and_ accurately describe desired behavior, they will help |
||||||
|
pre-decide many difficult decisions you'd otherwise have to make. |
||||||
|
|
||||||
|
So with that mind, I tagged `Operation` as being an axiomatic interface, given |
||||||
|
that ginger is aiming to be a functional language (and I'm wondering if I should |
||||||
|
just rename `Operation` to `Function`, while I'm at it). The new definition of |
||||||
|
the interface is: |
||||||
|
|
||||||
|
``` |
||||||
|
type Operation interface { |
||||||
|
Perform(Value) Value |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
`Operation` takes and argument and returns a result, it could not possibly be |
||||||
|
boiled down any further. By holding `Operation` to this definition and making |
||||||
|
decisions from there, it was pretty clear what the next point of attack was. |
||||||
|
|
||||||
|
## If/Recur |
||||||
|
|
||||||
|
The reason that `Operation` had previously been defined in such a fucked up way |
||||||
|
was to support the `if` and `recur` `Operation`s, as if they weren't different |
||||||
|
than any other `Operation`s. But truthfully they are different, as they are |
||||||
|
actually control flow constructs, and so require capabilities that no other |
||||||
|
`Operation` would be allowed to use anyway. |
||||||
|
|
||||||
|
The new implementation reflects this. `if` and `recur` are now both handled |
||||||
|
directly by the compiler, while global `Operation`s like `tupEl` are |
||||||
|
implementations of the `Operation` interface. |
||||||
|
|
||||||
|
## Compile Step |
||||||
|
|
||||||
|
The previous iteration of the vm hadn't distinguished between a compile step and |
||||||
|
a run step. In a way it did both at the same time, by abusing the `Thunk` type. |
||||||
|
Separating the two steps, and ditching the `Thunk` type in the process, was the |
||||||
|
next major change in the refactoring. |
||||||
|
|
||||||
|
The compile step can be modeled as a function which takes a `Graph` and returns |
||||||
|
an `Operation`, where the `Graph`'s `in` and `out` names correspond to the |
||||||
|
`Operation`'s argument and return, respectively. The run step then reads an |
||||||
|
input from the user, calls the compiled `Operation` with that input, and outputs |
||||||
|
the result back to the user. |
||||||
|
|
||||||
|
As an example, given the following program: |
||||||
|
|
||||||
|
``` |
||||||
|
* six-or-more.gg |
||||||
|
|
||||||
|
max = { |
||||||
|
a = tupEl < (in, 0) |
||||||
|
b = tupEl < (in, 1) |
||||||
|
out = if < (gt < (a, b), a, b) |
||||||
|
} |
||||||
|
|
||||||
|
out = max < (in, 6) |
||||||
|
``` |
||||||
|
|
||||||
|
we want to compile an `Operation` which accepts a number and returns the greater |
||||||
|
of that number and 6. I'm going to use anonymous go functions to demonstrate the |
||||||
|
anatomy of the compiled `Operation`, as that's what's happening in the current |
||||||
|
compiler anyway. |
||||||
|
|
||||||
|
``` |
||||||
|
// After compilation, this function will be in-memory and usable as an |
||||||
|
// Operation. |
||||||
|
|
||||||
|
sixOrMore := func(in Value) Value { |
||||||
|
|
||||||
|
max := func(in Value) Value { |
||||||
|
|
||||||
|
a := tupEl(in, 0) |
||||||
|
b := tupEl(in, 1) |
||||||
|
|
||||||
|
if a > b { |
||||||
|
return a |
||||||
|
} |
||||||
|
|
||||||
|
return b |
||||||
|
} |
||||||
|
|
||||||
|
return max(in, 6) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
Or at least, this is what I tried for _initially_. What I found was that it was |
||||||
|
easier, in the context of how `graph.MapReduce` works, to make even the leaf |
||||||
|
values, e.g. `in`, `0`, `1`, and `6`, map to `Operations` as well. `in` is |
||||||
|
replaced with an anonymous function which returns its argument, and the numbers |
||||||
|
are replaced with anonymous functions which ignore their argument and always |
||||||
|
return their respective number. |
||||||
|
|
||||||
|
So the compiled `Operation` looks more like this: |
||||||
|
|
||||||
|
``` |
||||||
|
// After compilation, this function will be in-memory and usable as an |
||||||
|
// Operation. |
||||||
|
|
||||||
|
sixOrMore := func(in Value) Value { |
||||||
|
|
||||||
|
max := func(in Value) Value { |
||||||
|
|
||||||
|
a := tupEl( |
||||||
|
func(in Value) Value { return in }(in), |
||||||
|
func(_ Value) Value { return 0}(in), |
||||||
|
) |
||||||
|
|
||||||
|
b := tupEl( |
||||||
|
func(in Value) Value { return in }(in), |
||||||
|
func(_ Value) Value { return 1}(in), |
||||||
|
) |
||||||
|
|
||||||
|
if a > b { |
||||||
|
return a |
||||||
|
} |
||||||
|
|
||||||
|
return b |
||||||
|
} |
||||||
|
|
||||||
|
return max( |
||||||
|
func(in Value) Value { return in }(in), |
||||||
|
func(_ Value) Value { return 6}(in), |
||||||
|
) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
This added layer of indirection for all leaf values is not great for |
||||||
|
performance, and there's probably further refactoring which could be done to |
||||||
|
make the result look more like the original ideal. |
||||||
|
|
||||||
|
To make things a bit messier, even that representation isn't quite accurate to |
||||||
|
the current result. The compiler doesn't properly de-duplicate work when |
||||||
|
following name values. In other words, everytime `a` is referenced within `max`, |
||||||
|
the `Operation` which the compiler produces will recompute `a` via `tupEl`. |
||||||
|
|
||||||
|
So the _actual_ compiled `Operation` looks more like this: |
||||||
|
|
||||||
|
``` |
||||||
|
// After compilation, this function will be in-memory and usable as an |
||||||
|
// Operation. |
||||||
|
|
||||||
|
sixOrMore := func(in Value) Value { |
||||||
|
|
||||||
|
return func(in Value) Value { |
||||||
|
|
||||||
|
if tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 0}(in)) > |
||||||
|
tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 1}(in)) { |
||||||
|
|
||||||
|
return tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 0}(in)) |
||||||
|
} |
||||||
|
|
||||||
|
return tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 1}(in)) |
||||||
|
}( |
||||||
|
func(in Value) Value { return in }(in), |
||||||
|
func(_ Value) Value { return 6}(in), |
||||||
|
) |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
Clearly, there's some optimization to be done still. |
||||||
|
|
||||||
|
## Results |
||||||
|
|
||||||
|
While it's still not perfect, the new implementation is far and away better than |
||||||
|
the old. This can be seen just in the performance for the fibonacci program: |
||||||
|
|
||||||
|
``` |
||||||
|
# Previous VM |
||||||
|
|
||||||
|
$ time ./eval "$(cat examples/fib.gg)" 10 |
||||||
|
55 |
||||||
|
|
||||||
|
real 0m8.737s |
||||||
|
user 0m9.871s |
||||||
|
sys 0m0.309s |
||||||
|
``` |
||||||
|
|
||||||
|
``` |
||||||
|
# New VM |
||||||
|
|
||||||
|
$ time ./eval "$(cat examples/fib.gg)" 50 |
||||||
|
12586269025 |
||||||
|
|
||||||
|
real 0m0.003s |
||||||
|
user 0m0.003s |
||||||
|
sys 0m0.000s |
||||||
|
``` |
||||||
|
|
||||||
|
They're not even comparable. |
Loading…
Reference in new issue