Remove a bunch of old code, update the README

Brian Picciano 2021-08-26 21:25:39 -06:00
parent fed2c35868
commit 3f28c60ab8
20 changed files with 21 additions and 3360 deletions

BUILD

@@ -1,5 +0,0 @@
RELEASE=RELEASE_381 # this may have to be changed based on llvm version
svn co https://llvm.org/svn/llvm-project/llvm/tags/$RELEASE/final $GOPATH/src/llvm.org/llvm
cd $GOPATH/src/llvm.org/llvm/bindings/go
./build.sh
go install llvm.org/llvm/bindings/go/llvm

NOTES

@@ -1,431 +0,0 @@
Been thinking about the stack and heap a lot. It would be possible, though
possibly painful, to enforce a language with no global heap. The question really
is: what are the principles which give reason to do so? What are the principles
of this language, period? The principles are different than the use-cases. They
don't need to be logically rigorous (at first anyway).
##########
I need to prioritize the future of this project a bit more. I've been thinking
I'm going to figure this thing out at this level, but I shouldn't even be
working here without a higher level view.
I can't finish this project without financial help. I don't think I can get a v0
up without financial help. What this means at minimum, no matter what, I'm going
to have to:
- Develop a full concept of the language that can get it to where I want to go
- Figure out where I want it to go
- Write the concept into a manifesto of the language
- Write the concept into a proposal for course of action to take in developing
the language further
I'm unsure about what this language actually is, or is actually going to look
like, but I'm sure of those things. So those are the lowest hanging fruit, and I
should start working on them pronto. It's likely I'll need to experiment with
some ideas which will require coding, and maybe even some big ideas, but those
should all be done under the auspices of developing the concepts of the
language, and not the compiler of the language itself.
#########
Elemental types:
* Tuples
* Arrays
* Integers
#########
Been doing thinking and research on ginger's elemental types and what their
properties should be. Ran into roadblock where I was asking myself these
questions:
* Can I do this without atoms?
* What are different ways atoms can be encoded?
* Can I define language types (elementals) without defining an encoding for
them?
I also came up with two new possible types:
* Stream, effectively an interface which produces discrete packets (each has a
length), where the production of one packet indicates the size of the next one
at the same time.
* Tagged, sort of like a stream, effectively a type which says "We don't know
what this will be at compile-time, but we know it will be prefixed with some
kind of tag indicating its type and size."
* Maybe only the size is important
* Maybe precludes user defined types that aren't composites of the
elementals? Maybe that's ok?
Ran into this:
https://www.ps.uni-saarland.de/~duchier/python/continuations.html
https://en.wikipedia.org/wiki/Continuation#First-class_continuations
which is interesting. A lot of my problems now are derived from stack-based
systems and their need for knowing the size input and output data, continuations
seem to be an alternative system?
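As a concrete illustration of the idea (my own sketch, not from the links above): in continuation-passing style a function never returns a value to a stack frame; it hands its result to an explicit continuation, so only the receiving closure needs to care about the result.

```go
package main

import "fmt"

// addCPS never returns a value; it passes the result to the continuation k.
func addCPS(a, b int, k func(int)) {
	k(a + b)
}

func main() {
	// Compute (1 + 2) + 3 by chaining continuations instead of returns.
	addCPS(1, 2, func(sum int) {
		addCPS(sum, 3, func(total int) {
			fmt.Println(total) // 6
		})
	})
}
```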
I found this:
http://lambda-the-ultimate.org/node/4512
I don't understand any of it, I should definitely learn feather
I should finish reading this:
http://www.blackhat.com/presentations/bh-usa-07/Ferguson/Whitepaper/bh-usa-07-ferguson-WP.pdf
#########
Ok, so I'm back at this for the first time in a while, and I've got a good thing
going. The vm package is working out well. Using tuples and atoms as the basis
of a language is pretty effective (thanks Erlang!). I've got basic variable
assignment working as well. No functions yet. Here's the things I still need to
figure out or implement:
* lang
* constant size arrays
* using them for a "do" macro
* figure out constant, string, int, etc... look at what Erlang's actual
primitive types are for a hint
* figure out all needed macros for creating and working with lang types
* vm
* figure out the differentiation between compiler macros and runtime calls
* probably separate the two into two separate call systems
* the current use of varCtx is still pretty ugly, the do macro might help
clean it up
* functions
* are they a primitive? I guess so....
* declaration and type
* variable deconstruction
* scoping/closures
* compiler macros, need vm's Run to output a lang.Term
* need to learn about linking
* figure out how to include llvm library in compiled binary and make it
callable. runtime macros will come from this
* linking in of other ginger code? or how to import in general
* compiler, a general purpose binary for taking ginger code and turning it
into machine code using the vm package
* swappable syntax, including syntax-dependent macros
* close the loop?
############
I really want contexts to work. They _feel_ right, as far as abstractions go.
And they're clean, if I can work out the details.
Just had a stupid idea, might as well write it down though.
Similar to how the DNA and RNA in our cells work, each Context is created with
some starting set of data on it. This will be the initial protein block. Based
on the data there some set of Statements (the RNA) will "latch" on and do
whatever work they're programmed to do. That work could include making new
Contexts and "releasing" them into the ether, where they would get latched onto
(or not).
There's so many problems with this idea, it's not even a little viable. But here
goes:
* Order of execution becomes super duper fuzzy. It would be really difficult to
think about how your program is actually going to work.
* Having Statement sets just latch onto Contexts is super janky. They would get
registered I guess, and it would be pretty straightforward to differentiate
one Context from another, but what about conflicts? If two Statements want to
latch onto the same Context then what? If we wanted to keep the metaphor one
would just get randomly chosen over the other, but obviously that's insane.
############
I explained some of this to ibrahim already, but I might as well get it all
down, cause I've expanded on it a bit since.
Basically, ops (functions) are fucking everything up. The biggest reason for
this is that they are really really hard to implement without a type annotation
system. The previous big braindump is about that, but basically I can't figure
out a way that feels clean and good enough to be called a "solution" to type
inference. I really don't want to have to add type annotations just to support
functions, at least not until I explore all of my options.
The only other option I've come up with so far is the context thing. It's nice
because it covers a lot of ground without adding a lot of complexity. Really the
biggest problem with it is it doesn't allow for creating new things which look
like operations. Instead, everything is done with the %do operator, which feels
janky.
One solution I just thought of is to get rid of the %do operator and simply make
it so that a list of Statements can be used as the operator in another
Statement. This would _probably_ allow for everything that I want to do. One
outstanding problem I'm facing is figuring out if all Statements should take a
Context or not.
* If they did it would be a lot more explicit what's going on. There wouldn't be
an ethereal "this context" that would need to be managed and thought about. It
would also make things like using a set of Statements as an operator a lot
more straightforward, since without Contexts in the Statement it'll be weird
to "do" a set of Statements in another Context.
* On the other hand, it's quite a bit more boilerplate. For the most part most
Statements are going to want to be run in "this" context. Also this wouldn't
really decrease the number of necessary macros, since one would still be
needed in order to retrieve the "root" Context.
* One option would be for a Statement's Context to be optional. I don't really
like this option, it makes a very fundamental datatype (a Statement) a bit
fuzzier.
* Another thing to think about is that I might just rethink how %bind works so
that it doesn't operate on an ethereal "this" Context. %ctxbind is one attempt
at this, but there's probably other ways.
* One issue I just thought of with having a set of Statements be used as an
operator is that the argument to that Statement becomes.... weird. What even
is it? Something the set of Statements can access somehow? Then we still need
something like the %in operator.
Let me backtrack a bit. What's the actual problem? The actual thing I'm
struggling with is allowing for code re-use, specifically pure functions. I
don't think there's any way anyone could argue that pure functions are not an
effective building block in all of programming, so I think I can make that my
statement of faith: pure functions are good and worthwhile, impure functions
are.... fine.
Implementing them, however, is quite difficult. More so than I thought it would
be. The big inhibitor is the method by which I actually pass input data into the
function's body. From an implementation standpoint it's difficult because I
*need* to know how many bytes on the stack the arguments take up. From a syntax
standpoint this is difficult without a type annotation system. And from a
usability standpoint this is difficult because it's a task the programmer has to
do which doesn't really have to do with the actual purpose or content of the
function, it's just a book-keeping exercise.
So the stack is what's screwing us over here. It's a nice idea, but ultimately
makes what we're trying to do difficult. I'm not sure if there's ever going to
be a method of implementing pure functions that doesn't involve argument/return
value copying though, and therefore which doesn't involve knowing the byte size
of your arguments ahead of time.
It's probably not worth backtracking this much either. For starters, cpus are
heavily optimized for stack based operations, and much of the way we currently
think about programming is also based on the stack. It would take a lot of
backtracking if we ever moved to something else, if there even is anything else
worth moving to.
If that's the case, how is the stack actually used then?
* There's a stack pointer which points at an address on the stack, the stack
being a contiguous range of memory addresses. The place the stack points to is
the "top" of the stack, all higher addresses are considered unused (no matter
what's in them). All the values in the stack are available to the currently
executing code, it simply needs to know either their absolute address or their
relative position to the stack pointer.
* When a function is "called" the arguments to it are copied onto the top of the
stack, the stack pointer is increased to reflect the new stack height, and the
function's body is jumped to. Inside the body the function need only pop
values off the stack as it expects them, as long as it was called properly it
doesn't matter how or when the function was called. Once it's done operating
the function ensures all the input values have been popped off the stack, and
subsequently pushes the return values onto the stack, and jumps back to the
caller (the return address was also stored on the stack).
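The calling convention sketched in those two bullets can be modeled directly (a toy illustration of my own, not project code): arguments and the return value are copied through a shared stack, so the callee only needs to know how many values to pop.

```go
package main

import "fmt"

// stack is a toy operand stack; the top is the end of the slice.
type stack struct{ mem []int64 }

func (s *stack) push(v int64) { s.mem = append(s.mem, v) }

func (s *stack) pop() int64 {
	v := s.mem[len(s.mem)-1]
	s.mem = s.mem[:len(s.mem)-1]
	return v
}

// add plays the role of a function body: it pops its two arguments off the
// stack and pushes its return value back on. Note that it must know its
// argument count (and, in a real machine, their byte sizes) ahead of time.
func add(s *stack) {
	b, a := s.pop(), s.pop()
	s.push(a + b)
}

func main() {
	s := &stack{}
	// "Call" add(2, 3): copy the arguments onto the top of the stack,
	// jump to the body, then pop the return value off afterwards.
	s.push(2)
	s.push(3)
	add(s)
	fmt.Println(s.pop()) // 5
}
```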
That's not quite right, but it's close enough for most cases. The more I'm
reading about this the more I think it's not going to be worth it to backtrack
past the stack. There's a lot of compiler and machine specific crap that gets
involved at that low of a level, and I don't think it's worth getting into it.
LLVM did all of that for me, I should learn how to make use of that to make what
I want happen.
But what do I actually want? That's the hard part. I guess I've come full
circle. I pretty much *need* to use llvm functions. But I can't do it without
declaring the types ahead of time. Ugghh.
################################
So here's the current problem:
I have the concept of a list of statements representing a code block. It's
possible/probable that more than this will be needed to represent a code block,
but we'll see.
There's two different ways I think it's logical to use a block:
* As a way of running statements within a new context which inherits all of its
bindings from the parent. This would be used for things like if statements and
loops, and behaves the way a code block behaves in most other languages.
* To define an operator body. An operator's body is effectively the same as the
first use-case, except that it has input/output as well. An operator can be
bound to an identifier and used in any statement.
So the hard part, really, is that second point. I have the first done already.
The second one isn't too hard to "fake" using our current context system, but it
can't be made to be used as an operator in a statement. Here's how to fake it
though:
* Define the list of statements
* Make a new context
* Bind the "input" bindings into the new context
* Run %do with that new context and list of statements
* Pull the "output" bindings out of that new context
And that's it. It's a bit complicated but it ultimately works and effectively
inlines a function call.
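The five steps above can be modeled with plain maps standing in for Contexts. This is a sketch of the idea only; `Ctx` and `do` here are my stand-ins, not the real expr package API.

```go
package main

import "fmt"

// Ctx stands in for a binding context: identifier -> value.
type Ctx map[string]int64

// do runs a list of statements against a context; each statement may read
// and write bindings. This mimics the role of the %do macro in the notes.
func do(ctx Ctx, stmts []func(Ctx)) {
	for _, stmt := range stmts {
		stmt(ctx)
	}
}

func main() {
	// 1. Define the list of statements (the "operator body"): out = in + 1.
	incr := []func(Ctx){
		func(c Ctx) { c["out"] = c["in"] + 1 },
	}

	// 2. Make a new context. 3. Bind the "input" bindings into it.
	callCtx := Ctx{"in": 41}

	// 4. Run %do with that new context and list of statements.
	do(callCtx, incr)

	// 5. Pull the "output" bindings out, effectively inlining the call.
	fmt.Println(callCtx["out"]) // 42
}
```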
It's important that this looks like a normal operator call though, because I
believe in Guy Steele. Here's the current problems I'm having:
* Defining the input/output values is the big one. In the inline method those
were defined implicitly based on what the statements actually use, and the
compiler would fail if any were missing or the wrong type. But here we ideally
want to define an actual llvm function and not inline every time. So we need to
somehow "know" what the input/output is, and their types.
* The output value isn't actually *that* difficult. We just look at the
output type of the last statement in the list and use that.
* The input is where it gets tricky. One idea would be to use a statement
with no input as the first statement in the list, and that would define
the input type. The way macros work this could potentially "just work",
but it's tricky.
* It would also be kind of difficult to make work with operators that take
in multiple parameters too. For example, `bind A, 1` would be the normal
syntax for binding, but if we want to bind an input value it gets weirder.
* We could use a "future" kind of syntax, like `bind A, _` or something
like that, but that would require a new expression type and also just
be kind of weird.
* We could have a single macro which always returns the input, like
`%in` or something. So the bind would become `bind A, %in` or
`bind (A, B), %in` if we ever get destructuring. This isn't a terrible
solution, though a bit unfortunate in that it could get confusing with
different operators all using the same input variable effectively. It
also might be a bit difficult to implement, since it kind of forces us
to only have a single argument to the LLVM function? Hard to say how
that would work. Possibly all llvm functions could be made to take in
a struct, but that would be ghetto af. Not doing a struct would take a
special interaction though.... It might not be possible to do this
without a struct =/
* Somehow allowing to define the context which gets used on each call to the
operator, instead of always using a blank one, would be nice.
* The big part of this problem is actually the syntax for calling the
operator. It's pretty easy to have this handled within the operator by the
%thisctx macro. But we want the operator to be callable by the same syntax
as all other operator calls, and currently that doesn't have any way of
passing in a new context.
* Additionally, if we're implementing the operator as an LLVM function then
there's not really any way to pass in that context to it without making
those variables global or something, which is shitty.
* So writing all this out it really feels like I'm dealing with two separate
types that just happen to look similar:
* Block: a list of statements which run with a variable context.
* Operator: a list of statements which run with a fixed (empty?) context,
and have input/output.
* There's so very nearly a symmetry there. Things that are inconsistent:
* A block doesn't have input/output
* It sort of does, in the form of the context it's being run with and
%ctxget, but not an explicit input/output like the operator has.
* If this could be reconciled I think this whole shitshow could be made
to have some consistency.
* Using %in this pretty much "just works". But it's still weird. Really
we'd want to turn the block into a one-off operator every time we use
it. This is possible.
* An operator's context must be empty
* It doesn't *have* to be, defining the ctx which goes with the operator
could be part of however an operator is created.
* So after all of that, I think operators and blocks are kind of the same.
* They both use %in to take in input, and both output using the last statement
in their list of statements.
* They both have a context bound to them, operators are fixed but a block
changes.
* An operator is a block with a bound context.
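That last equivalence can be phrased as a closure (again just an illustrative sketch, with invented names): a block is code run against whatever context it's handed, and an operator is the same code with a fixed context baked in, exposing only input and output.

```go
package main

import "fmt"

// Ctx stands in for a binding context: identifier -> value.
type Ctx map[string]int64

// A Block runs against whichever context it is given.
type Block func(Ctx)

// AsOperator fixes a context to a block, leaving only input/output visible:
// "an operator is a block with a bound context".
func AsOperator(b Block, bound Ctx) func(in int64) int64 {
	return func(in int64) int64 {
		bound["in"] = in
		b(bound)
		return bound["out"]
	}
}

func main() {
	double := Block(func(c Ctx) { c["out"] = c["in"] * 2 })
	op := AsOperator(double, Ctx{})
	fmt.Println(op(21)) // 42
}
```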
##############@@@@@@@@@#$%^&^%$#@#$%^&*
* New problem: type inference. LLVM requires that a function's definition have
the type specified up-front. This kind of blows. Well actually, it blows a lot
more than kind of. There's two things that need to be inferred from a List of
Statements then: the input type and the output type. There's two approaches
I've thought of in the current setup.
* There's two approaches to determining the type of an operator: analyze the
code as ginger expressions, or build the actual llvm structures and
analyze those.
* Looking at the ginger expressions is definitely somewhat fuzzy. We can
look at all the statements and sub-statements until we find an
instance of %in, then look at what that's an input to. But if it's
simply binding into an Identifier then we have to find the identifier.
If it's destructuring then that gets even *more* complicated.
* Destructuring is what really makes this approach difficult.
Presumably there's going to be a function that takes in an
Identifier (or %in I guess?) and a set of Statements and returns
the type for that Identifier. If we find that %in is destructured
into a tuple then we would run that function for each constituent
Identifier and put it all together. But then this inference
function is really coupled to %bind, which kind of blows. Also we
may one day want to support destructuring into non-tuples as well,
which would make this even harder.
* We could make it the job of the macro definition to know its input
and output types, as well as the types of any bindings it makes.
That places some burden on user macros in the future, but then
maybe it can be inferred for user macros? That's a lot of hope. It
would also mean the macro would need the full set of statements
that will ever run in the same Context as it, so it can determine
the types of any bindings it makes.
* The second method is to build the statements into LLVM structures and
then look at those structures. This has the benefit of being
non-ambiguous once we actually find the answer. LLVM is super strongly
typed, and re-iterates the types involved for every operation. So if
the llvm builder builds it then we need only look for the first usage
of every argument/return and we'll know the types involved.
* This requires us to use structs for tuples, and not actually use
multiple arguments. Otherwise it won't be possible to know the
difference between a 3 argument function and a 4 argument one
which doesn't use its 4th argument (which shouldn't really happen,
but could).
* The main hindrance is that the llvm builder is really not
designed for this sort of thing. We could conceivably create a
"dummy" function with bogus types and write the body, analyze the
body, erase the function, and start over with a non-dummy
function. But it's the "analyze the body" step that's difficult.
It's difficult to find the types of things without the llvm.Value
objects in hand, but since building is set up as a recursive
process that becomes non-trivial. This really feels like the way
to go though, I think it's actually doable.
* This could be something we tack onto llvmVal, and then make
Build return extra data about what types the Statements it
handled input and output.
* For other setups that would enable this a bit better, the one that keeps
coming to mind is a more pipeline style system. Things like %bind would need
to be refactored from something that takes a Tuple to something that only
takes an Identifier and returns a macro which will bind to that Identifier.
This doesn't *really* solve the type problem I guess, since whatever is input
into the Identifier's bind doesn't necessarily have a type attached to it.
Sooo yeah nvm.

README.md

@@ -1,118 +1,28 @@
-# Ginger - holy fuck again?
+# Ginger
-## The final result. A language which can do X
+Fibonacci function in ginger:
-- Support my OS
-    - Compile on many architectures
-    - Be low level and fast (effectively c-level)
-    - Be well defined, using a simple syntax
-    - Extensible based on which section of the OS I'm working on
-    - Good error messages
+```
+fib {
+    decr { out add(in, -1) }
-- Support other programmers and other programming areas
-    - Effectively means able to be used in most purposes
-    - Able to be quickly learned
-    - Able to be shared
-        - Free
-        - New or improved components shared between computers/platforms/people
+    out {
+        n 0(in),
+        a 1(in),
+        b 2(in),
-- Support itself
-    - Garner a team to work on the compiler
-    - Team must not require my help for day-to-day
-    - Team must stick to the manifesto, either through the design or through
-      trust
+        out if(
+            zero?(n),
+            a,
+            recur(decr(n), b, add(a,b))
+        )
-## The language: A manifesto, defines the concept of the language
+    }(in, 0, 1)
+}
+```
-- Quips
-    - Easier is not better
+Usage of the function to generate the 6th Fibonacci number:
-- Data as the language
-- Differentiation between "syntax" and "language", parser vs compiler
-    - Syntax defines the form which is parsed
-    - The parser reads the syntax forms into data structures
-    - Language defines how the syntax is read into data structures and
-      "understood" (i.e. what is done with those structures).
-        - A language may have multiple syntaxes, if they all parse into
-          the same underlying data structures they can be understood in the
-          same way.
-        - A compiler turns the parsed language into machine code. An
-          interpreter performs actions directly based off of the parsed
-          language.
-- Types, instances, and operations
-    - A language has a set of elemental types, and composite types
-    - "The type defines the [fundamental] operations that can be done on the
-      data, the meaning of the data, and the way values of that type can be
-      stored"
-    - Elemental types are all forms of numbers, since numbers are all a
-      computer really knows
-    - Composite types take two forms:
-        - Homogeneous: all composed values are the same type (arrays)
-        - Heterogeneous: all composed values are different
-            - If known size and known types per-index, tuples
-                - A 0-tuple is kind of special, and simply indicates absence of
-                  any value.
-    - A third type, Any, indicates that the type is unknown at compile-time.
-      Type information must be passed around with it at runtime.
-    - An operation has an input and output. It does some action on the input
-      to produce the output (presumably). An operation may be performed as
-      many times as needed, given any value of the input type. The types of
-      both the input and output are constant, and together they form the
-      operation's type.
-    - A value is an instance of a type, where the type is known at compile-time
-      (though the type may be Any). Multiple values may be instances of the same
-      type. E.g.: 1 and 2 are both instances of int
-        - A value is immutable
-        - TODO value is a weird word, since an instance of a type has both a
-          type and value. I need to think about this more. Instance might be a
-          better name
-- Stack and scope
-    - A function call operates within a scope. The scope had arguments passed
-      into it.
-    - When a function calls another, that other's scope is said to be "inside"
-      the caller's scope.
-    - A pure function only operates on the arguments passed into it.
-    - A pointer allows for modification outside of the current scope, but only a
-      pointer into an outer scope. A function which does this is "impure"
-- Built-in
-    - Elementals
-        - ints (n-bit)
-        - tuples
-        - stack arrays
-            - indexable
-            - head/tail
-            - reversible (?)
-            - appendable
-        - functions (?)
-        - pointers (?)
-        - Any (?)
-    - Elementals must be enough to define the type of a variable
-    - Ability to create and modify elemental types
-        - immutable, pure functions
-    - Other builtin functionality:
-        - Load/call linked libraries
-        - Compiletime macros
-            - Red/Blue
-- Questions
-    - Strings need to be defined in terms of the built-in types, which would be
-      an array of lists. But this means I'm married to that definition of a
-      string, it'd be difficult for anyone to define their own and have it
-      interop. Unless "int" was some kind of macro type that did some fancy
-      shit, but that's kind of gross.
-    - An idea of the "equality" of two variables being tied not just to their
-      value but to the context in which they were created. Would aid in things
-      like compiler tagging.
-    - There's a "requirement loop" of things which need figuring out:
-        - function structure
-        - types
-        - seq type
-        - stack/scope
-    - Most likely I'm going to need some kind of elemental type to indicate
-      something should happen at compile-time and not runtime, or the other way
-      around.
-## The roadmap: A plan of action for tackling the language
+```
+fib(5)
+```
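For readers unfamiliar with the notation, here is the same computation in Go (my translation, not part of the repo): n counts down while the (a, b) pair carries the running Fibonacci values.

```go
package main

import "fmt"

// fib mirrors the ginger example: each step decrements n and shifts the
// (a, b) pair forward, returning a once n reaches zero.
func fib(n, a, b int64) int64 {
	if n == 0 {
		return a
	}
	return fib(n-1, b, a+b)
}

func main() {
	// Equivalent of fib(5) above, with the (0, 1) seed made explicit.
	fmt.Println(fib(5, 0, 1)) // 5
}
```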


@@ -1,219 +0,0 @@
package expr

import (
	"fmt"
	"log"

	"llvm.org/llvm/bindings/go/llvm"
)

func init() {
	log.Printf("initializing llvm")
	llvm.LinkInMCJIT()
	llvm.InitializeNativeTarget()
	llvm.InitializeNativeAsmPrinter()
}

type BuildCtx struct {
	B llvm.Builder
	M llvm.Module
}

func NewBuildCtx(moduleName string) BuildCtx {
	return BuildCtx{
		B: llvm.NewBuilder(),
		M: llvm.NewModule(moduleName),
	}
}

func (bctx BuildCtx) Build(ctx Ctx, stmts ...Statement) llvm.Value {
	var lastVal llvm.Value
	for _, stmt := range stmts {
		if e := bctx.BuildStmt(ctx, stmt); e != nil {
			if lv, ok := e.(llvmVal); ok {
				lastVal = llvm.Value(lv)
			} else {
				log.Printf("BuildStmt returned non llvmVal from %v: %v (%T)", stmt, e, e)
			}
		}
	}
	if (lastVal == llvm.Value{}) {
		lastVal = bctx.B.CreateRetVoid()
	}
	return lastVal
}

func (bctx BuildCtx) BuildStmt(ctx Ctx, s Statement) Expr {
	log.Printf("building: %v", s)
	switch o := s.Op.(type) {
	case Macro:
		return ctx.Macro(o)(bctx, ctx, s.Arg)
	case Identifier:
		s2 := s
		s2.Op = ctx.Identifier(o).(llvmVal)
		return bctx.BuildStmt(ctx, s2)
	case Statement:
		s2 := s
		s2.Op = bctx.BuildStmt(ctx, o)
		return bctx.BuildStmt(ctx, s2)
	case llvmVal:
		arg := bctx.buildExpr(ctx, s.Arg).(llvmVal)
		out := bctx.B.CreateCall(llvm.Value(o), []llvm.Value{llvm.Value(arg)}, "")
		return llvmVal(out)
	default:
		panic(fmt.Sprintf("non op type %v (%T)", s.Op, s.Op))
	}
}

// may return nil if e is a Statement which has no return
func (bctx BuildCtx) buildExpr(ctx Ctx, e Expr) Expr {
	return bctx.buildExprTill(ctx, e, func(Expr) bool { return false })
}

// like buildExpr, but will stop short and stop recursing when the function
// returns true
func (bctx BuildCtx) buildExprTill(ctx Ctx, e Expr, fn func(e Expr) bool) Expr {
	if fn(e) {
		return e
	}
	switch ea := e.(type) {
	case llvmVal:
		return e
	case Int:
		return llvmVal(llvm.ConstInt(llvm.Int64Type(), uint64(ea), false))
	case Identifier:
		return ctx.Identifier(ea)
	case Statement:
		return bctx.BuildStmt(ctx, ea)
	case Tuple:
		// if the tuple is empty then it is a void
		if len(ea) == 0 {
			return llvmVal(llvm.Undef(llvm.VoidType()))
		}
		ea2 := make(Tuple, len(ea))
		for i := range ea {
			ea2[i] = bctx.buildExprTill(ctx, ea[i], fn)
		}
		// if the fields of the tuple are all llvmVal then we can make a proper
		// struct
		vals := make([]llvm.Value, len(ea2))
		typs := make([]llvm.Type, len(ea2))
		for i := range ea2 {
			if v, ok := ea2[i].(llvmVal); ok {
				val := llvm.Value(v)
				vals[i] = val
				typs[i] = val.Type()
			} else {
				return ea2
			}
		}
		str := llvm.Undef(llvm.StructType(typs, false))
		for i := range vals {
			str = bctx.B.CreateInsertValue(str, vals[i], i, "")
		}
		return llvmVal(str)
	case List:
		ea2 := make(Tuple, len(ea))
		for i := range ea {
			ea2[i] = bctx.buildExprTill(ctx, ea[i], fn)
		}
		return ea2
	case Ctx:
		return ea
	default:
		panicf("%v (type %T) can't express a value", ea, ea)
	}
	panic("go is dumb")
}

func (bctx BuildCtx) buildVal(ctx Ctx, e Expr) llvm.Value {
	return llvm.Value(bctx.buildExpr(ctx, e).(llvmVal))
}

// globalCtx describes what's available to *all* contexts, and is what all
// contexts should have as the root parent in the tree.
//
// We define in this weird way cause NewCtx actually references globalCtx
var globalCtx *Ctx
var _ = func() bool {
	globalCtx = &Ctx{
		macros: map[Macro]MacroFn{
			"add": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				tup := bctx.buildExpr(ctx, e).(llvmVal)
				a := bctx.B.CreateExtractValue(llvm.Value(tup), 0, "")
				b := bctx.B.CreateExtractValue(llvm.Value(tup), 1, "")
				return llvmVal(bctx.B.CreateAdd(a, b, ""))
			},

			// TODO this could be a user macro!!!! WUT this language is baller
			"bind": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				tup := bctx.buildExprTill(ctx, e, isIdentifier).(Tuple)
				id := bctx.buildExprTill(ctx, tup[0], isIdentifier).(Identifier)
				val := bctx.buildExpr(ctx, tup[1])
				ctx.Bind(id, val)
				return NewTuple()
			},

			"ctxnew": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				return NewCtx()
			},

			"ctxthis": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				return ctx
			},

			"ctxbind": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				tup := bctx.buildExprTill(ctx, e, isIdentifier).(Tuple)
				thisCtx := bctx.buildExpr(ctx, tup[0]).(Ctx)
				id := bctx.buildExprTill(ctx, tup[1], isIdentifier).(Identifier)
				thisCtx.Bind(id, bctx.buildExpr(ctx, tup[2]))
				return NewTuple()
			},

			"ctxget": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				tup := bctx.buildExprTill(ctx, e, isIdentifier).(Tuple)
				thisCtx := bctx.buildExpr(ctx, tup[0]).(Ctx)
				id := bctx.buildExprTill(ctx, tup[1], isIdentifier).(Identifier)
				return thisCtx.Identifier(id)
			},

			"do": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				tup := bctx.buildExprTill(ctx, e, isStmt).(Tuple)
				thisCtx := tup[0].(Ctx)
				for _, stmtE := range tup[1].(List) {
					bctx.BuildStmt(thisCtx, stmtE.(Statement))
				}
				return NewTuple()
			},

			"op": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				l := bctx.buildExprTill(ctx, e, isList).(List)
				stmts := make([]Statement, len(l))
				for i := range l {
					stmts[i] = l[i].(Statement)
				}

				// TODO obviously this needs to be fixed
				fn := llvm.AddFunction(bctx.M, "", llvm.FunctionType(llvm.Int64Type(), []llvm.Type{llvm.Int64Type()}, false))
				fnbl := llvm.AddBasicBlock(fn, "")

				prevbl := bctx.B.GetInsertBlock()
				bctx.B.SetInsertPoint(fnbl, fnbl.FirstInstruction())
				out := bctx.Build(NewCtx(), stmts...)
				bctx.B.CreateRet(out)
				bctx.B.SetInsertPointAtEnd(prevbl)
				return llvmVal(fn)
			},

			"in": func(bctx BuildCtx, ctx Ctx, e Expr) Expr {
				fn := bctx.B.GetInsertBlock().Parent()
				return llvmVal(fn.Param(0))
			},
		},
	}
	return false
}()


@@ -1,99 +0,0 @@
package expr

import (
	"fmt"
	. "testing"

	"llvm.org/llvm/bindings/go/llvm"
)

func buildTest(t *T, expected int64, stmts ...Statement) {
	fmt.Println("-----------------------------------------")
	ctx := NewCtx()
	bctx := NewBuildCtx("")
	fn := llvm.AddFunction(bctx.M, "", llvm.FunctionType(llvm.Int64Type(), []llvm.Type{}, false))
	fnbl := llvm.AddBasicBlock(fn, "")
	bctx.B.SetInsertPoint(fnbl, fnbl.FirstInstruction())
	out := bctx.Build(ctx, stmts...)
	bctx.B.CreateRet(out)

	fmt.Println("######## dumping IR")
	bctx.M.Dump()
	fmt.Println("######## done dumping IR")

	if err := llvm.VerifyModule(bctx.M, llvm.ReturnStatusAction); err != nil {
		t.Fatal(err)
	}

	eng, err := llvm.NewExecutionEngine(bctx.M)
	if err != nil {
		t.Fatal(err)
	}

	res := eng.RunFunction(fn, []llvm.GenericValue{}).Int(false)
	if int64(res) != expected {
		t.Errorf("expected:[%T]%v actual:[%T]%v", expected, expected, res, res)
	}
}

func TestAdd(t *T) {
	buildTest(t, 2,
		NewStatement(Macro("add"), Int(1), Int(1)))

	buildTest(t, 4,
		NewStatement(Macro("add"), Int(1),
			NewStatement(Macro("add"), Int(1), Int(2))))

	buildTest(t, 6,
		NewStatement(Macro("add"),
			NewStatement(Macro("add"), Int(1), Int(2)),
			NewStatement(Macro("add"), Int(1), Int(2))))
}

func TestBind(t *T) {
	buildTest(t, 2,
		NewStatement(Macro("bind"), Identifier("A"), Int(1)),
		NewStatement(Macro("add"), Identifier("A"), Int(1)))

	buildTest(t, 2,
		NewStatement(Macro("bind"), Identifier("A"), Int(1)),
		NewStatement(Macro("add"), Identifier("A"), Identifier("A")))

	buildTest(t, 2,
		NewStatement(Macro("bind"), Identifier("A"), NewTuple(Int(1), Int(1))),
		NewStatement(Macro("add"), Identifier("A")))

	buildTest(t, 3,
		NewStatement(Macro("bind"), Identifier("A"), NewTuple(Int(1), Int(1))),
		NewStatement(Macro("add"), Int(1),
			NewStatement(Macro("add"), Identifier("A"))))

	buildTest(t, 4,
		NewStatement(Macro("bind"), Identifier("A"), NewTuple(Int(1), Int(1))),
		NewStatement(Macro("add"),
			NewStatement(Macro("add"), Identifier("A")),
			NewStatement(Macro("add"), Identifier("A"))))
}

func TestOp(t *T) {
	incr := NewStatement(Macro("op"),
		NewList(
			NewStatement(Macro("add"), Int(1), NewStatement(Macro("in"))),
		),
	)

	// bound op
	buildTest(t, 2,
		NewStatement(Macro("bind"), Identifier("incr"), incr),
		NewStatement(Identifier("incr"), Int(1)))

	// double bound op
	buildTest(t, 3,
		NewStatement(Macro("bind"), Identifier("incr"), incr),
		NewStatement(Identifier("incr"),
			NewStatement(Identifier("incr"), Int(1))))

	// anon op
	buildTest(t, 2,
		NewStatement(incr, Int(1)))

	// double anon op
	buildTest(t, 3,
		NewStatement(incr,
			NewStatement(incr, Int(1))))
}


@ -1,72 +0,0 @@
package expr
// MacroFn is a compiler function which takes in an existing Expr and returns
// the llvm Value for it
type MacroFn func(BuildCtx, Ctx, Expr) Expr
// Ctx contains all the Macros and Identifiers available. A Ctx also keeps a
// reference to the global context, which has a number of macros available for
// all contexts to use.
type Ctx struct {
global *Ctx
macros map[Macro]MacroFn
idents map[Identifier]Expr
}
// NewCtx returns a blank context instance
func NewCtx() Ctx {
return Ctx{
global: globalCtx,
macros: map[Macro]MacroFn{},
idents: map[Identifier]Expr{},
}
}
// Macro returns the MacroFn associated with the given identifier, or panics
// if the macro isn't found
func (c Ctx) Macro(m Macro) MacroFn {
if fn := c.macros[m]; fn != nil {
return fn
}
if fn := c.global.macros[m]; fn != nil {
return fn
}
panicf("macro %q not found in context", m)
return nil
}
// Identifier returns the llvm.Value for the Identifier, or panics
func (c Ctx) Identifier(i Identifier) Expr {
if e := c.idents[i]; e != nil {
return e
}
// The global context doesn't have any identifiers, so don't bother checking
panicf("identifier %q not found", i)
panic("unreachable") // panicf above never returns, but the compiler can't prove it
}
// Copy returns a deep copy of the Ctx
func (c Ctx) Copy() Ctx {
cc := Ctx{
global: c.global,
macros: make(map[Macro]MacroFn, len(c.macros)),
idents: make(map[Identifier]Expr, len(c.idents)),
}
for m, mfn := range c.macros {
cc.macros[m] = mfn
}
for i, e := range c.idents {
cc.idents[i] = e
}
return cc
}
// Bind binds the given Identifier to the given Expr in this Ctx. Will panic if
// the Identifier is already bound
func (c Ctx) Bind(i Identifier, e Expr) {
if _, ok := c.idents[i]; ok {
panicf("identifier %q is already bound", i)
}
c.idents[i] = e
}
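The two-level resolution in Ctx.Macro (check the local table first, then fall back to the shared global table) can be sketched with plain maps. The `lookup` helper below is a hypothetical stand-in for illustration, not part of the package's API:

```go
package main

import "fmt"

// lookup mirrors Ctx.Macro's resolution order: the local table wins,
// and the global table is only consulted on a local miss.
func lookup(local, global map[string]string, name string) (string, bool) {
	if v, ok := local[name]; ok {
		return v, true
	}
	if v, ok := global[name]; ok {
		return v, true
	}
	return "", false
}

func main() {
	global := map[string]string{"add": "builtin-add"}
	local := map[string]string{"incr": "user-incr"}
	fmt.Println(lookup(local, global, "incr")) // local hit
	fmt.Println(lookup(local, global, "add"))  // global fallback
}
```

The real Ctx panics on a miss instead of returning a boolean; the shape of the lookup is the same.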


@ -1,210 +0,0 @@
package expr
import (
"fmt"
"llvm.org/llvm/bindings/go/llvm"
)
// Expr represents the actual expression in question.
type Expr interface{}
// equaler is used to compare two expressions. The comparison should not take
// into account Token values, only the actual value being represented
type equaler interface {
equal(equaler) bool
}
// will panic if either Expr doesn't implement equaler
func exprEqual(e1, e2 Expr) bool {
eq1, ok1 := e1.(equaler)
eq2, ok2 := e2.(equaler)
if !ok1 || !ok2 {
panic(fmt.Sprintf("can't compare %T and %T", e1, e2))
}
return eq1.equal(eq2)
}
////////////////////////////////////////////////////////////////////////////////
// an Expr which simply wraps an existing llvm.Value
type llvmVal llvm.Value
/*
func voidVal(lctx LLVMCtx) llvmVal {
return llvmVal{lctx.B.CreateRetVoid()}
}
*/
////////////////////////////////////////////////////////////////////////////////
/*
// Void represents no data (size = 0)
type Void struct{}
func (v Void) equal(e equaler) bool {
_, ok := e.(Void)
return ok
}
*/
////////////////////////////////////////////////////////////////////////////////
/*
// Bool represents a true or false value
type Bool bool
func (b Bool) equal(e equaler) bool {
bb, ok := e.(Bool)
if !ok {
return false
}
return bb == b
}
*/
////////////////////////////////////////////////////////////////////////////////
// Int represents an integer value
type Int int64
func (i Int) equal(e equaler) bool {
ii, ok := e.(Int)
return ok && ii == i
}
func (i Int) String() string {
return fmt.Sprintf("%d", i)
}
////////////////////////////////////////////////////////////////////////////////
/*
// String represents a string value
type String string
func (s String) equal(e equaler) bool {
ss, ok := e.(String)
if !ok {
return false
}
return ss == s
}
*/
////////////////////////////////////////////////////////////////////////////////
// Identifier represents a binding to some other value which has been given a
// name
type Identifier string
func (id Identifier) equal(e equaler) bool {
idid, ok := e.(Identifier)
return ok && idid == id
}
func isIdentifier(e Expr) bool {
_, ok := e.(Identifier)
return ok
}
////////////////////////////////////////////////////////////////////////////////
// Macro is an identifier for a macro which can be used to transform
// expressions. The tokens for macros start with a '%', but the Macro identifier
// itself has that stripped off
type Macro string
// String returns the Macro with a '%' prepended to it
func (m Macro) String() string {
return "%" + string(m)
}
func (m Macro) equal(e equaler) bool {
mm, ok := e.(Macro)
return ok && m == mm
}
////////////////////////////////////////////////////////////////////////////////
// Tuple represents a fixed set of expressions which are interacted with as if
// they were a single value
type Tuple []Expr
// NewTuple returns a Tuple around the given list of Exprs
func NewTuple(ee ...Expr) Tuple {
return Tuple(ee)
}
func (tup Tuple) String() string {
return "(" + exprsJoin(tup) + ")"
}
func (tup Tuple) equal(e equaler) bool {
tuptup, ok := e.(Tuple)
return ok && exprsEqual(tup, tuptup)
}
func isTuple(e Expr) bool {
_, ok := e.(Tuple)
return ok
}
////////////////////////////////////////////////////////////////////////////////
// List represents an ordered set of Exprs, all of the same type. A List's size
// does not affect its type signature, unlike a Tuple
type List []Expr
// NewList returns a List around the given list of Exprs
func NewList(ee ...Expr) List {
return List(ee)
}
func (l List) String() string {
return "[" + exprsJoin(l) + "]"
}
func (l List) equal(e equaler) bool {
ll, ok := e.(List)
return ok && exprsEqual(l, ll)
}
func isList(e Expr) bool {
_, ok := e.(List)
return ok
}
////////////////////////////////////////////////////////////////////////////////
// Statement represents an actual action which will be taken. The input value is
// used as the input to the pipe, and the output of the pipe is the output of
// the statement
type Statement struct {
Op, Arg Expr
}
// NewStatement returns a Statement whose Op is the first Expr. If the given
// list is empty Arg will be a 0-tuple, if its length is one Arg will be that
// single Expr, otherwise Arg will be a Tuple of the list
func NewStatement(e Expr, ee ...Expr) Statement {
s := Statement{Op: e}
if len(ee) > 1 {
s.Arg = NewTuple(ee...)
} else if len(ee) == 1 {
s.Arg = ee[0]
} else {
s.Arg = NewTuple()
}
return s
}
func (s Statement) String() string {
return fmt.Sprintf("(%v %s)", s.Op, s.Arg)
}
func (s Statement) equal(e equaler) bool {
ss, ok := e.(Statement)
return ok && exprEqual(s.Op, ss.Op) && exprEqual(s.Arg, ss.Arg)
}
func isStmt(e Expr) bool {
_, ok := e.(Statement)
return ok
}


@ -1,299 +0,0 @@
package expr
//type exprErr struct {
// reason string
// err error
// tok lexer.Token
// tokCtx string // e.g. "block starting at" or "open paren at"
//}
//
//func (e exprErr) Error() string {
// var msg string
// if e.err != nil {
// msg = e.err.Error()
// } else {
// msg = e.reason
// }
// if err := e.tok.Err(); err != nil {
// msg += " - token error: " + err.Error()
// } else if (e.tok != lexer.Token{}) {
// msg += " - "
// if e.tokCtx != "" {
// msg += e.tokCtx + ": "
// }
// msg = fmt.Sprintf("%s [line:%d col:%d]", msg, e.tok.Row, e.tok.Col)
// }
// return msg
//}
//
//////////////////////////////////////////////////////////////////////////////////
//
//// toks[0] must be start
//func sliceEnclosedToks(toks []lexer.Token, start, end lexer.Token) ([]lexer.Token, []lexer.Token, error) {
// c := 1
// ret := []lexer.Token{}
// first := toks[0]
// for i, tok := range toks[1:] {
// if tok.Err() != nil {
// return nil, nil, exprErr{
// reason: fmt.Sprintf("missing closing %v", end),
// tok: tok,
// }
// }
//
// if tok.Equal(start) {
// c++
// } else if tok.Equal(end) {
// c--
// }
// if c == 0 {
// return ret, toks[2+i:], nil
// }
// ret = append(ret, tok)
// }
//
// return nil, nil, exprErr{
// reason: fmt.Sprintf("missing closing %v", end),
// tok: first,
// tokCtx: "starting at",
// }
//}
//
//// Parse reads in all expressions it can from the given io.Reader and returns
//// them
//func Parse(r io.Reader) ([]Expr, error) {
// toks := readAllToks(r)
// var ret []Expr
// var expr Expr
// var err error
// for len(toks) > 0 {
// if toks[0].TokenType == lexer.EOF {
// return ret, nil
// }
// expr, toks, err = parse(toks)
// if err != nil {
// return nil, err
// }
// ret = append(ret, expr)
// }
// return ret, nil
//}
//
//// ParseAsBlock reads the given io.Reader as if it was implicitly surrounded by
//// curly braces, making it into a Block. This means all expressions from the
//// io.Reader *must* be statements. The returned Expr's Actual will always be a
//// Block.
//func ParseAsBlock(r io.Reader) (Expr, error) {
// return parseBlock(readAllToks(r))
//}
//
//func readAllToks(r io.Reader) []lexer.Token {
// l := lexer.New(r)
// var toks []lexer.Token
// for l.HasNext() {
// toks = append(toks, l.Next())
// }
// return toks
//}
//
//// For all parse methods it is assumed that toks is not empty
//
//var (
// openParen = lexer.Token{TokenType: lexer.Wrapper, Val: "("}
// closeParen = lexer.Token{TokenType: lexer.Wrapper, Val: ")"}
// openCurly = lexer.Token{TokenType: lexer.Wrapper, Val: "{"}
// closeCurly = lexer.Token{TokenType: lexer.Wrapper, Val: "}"}
// comma = lexer.Token{TokenType: lexer.Punctuation, Val: ","}
// arrow = lexer.Token{TokenType: lexer.Punctuation, Val: ">"}
//)
//
//func parse(toks []lexer.Token) (Expr, []lexer.Token, error) {
// expr, toks, err := parseSingle(toks)
// if err != nil {
// return Expr{}, nil, err
// }
//
// if len(toks) > 0 && toks[0].TokenType == lexer.Punctuation {
// return parseConnectingPunct(toks, expr)
// }
//
// return expr, toks, nil
//}
//
//func parseSingle(toks []lexer.Token) (Expr, []lexer.Token, error) {
// var expr Expr
// var err error
//
// if toks[0].Err() != nil {
// return Expr{}, nil, exprErr{
// reason: "could not parse token",
// tok: toks[0],
// }
// }
//
// if toks[0].Equal(openParen) {
// starter := toks[0]
// var ptoks []lexer.Token
// ptoks, toks, err = sliceEnclosedToks(toks, openParen, closeParen)
// if err != nil {
// return Expr{}, nil, err
// }
//
// if expr, ptoks, err = parse(ptoks); err != nil {
// return Expr{}, nil, err
// } else if len(ptoks) > 0 {
// return Expr{}, nil, exprErr{
// reason: "multiple expressions inside parenthesis",
// tok: starter,
// tokCtx: "starting at",
// }
// }
// return expr, toks, nil
//
// } else if toks[0].Equal(openCurly) {
// var btoks []lexer.Token
// btoks, toks, err = sliceEnclosedToks(toks, openCurly, closeCurly)
// if err != nil {
// return Expr{}, nil, err
// }
//
// if expr, err = parseBlock(btoks); err != nil {
// return Expr{}, nil, err
// }
// return expr, toks, nil
// }
//
// if expr, err = parseNonPunct(toks[0]); err != nil {
// return Expr{}, nil, err
// }
// return expr, toks[1:], nil
//}
//
//func parseNonPunct(tok lexer.Token) (Expr, error) {
// if tok.TokenType == lexer.Identifier {
// return parseIdentifier(tok)
// } else if tok.TokenType == lexer.String {
// //return parseString(tok)
// }
//
// return Expr{}, exprErr{
// reason: "unexpected non-punctuation token",
// tok: tok,
// }
//}
//
//func parseIdentifier(t lexer.Token) (Expr, error) {
// e := Expr{Token: t}
// if t.Val[0] == '-' || (t.Val[0] >= '0' && t.Val[0] <= '9') {
// n, err := strconv.ParseInt(t.Val, 10, 64)
// if err != nil {
// return Expr{}, exprErr{
// err: err,
// tok: t,
// }
// }
// e.Actual = Int(n)
//
// /*
// } else if t.Val == "%true" {
// e.Actual = Bool(true)
//
// } else if t.Val == "%false" {
// e.Actual = Bool(false)
// */
//
// } else if t.Val[0] == '%' {
// e.Actual = Macro(t.Val[1:])
//
// } else {
// e.Actual = Identifier(t.Val)
// }
//
// return e, nil
//}
//
///*
//func parseString(t lexer.Token) (Expr, error) {
// str, err := strconv.Unquote(t.Val)
// if err != nil {
// return Expr{}, exprErr{
// err: err,
// tok: t,
// }
// }
// return Expr{Token: t, Actual: String(str)}, nil
//}
//*/
//
//func parseConnectingPunct(toks []lexer.Token, root Expr) (Expr, []lexer.Token, error) {
// if toks[0].Equal(comma) {
// return parseTuple(toks, root)
//
// } else if toks[0].Equal(arrow) {
// expr, toks, err := parse(toks[1:])
// if err != nil {
// return Expr{}, nil, err
// }
// return Expr{Token: root.Token, Actual: Statement{In: root, To: expr}}, toks, nil
// }
//
// return root, toks, nil
//}
//
//func parseTuple(toks []lexer.Token, root Expr) (Expr, []lexer.Token, error) {
// rootTup, ok := root.Actual.(Tuple)
// if !ok {
// rootTup = Tuple{root}
// }
//
// 	// rootTup is modified throughout, but we need to make it into an Expr for
// // every return, which is annoying. so make a function to do it on the fly
// mkRoot := func() Expr {
// return Expr{Token: rootTup[0].Token, Actual: rootTup}
// }
//
// if len(toks) < 2 {
// return mkRoot(), toks, nil
// } else if !toks[0].Equal(comma) {
// if toks[0].TokenType == lexer.Punctuation {
// return parseConnectingPunct(toks, mkRoot())
// }
// return mkRoot(), toks, nil
// }
//
// var expr Expr
// var err error
// if expr, toks, err = parseSingle(toks[1:]); err != nil {
// return Expr{}, nil, err
// }
//
// rootTup = append(rootTup, expr)
// return parseTuple(toks, mkRoot())
//}
//
//// parseBlock assumes that the given token list is the entire block, already
//// pulled from outer curly braces by sliceEnclosedToks, or determined to be the
//// entire block in some other way.
//func parseBlock(toks []lexer.Token) (Expr, error) {
// b := Block{}
// first := toks[0]
// var expr Expr
// var err error
// for {
// if len(toks) == 0 {
// return Expr{Token: first, Actual: b}, nil
// }
//
// if expr, toks, err = parse(toks); err != nil {
// return Expr{}, err
// }
// if _, ok := expr.Actual.(Statement); !ok {
// return Expr{}, exprErr{
// reason: "blocks may only contain full statements",
// tok: expr.Token,
// tokCtx: "non-statement here",
// }
// }
// b = append(b, expr)
// }
//}


@ -1,149 +0,0 @@
package expr
//import . "testing"
//func TestSliceEnclosedToks(t *T) {
// doAssert := func(in, expOut, expRem []lexer.Token) {
// out, rem, err := sliceEnclosedToks(in, openParen, closeParen)
// require.Nil(t, err)
// assert.Equal(t, expOut, out)
// assert.Equal(t, expRem, rem)
// }
// foo := lexer.Token{TokenType: lexer.Identifier, Val: "foo"}
// bar := lexer.Token{TokenType: lexer.Identifier, Val: "bar"}
//
// toks := []lexer.Token{openParen, closeParen}
// doAssert(toks, []lexer.Token{}, []lexer.Token{})
//
// toks = []lexer.Token{openParen, foo, closeParen, bar}
// doAssert(toks, []lexer.Token{foo}, []lexer.Token{bar})
//
// toks = []lexer.Token{openParen, foo, foo, closeParen, bar, bar}
// doAssert(toks, []lexer.Token{foo, foo}, []lexer.Token{bar, bar})
//
// toks = []lexer.Token{openParen, foo, openParen, bar, closeParen, closeParen}
// doAssert(toks, []lexer.Token{foo, openParen, bar, closeParen}, []lexer.Token{})
//
// toks = []lexer.Token{openParen, foo, openParen, bar, closeParen, bar, closeParen, foo}
// doAssert(toks, []lexer.Token{foo, openParen, bar, closeParen, bar}, []lexer.Token{foo})
//}
//
//func assertParse(t *T, in []lexer.Token, expExpr Expr, expOut []lexer.Token) {
// expr, out, err := parse(in)
// require.Nil(t, err)
// assert.True(t, expExpr.equal(expr), "expr:%+v expExpr:%+v", expr, expExpr)
// assert.Equal(t, expOut, out, "out:%v expOut:%v", out, expOut)
//}
//
//func TestParseSingle(t *T) {
// foo := lexer.Token{TokenType: lexer.Identifier, Val: "foo"}
// fooM := lexer.Token{TokenType: lexer.Identifier, Val: "%foo"}
// fooExpr := Expr{Actual: Identifier("foo")}
// fooMExpr := Expr{Actual: Macro("foo")}
//
// toks := []lexer.Token{foo}
// assertParse(t, toks, fooExpr, []lexer.Token{})
//
// toks = []lexer.Token{foo, foo}
// assertParse(t, toks, fooExpr, []lexer.Token{foo})
//
// toks = []lexer.Token{openParen, foo, closeParen, foo}
// assertParse(t, toks, fooExpr, []lexer.Token{foo})
//
// toks = []lexer.Token{openParen, openParen, foo, closeParen, closeParen, foo}
// assertParse(t, toks, fooExpr, []lexer.Token{foo})
//
// toks = []lexer.Token{fooM, foo}
// assertParse(t, toks, fooMExpr, []lexer.Token{foo})
//}
//
//func TestParseTuple(t *T) {
// tup := func(ee ...Expr) Expr {
// return Expr{Actual: Tuple(ee)}
// }
//
// foo := lexer.Token{TokenType: lexer.Identifier, Val: "foo"}
// fooExpr := Expr{Actual: Identifier("foo")}
//
// toks := []lexer.Token{foo, comma, foo}
// assertParse(t, toks, tup(fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, comma, foo, foo}
// assertParse(t, toks, tup(fooExpr, fooExpr), []lexer.Token{foo})
//
// toks = []lexer.Token{foo, comma, foo, comma, foo}
// assertParse(t, toks, tup(fooExpr, fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, comma, foo, comma, foo, comma, foo}
// assertParse(t, toks, tup(fooExpr, fooExpr, fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, comma, openParen, foo, comma, foo, closeParen, comma, foo}
// assertParse(t, toks, tup(fooExpr, tup(fooExpr, fooExpr), fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, comma, openParen, foo, comma, foo, closeParen, comma, foo, foo}
// assertParse(t, toks, tup(fooExpr, tup(fooExpr, fooExpr), fooExpr), []lexer.Token{foo})
//}
//
//func TestParseStatement(t *T) {
// stmt := func(in, to Expr) Expr {
// return Expr{Actual: Statement{In: in, To: to}}
// }
//
// foo := lexer.Token{TokenType: lexer.Identifier, Val: "foo"}
// fooExpr := Expr{Actual: Identifier("foo")}
//
// toks := []lexer.Token{foo, arrow, foo}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{openParen, foo, arrow, foo, closeParen}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, arrow, openParen, foo, closeParen}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{foo, arrow, foo, foo}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{foo})
//
// toks = []lexer.Token{foo, arrow, openParen, foo, closeParen, foo}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{foo})
//
// toks = []lexer.Token{openParen, foo, closeParen, arrow, openParen, foo, closeParen, foo}
// assertParse(t, toks, stmt(fooExpr, fooExpr), []lexer.Token{foo})
//
// fooTupExpr := Expr{Actual: Tuple{fooExpr, fooExpr}}
// toks = []lexer.Token{foo, arrow, openParen, foo, comma, foo, closeParen, foo}
// assertParse(t, toks, stmt(fooExpr, fooTupExpr), []lexer.Token{foo})
//
// toks = []lexer.Token{foo, comma, foo, arrow, foo}
// assertParse(t, toks, stmt(fooTupExpr, fooExpr), []lexer.Token{})
//
// toks = []lexer.Token{openParen, foo, comma, foo, closeParen, arrow, foo}
// assertParse(t, toks, stmt(fooTupExpr, fooExpr), []lexer.Token{})
//}
//
//func TestParseBlock(t *T) {
// stmt := func(in, to Expr) Expr {
// return Expr{Actual: Statement{In: in, To: to}}
// }
// block := func(stmts ...Expr) Expr {
// return Expr{Actual: Block(stmts)}
// }
//
// foo := lexer.Token{TokenType: lexer.Identifier, Val: "foo"}
// fooExpr := Expr{Actual: Identifier("foo")}
//
// toks := []lexer.Token{openCurly, foo, arrow, foo, closeCurly}
// assertParse(t, toks, block(stmt(fooExpr, fooExpr)), []lexer.Token{})
//
// toks = []lexer.Token{openCurly, foo, arrow, foo, closeCurly, foo}
// assertParse(t, toks, block(stmt(fooExpr, fooExpr)), []lexer.Token{foo})
//
// toks = []lexer.Token{openCurly, foo, arrow, foo, openParen, foo, arrow, foo, closeParen, closeCurly, foo}
// assertParse(t, toks, block(stmt(fooExpr, fooExpr), stmt(fooExpr, fooExpr)), []lexer.Token{foo})
//}


@ -1,40 +0,0 @@
package expr
import (
"encoding/hex"
"fmt"
"math/rand"
"strings"
)
func randStr() string {
b := make([]byte, 16)
if _, err := rand.Read(b); err != nil {
panic(err)
}
return hex.EncodeToString(b)
}
func exprsJoin(ee []Expr) string {
strs := make([]string, len(ee))
for i := range ee {
strs[i] = fmt.Sprint(ee[i])
}
return strings.Join(strs, ", ")
}
func exprsEqual(ee1, ee2 []Expr) bool {
if len(ee1) != len(ee2) {
return false
}
for i := range ee1 {
if !exprEqual(ee1[i], ee2[i]) {
return false
}
}
return true
}
func panicf(msg string, args ...interface{}) {
panic(fmt.Sprintf(msg, args...))
}


@ -1,522 +0,0 @@
// Package graph implements an immutable unidirectional graph.
package graph
import (
"crypto/rand"
"encoding/hex"
"fmt"
"sort"
"strings"
)
// Value wraps a go value in a way such that it will be uniquely identified
// within any Graph and between Graphs. Use NewValue to create a Value instance.
// You can create an instance manually as long as ID is globally unique.
type Value struct {
ID string
V interface{}
}
// Void is the absence of any value.
var Void Value
// NewValue returns a Value instance wrapping any go value. The Value returned
// will be independent of the passed in go value. So if the same go value is
// passed in twice then the two returned Value instances will be treated as
// being different values by Graph.
func NewValue(V interface{}) Value {
b := make([]byte, 8)
if _, err := rand.Read(b); err != nil {
panic(err)
}
return Value{
ID: hex.EncodeToString(b),
V: V,
}
}
// Edge is a directional edge connecting two values in a Graph, the Tail and the
// Head.
type Edge interface {
Tail() Value // The Value the Edge is coming from
Head() Value // The Value the Edge is going to
}
func edgeID(e Edge) string {
return fmt.Sprintf("%q->%q", e.Tail().ID, e.Head().ID)
}
type edge struct {
tail, head Value
}
// NewEdge constructs and returns an Edge running from tail to head.
func NewEdge(tail, head Value) Edge {
return edge{tail, head}
}
func (e edge) Tail() Value {
return e.tail
}
func (e edge) Head() Value {
return e.head
}
func (e edge) String() string {
return edgeID(e)
}
// NOTE the Node type exists primarily for convenience. As far as Graph's
// internals are concerned it doesn't _really_ exist, and no Graph method should
// ever take Node as a parameter (except the callback functions like in
// Traverse, where it's not really being taken in).
// Node wraps a Value in a Graph to include that Node's input and output Edges
// in that Graph.
type Node struct {
Value
// All Edges in the Graph with this Node's Value as their Head and Tail,
// respectively. These should not be expected to be deterministic.
Ins, Outs []Edge
}
// an edgeIndex maps valueIDs to a set of edgeIDs. Graph keeps two edgeIndex's,
// one for input edges and one for output edges.
type edgeIndex map[string]map[string]struct{}
func (ei edgeIndex) cp() edgeIndex {
if ei == nil {
return edgeIndex{}
}
ei2 := make(edgeIndex, len(ei))
for valID, edgesM := range ei {
edgesM2 := make(map[string]struct{}, len(edgesM))
for id := range edgesM {
edgesM2[id] = struct{}{}
}
ei2[valID] = edgesM2
}
return ei2
}
func (ei edgeIndex) add(valID, edgeID string) {
edgesM, ok := ei[valID]
if !ok {
edgesM = map[string]struct{}{}
ei[valID] = edgesM
}
edgesM[edgeID] = struct{}{}
}
func (ei edgeIndex) del(valID, edgeID string) {
edgesM, ok := ei[valID]
if !ok {
return
}
delete(edgesM, edgeID)
if len(edgesM) == 0 {
delete(ei, valID)
}
}
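edgeIndex is a two-level map (value ID to a set of edge IDs), and its cp method must copy both levels so a mutated copy never leaks back into the original. A standalone sketch of that deep-copy, with a hypothetical `index` type in place of edgeIndex:

```go
package main

import "fmt"

// index maps a value ID to a set of edge IDs, like edgeIndex above.
type index map[string]map[string]struct{}

// deepCopy copies both map levels, so the copy and the original can
// be mutated independently (a single-level copy would share the sets).
func (ix index) deepCopy() index {
	ix2 := make(index, len(ix))
	for valID, set := range ix {
		set2 := make(map[string]struct{}, len(set))
		for id := range set {
			set2[id] = struct{}{}
		}
		ix2[valID] = set2
	}
	return ix2
}

func main() {
	orig := index{"a": {"e1": {}}}
	cp := orig.deepCopy()
	cp["a"]["e2"] = struct{}{} // mutate the copy only
	fmt.Println(len(orig["a"]), len(cp["a"])) // 1 2
}
```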
// Graph implements an immutable, unidirectional graph which can hold generic
// values. All methods are thread-safe as they don't modify the Graph in any
// way.
//
// The Graph's zero value is the initial empty graph.
//
// The Graph does not keep track of Edge ordering. Assume that all slices of
// Edges are in random order.
type Graph interface {
// Empty returns a graph with no edges which is of the same underlying type
// as this one.
Empty() Graph
// Add returns a new Graph instance with the given Edge added to it. If the
// original Graph already had that Edge this returns the original Graph.
Add(Edge) Graph
// Del returns a new Graph instance without the given Edge in it. If the
// original Graph didn't have that Edge this returns the original Graph.
Del(Edge) Graph
// Edges returns all Edges which are part of the Graph, mapped using a
// string ID which is unique within the Graph and between Graphs of the same
// underlying type.
Edges() map[string]Edge
// EdgesTo returns all Edges whose Head is the given Value.
EdgesTo(v Value) []Edge
// EdgesFrom returns all Edges whose Tail is the given Value.
EdgesFrom(v Value) []Edge
// Has returns true if the Graph contains at least one Edge with a Head or
// Tail of Value.
Has(v Value) bool
}
type graph struct {
m map[string]Edge
// these are indices mapping Value IDs to all the in/out edges for that
// Value in the Graph.
vIns, vOuts edgeIndex
}
// Null is the empty graph from which all other Graphs are built.
var Null = (graph{}).Empty()
func (g graph) Empty() Graph {
return (graph{}).cp() // cp also initializes
}
func (g graph) cp() graph {
g2 := graph{
m: make(map[string]Edge, len(g.m)),
vIns: g.vIns.cp(),
vOuts: g.vOuts.cp(),
}
for id, e := range g.m {
g2.m[id] = e
}
return g2
}
func (g graph) String() string {
edgeStrs := make([]string, 0, len(g.m))
for _, edge := range g.m {
edgeStrs = append(edgeStrs, fmt.Sprint(edge))
}
sort.Strings(edgeStrs)
return "Graph{" + strings.Join(edgeStrs, ",") + "}"
}
func (g graph) Add(e Edge) Graph {
id := edgeID(e)
if _, ok := g.m[id]; ok {
return g
}
g2 := g.cp()
g2.addDirty(id, e)
return g2
}
func (g graph) addDirty(edgeID string, e Edge) {
g.m[edgeID] = e
g.vIns.add(e.Head().ID, edgeID)
g.vOuts.add(e.Tail().ID, edgeID)
}
// addDirty adds the Edge to the Graph using the Graph's addDirty method if it
// has one, falling back to the normal Add otherwise
func addDirty(g Graph, edgeID string, e Edge) Graph {
gd, ok := g.(interface {
addDirty(string, Edge)
})
if !ok {
return g.Add(e)
}
gd.addDirty(edgeID, e)
return g
}
func (g graph) Del(e Edge) Graph {
id := edgeID(e)
if _, ok := g.m[id]; !ok {
return g
}
g2 := g.cp()
delete(g2.m, id)
g2.vIns.del(e.Head().ID, id)
g2.vOuts.del(e.Tail().ID, id)
return g2
}
func (g graph) Edges() map[string]Edge {
return g.m
}
func (g graph) EdgesTo(v Value) []Edge {
vIns := g.vIns[v.ID]
ins := make([]Edge, 0, len(vIns))
for edgeID := range vIns {
ins = append(ins, g.m[edgeID])
}
return ins
}
func (g graph) EdgesFrom(v Value) []Edge {
vOuts := g.vOuts[v.ID]
outs := make([]Edge, 0, len(vOuts))
for edgeID := range vOuts {
outs = append(outs, g.m[edgeID])
}
return outs
}
func (g graph) Has(v Value) bool {
if _, ok := g.vIns[v.ID]; ok {
return true
} else if _, ok := g.vOuts[v.ID]; ok {
return true
}
return false
}
////////////////////////////////////////////////////////////////////////////////
// Disjoin looks at the whole Graph and returns all sub-graphs of it which don't
// share any Edges between each other.
func Disjoin(g Graph) []Graph {
empty := g.Empty()
edges := g.Edges()
valM := make(map[string]*Graph, len(edges))
graphForEdge := func(edge Edge) *Graph {
headGraph := valM[edge.Head().ID]
tailGraph := valM[edge.Tail().ID]
if headGraph == nil && tailGraph == nil {
newGraph := empty.Empty()
return &newGraph
} else if headGraph == nil && tailGraph != nil {
return tailGraph
} else if headGraph != nil && tailGraph == nil {
return headGraph
} else if headGraph == tailGraph {
return headGraph // doesn't matter which is returned
}
// the two values are part of different graphs, join the smaller into
// the larger and change all values which were pointing to it to point
// into the larger (which will then be the join of them)
tailEdges := (*tailGraph).Edges()
if headEdges := (*headGraph).Edges(); len(headEdges) > len(tailEdges) {
headGraph, tailGraph = tailGraph, headGraph
tailEdges = headEdges
}
for edgeID, edge := range tailEdges {
*headGraph = addDirty(*headGraph, edgeID, edge)
}
for valID, valGraph := range valM {
if valGraph == tailGraph {
valM[valID] = headGraph
}
}
return headGraph
}
for edgeID, edge := range edges {
graph := graphForEdge(edge)
*graph = addDirty(*graph, edgeID, edge)
valM[edge.Head().ID] = graph
valM[edge.Tail().ID] = graph
}
found := map[*Graph]bool{}
graphs := make([]Graph, 0, len(valM))
for _, graph := range valM {
if found[graph] {
continue
}
found[graph] = true
graphs = append(graphs, *graph)
}
return graphs
}
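The grouping Disjoin performs, repeatedly merging the sub-graphs that an edge's head and tail already belong to, is the usual connected-components computation. A standalone sketch over bare (tail, head) ID pairs, with a hypothetical `components` helper in place of the Graph machinery:

```go
package main

import "fmt"

// components counts the connected groups among edges given as
// (tail, head) ID pairs, mirroring Disjoin's merge-on-shared-endpoint
// logic with integer labels instead of *Graph pointers.
func components(edges [][2]string) int {
	comp := map[string]int{} // value ID -> component label
	next := 0
	for _, e := range edges {
		ca, okA := comp[e[0]]
		cb, okB := comp[e[1]]
		switch {
		case !okA && !okB: // both endpoints unseen: start a new component
			comp[e[0]], comp[e[1]] = next, next
			next++
		case okA && !okB:
			comp[e[1]] = ca
		case !okA && okB:
			comp[e[0]] = cb
		case ca != cb: // endpoints in different components: merge them
			for id, c := range comp {
				if c == cb {
					comp[id] = ca
				}
			}
		}
	}
	seen := map[int]bool{}
	for _, c := range comp {
		seen[c] = true
	}
	return len(seen)
}

func main() {
	fmt.Println(components([][2]string{{"a", "b"}, {"b", "c"}, {"x", "y"}})) // 2
	fmt.Println(components([][2]string{{"a", "b"}, {"c", "d"}, {"b", "c"}})) // 1
}
```

The real Disjoin also joins the smaller edge set into the larger one before relabeling, which this label-sweep sketch skips.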
// Join returns a new Graph which shares all Edges of all given Graphs. All
// given Graphs must be of the same underlying type.
func Join(graphs ...Graph) Graph {
g2 := graphs[0].Empty()
for _, graph := range graphs {
for edgeID, edge := range graph.Edges() {
g2 = addDirty(g2, edgeID, edge)
}
}
return g2
}
// GetNode returns the Node for the given Value, or false if the Graph doesn't
// contain the Value.
func GetNode(g Graph, v Value) (Node, bool) {
n := Node{
Value: v,
Ins: g.EdgesTo(v),
Outs: g.EdgesFrom(v),
}
return n, len(n.Ins) > 0 || len(n.Outs) > 0
}
// GetNodes returns a Node for each Value which has at least one Edge in the
// Graph, with the Nodes mapped by their Value's ID.
func GetNodes(g Graph) map[string]Node {
edges := g.Edges()
nodesM := make(map[string]Node, len(edges)*2)
for _, edge := range edges {
// if head and tail are modified at the same time it messes up the case
// where they are the same node
{
headV := edge.Head()
head := nodesM[headV.ID]
head.Value = headV
head.Ins = append(head.Ins, edge)
nodesM[head.ID] = head
}
{
tailV := edge.Tail()
tail := nodesM[tailV.ID]
tail.Value = tailV
tail.Outs = append(tail.Outs, edge)
nodesM[tail.ID] = tail
}
}
return nodesM
}
// Traverse is used to traverse the Graph until a stopping point is reached.
// Traversal starts with the cursor at the given start Value. Each hop is
// performed by passing the cursor Value's Node into the next function. The
// cursor moves to the returned Value and next is called again, and so on.
//
// If the boolean returned from the next function is false traversal stops and
// this method returns.
//
// If start has no Edges in the Graph, or a Value returned from next doesn't,
// this will still call next, but the Node will be the zero value.
func Traverse(g Graph, start Value, next func(n Node) (Value, bool)) {
curr := start
for {
currNode, ok := GetNode(g, curr)
if ok {
curr, ok = next(currNode)
} else {
curr, ok = next(Node{})
}
if !ok {
return
}
}
}
// VisitBreadth is like Traverse, except that each Node is only visited once,
// and the order of visited Nodes is determined by traversing each Node's output
// Edges breadth-wise.
//
// If the boolean returned from the callback function is false, or the start
// Value has no edges in the Graph, traversal stops and this method returns.
//
// The exact order of Nodes visited is _not_ deterministic.
func VisitBreadth(g Graph, start Value, callback func(n Node) bool) {
visited := map[string]bool{}
toVisit := make([]Value, 0, 16)
toVisit = append(toVisit, start)
for {
if len(toVisit) == 0 {
return
}
// shift val off front
val := toVisit[0]
toVisit = toVisit[1:]
if visited[val.ID] {
continue
}
node, ok := GetNode(g, val)
if !ok {
continue
} else if !callback(node) {
return
}
visited[val.ID] = true
for _, edge := range node.Outs {
headV := edge.Head()
if visited[headV.ID] {
continue
}
toVisit = append(toVisit, headV)
}
}
}
// VisitDepth is like Traverse, except that each Node is only visited once,
// and the order of visited Nodes is determined by traversing each Node's output
// Edges depth-wise.
//
// If the boolean returned from the callback function is false, or the start
// Value has no edges in the Graph, traversal stops and this method returns.
//
// The exact order of Nodes visited is _not_ deterministic.
func VisitDepth(g Graph, start Value, callback func(n Node) bool) {
// VisitDepth is actually the same as VisitBreadth, only you read off the
// toVisit list from back-to-front
visited := map[string]bool{}
toVisit := make([]Value, 0, 16)
toVisit = append(toVisit, start)
for {
if len(toVisit) == 0 {
return
}
		val := toVisit[len(toVisit)-1] // pop val off back
		toVisit = toVisit[:len(toVisit)-1]
if visited[val.ID] {
continue
}
node, ok := GetNode(g, val)
if !ok {
continue
} else if !callback(node) {
return
}
visited[val.ID] = true
for _, edge := range node.Outs {
if visited[edge.Head().ID] {
continue
}
toVisit = append(toVisit, edge.Head())
}
}
}
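VisitBreadth and VisitDepth share one worklist pattern; the only difference is which end of toVisit the next value comes from. A standalone sketch over a plain adjacency map (assumed types, not this package's API) makes the two orders visible:

```go
package main

import "fmt"

// visit walks adj from start, visiting each node once. If fromBack is true
// the worklist is read LIFO (depth-wise), otherwise FIFO (breadth-wise).
func visit(adj map[string][]string, start string, fromBack bool) []string {
	visited := map[string]bool{}
	toVisit := []string{start}
	var order []string
	for len(toVisit) > 0 {
		var val string
		if fromBack {
			val = toVisit[len(toVisit)-1] // pop off back
			toVisit = toVisit[:len(toVisit)-1]
		} else {
			val = toVisit[0] // shift off front
			toVisit = toVisit[1:]
		}
		if visited[val] {
			continue
		}
		visited[val] = true
		order = append(order, val)
		toVisit = append(toVisit, adj[val]...)
	}
	return order
}

func main() {
	adj := map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {}, "d": {}}
	fmt.Println(visit(adj, "a", false)) // breadth: [a b c d]
	fmt.Println(visit(adj, "a", true))  // depth:   [a c b d]
}
```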
// edgesShared returns true if every Edge in g2 is also present in g.
func edgesShared(g, g2 Graph) bool {
gEdges := g.Edges()
for id := range g2.Edges() {
if _, ok := gEdges[id]; !ok {
return false
}
}
return true
}
// SubGraph returns true if g2 is a sub-graph of g; i.e., all edges in g2 are
// also in g. Both Graphs should be of the same underlying type.
func SubGraph(g, g2 Graph) bool {
gEdges, g2Edges := g.Edges(), g2.Edges()
// as a quick check before iterating through the edges, if g has fewer edges
// than g2 then g2 can't possibly be a sub-graph of it
if len(gEdges) < len(g2Edges) {
return false
}
for id := range g2Edges {
if _, ok := gEdges[id]; !ok {
return false
}
}
return true
}
// Equal returns true if g and g2 have exactly the same Edges. Both Graphs
// should be of the same underlying type.
func Equal(g, g2 Graph) bool {
if len(g.Edges()) != len(g2.Edges()) {
return false
}
return SubGraph(g, g2)
}
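SubGraph and Equal both reduce to edge-set containment, with a length comparison as a cheap pre-check. The same shape, sketched standalone with edge IDs as plain strings:

```go
package main

import "fmt"

// subset reports whether every id in small is also in big: the same
// containment check SubGraph performs on edge IDs.
func subset(big, small map[string]bool) bool {
	if len(big) < len(small) { // quick size check before iterating
		return false
	}
	for id := range small {
		if !big[id] {
			return false
		}
	}
	return true
}

// equal is subset plus a length check, mirroring Equal above.
func equal(a, b map[string]bool) bool {
	return len(a) == len(b) && subset(a, b)
}

func main() {
	g := map[string]bool{"a->b": true, "b->c": true}
	g2 := map[string]bool{"a->b": true}
	fmt.Println(subset(g, g2), equal(g, g2)) // true false
}
```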


@ -1,386 +0,0 @@
package graph
import (
"fmt"
. "testing"
"time"
"github.com/mediocregopher/mediocre-go-lib/mrand"
"github.com/mediocregopher/mediocre-go-lib/mtest/massert"
"github.com/mediocregopher/mediocre-go-lib/mtest/mchk"
)
func strV(s string) Value {
return Value{ID: s, V: s}
}
func TestGraph(t *T) {
t.Parallel()
type state struct {
Graph
m map[string]Edge
}
type params struct {
add Edge
del Edge
}
chk := mchk.Checker{
Init: func() mchk.State {
return state{
Graph: Null,
m: map[string]Edge{},
}
},
Next: func(ss mchk.State) mchk.Action {
s := ss.(state)
var p params
if i := mrand.Intn(10); i == 0 && len(s.m) > 0 {
// add edge which is already there
for _, e := range s.m {
p.add = e
break
}
} else if i == 1 {
// delete edge which isn't there
p.del = NewEdge(strV("z"), strV("z"))
} else if i <= 5 {
// add probably new edge
p.add = NewEdge(strV(mrand.Hex(1)), strV(mrand.Hex(1)))
} else {
// probably del edge
p.del = NewEdge(strV(mrand.Hex(1)), strV(mrand.Hex(1)))
}
return mchk.Action{Params: p}
},
Apply: func(ss mchk.State, a mchk.Action) (mchk.State, error) {
s, p := ss.(state), a.Params.(params)
if p.add != nil {
s.Graph = s.Graph.Add(p.add)
s.m[edgeID(p.add)] = p.add
} else {
s.Graph = s.Graph.Del(p.del)
delete(s.m, edgeID(p.del))
}
{ // test GetNodes and Edges methods
nodes := GetNodes(s.Graph)
edges := s.Graph.Edges()
var aa []massert.Assertion
vals := map[string]bool{}
ins, outs := map[string]int{}, map[string]int{}
for _, e := range s.m {
aa = append(aa, massert.Has(edges, e))
aa = append(aa, massert.HasKey(nodes, e.Head().ID))
aa = append(aa, massert.Has(nodes[e.Head().ID].Ins, e))
aa = append(aa, massert.HasKey(nodes, e.Tail().ID))
aa = append(aa, massert.Has(nodes[e.Tail().ID].Outs, e))
vals[e.Head().ID] = true
vals[e.Tail().ID] = true
ins[e.Head().ID]++
outs[e.Tail().ID]++
}
aa = append(aa, massert.Len(edges, len(s.m)))
aa = append(aa, massert.Len(nodes, len(vals)))
for id, node := range nodes {
aa = append(aa, massert.Len(node.Ins, ins[id]))
aa = append(aa, massert.Len(node.Outs, outs[id]))
}
if err := massert.All(aa...).Assert(); err != nil {
return nil, err
}
}
{ // test GetNode and Has. GetNodes has already been tested so we
// can use its returned Nodes as the expected ones
var aa []massert.Assertion
for _, expNode := range GetNodes(s.Graph) {
var naa []massert.Assertion
node, ok := GetNode(s.Graph, expNode.Value)
naa = append(naa, massert.Equal(true, ok))
naa = append(naa, massert.Equal(true, s.Graph.Has(expNode.Value)))
naa = append(naa, massert.Subset(expNode.Ins, node.Ins))
naa = append(naa, massert.Len(node.Ins, len(expNode.Ins)))
naa = append(naa, massert.Subset(expNode.Outs, node.Outs))
naa = append(naa, massert.Len(node.Outs, len(expNode.Outs)))
aa = append(aa, massert.Comment(massert.All(naa...), "v:%q", expNode.ID))
}
_, ok := GetNode(s.Graph, strV("zz"))
aa = append(aa, massert.Equal(false, ok))
aa = append(aa, massert.Equal(false, s.Graph.Has(strV("zz"))))
if err := massert.All(aa...).Assert(); err != nil {
return nil, err
}
}
return s, nil
},
}
if err := chk.RunFor(5 * time.Second); err != nil {
t.Fatal(err)
}
}
func TestSubGraphAndEqual(t *T) {
t.Parallel()
type state struct {
g1, g2 Graph
expEqual, expSubGraph bool
}
type params struct {
e Edge
add1, add2 bool
}
chk := mchk.Checker{
Init: func() mchk.State {
return state{
g1: Null,
g2: Null,
expEqual: true,
expSubGraph: true,
}
},
Next: func(ss mchk.State) mchk.Action {
i := mrand.Intn(10)
p := params{
e: NewEdge(strV(mrand.Hex(4)), strV(mrand.Hex(4))),
add1: i != 0,
add2: i != 1,
}
return mchk.Action{Params: p}
},
Apply: func(ss mchk.State, a mchk.Action) (mchk.State, error) {
s, p := ss.(state), a.Params.(params)
if p.add1 {
s.g1 = s.g1.Add(p.e)
}
if p.add2 {
s.g2 = s.g2.Add(p.e)
}
s.expSubGraph = s.expSubGraph && p.add1
s.expEqual = s.expEqual && p.add1 && p.add2
if SubGraph(s.g1, s.g2) != s.expSubGraph {
return nil, fmt.Errorf("SubGraph expected to return %v", s.expSubGraph)
}
if Equal(s.g1, s.g2) != s.expEqual {
return nil, fmt.Errorf("Equal expected to return %v", s.expEqual)
}
return s, nil
},
MaxLength: 100,
}
if err := chk.RunFor(5 * time.Second); err != nil {
t.Fatal(err)
}
}
func TestDisjoinUnion(t *T) {
t.Parallel()
type state struct {
g Graph
// prefix -> Values with that prefix. contains dupes
valM map[string][]Value
disjM map[string]Graph
}
type params struct {
prefix string
e Edge
}
chk := mchk.Checker{
Init: func() mchk.State {
return state{
g: Null,
valM: map[string][]Value{},
disjM: map[string]Graph{},
}
},
Next: func(ss mchk.State) mchk.Action {
s := ss.(state)
prefix := mrand.Hex(1)
var edge Edge
if vals := s.valM[prefix]; len(vals) == 0 {
edge = NewEdge(
strV(prefix+mrand.Hex(1)),
strV(prefix+mrand.Hex(1)),
)
} else if mrand.Intn(2) == 0 {
edge = NewEdge(
mrand.Element(vals, nil).(Value),
strV(prefix+mrand.Hex(1)),
)
} else {
edge = NewEdge(
strV(prefix+mrand.Hex(1)),
mrand.Element(vals, nil).(Value),
)
}
return mchk.Action{Params: params{prefix: prefix, e: edge}}
},
Apply: func(ss mchk.State, a mchk.Action) (mchk.State, error) {
s, p := ss.(state), a.Params.(params)
s.g = s.g.Add(p.e)
s.valM[p.prefix] = append(s.valM[p.prefix], p.e.Head(), p.e.Tail())
if s.disjM[p.prefix] == nil {
s.disjM[p.prefix] = Null
}
s.disjM[p.prefix] = s.disjM[p.prefix].Add(p.e)
var aa []massert.Assertion
// test Disjoin
disj := Disjoin(s.g)
			for prefix, expGraph := range s.disjM {
				var found bool
				for _, graph := range disj {
					if Equal(graph, expGraph) {
						found = true
						break
					}
				}
				aa = append(aa, massert.Comment(
					massert.Equal(true, found),
					"prefix:%q", prefix,
				))
			}
aa = append(aa, massert.Len(disj, len(s.disjM)))
// now test Join
join := Join(disj...)
aa = append(aa, massert.Equal(true, Equal(s.g, join)))
return s, massert.All(aa...).Assert()
},
MaxLength: 100,
// Each action is required for subsequent ones to make sense, so
// minimizing won't work
DontMinimize: true,
}
if err := chk.RunFor(5 * time.Second); err != nil {
t.Fatal(err)
}
}
func TestVisitBreadth(t *T) {
t.Parallel()
type state struct {
g Graph
// each rank describes the set of values (by ID) which should be
// visited in that rank. Within a rank the values will be visited in any
// order
ranks []map[string]bool
}
thisRank := func(s state) map[string]bool {
return s.ranks[len(s.ranks)-1]
}
prevRank := func(s state) map[string]bool {
return s.ranks[len(s.ranks)-2]
}
randFromRank := func(s state, rankPickFn func(state) map[string]bool) Value {
rank := rankPickFn(s)
rankL := make([]string, 0, len(rank))
for id := range rank {
rankL = append(rankL, id)
}
return strV(mrand.Element(rankL, nil).(string))
}
randNew := func(s state) Value {
for {
v := strV(mrand.Hex(2))
if !s.g.Has(v) {
return v
}
}
}
type params struct {
newRank bool
e Edge
}
chk := mchk.Checker{
Init: func() mchk.State {
return state{
g: Null,
ranks: []map[string]bool{
{"start": true},
{},
},
}
},
Next: func(ss mchk.State) mchk.Action {
s := ss.(state)
var p params
p.newRank = len(thisRank(s)) > 0 && mrand.Intn(10) == 0
if p.newRank {
p.e = NewEdge(
randFromRank(s, thisRank),
randNew(s),
)
} else {
p.e = NewEdge(
randFromRank(s, prevRank),
strV(mrand.Hex(2)),
)
}
return mchk.Action{Params: p}
},
Apply: func(ss mchk.State, a mchk.Action) (mchk.State, error) {
s, p := ss.(state), a.Params.(params)
if p.newRank {
s.ranks = append(s.ranks, map[string]bool{})
}
if !s.g.Has(p.e.Head()) {
thisRank(s)[p.e.Head().ID] = true
}
s.g = s.g.Add(p.e)
// check the visit
var err error
expRanks := s.ranks
currRank := map[string]bool{}
VisitBreadth(s.g, strV("start"), func(n Node) bool {
currRank[n.Value.ID] = true
if len(currRank) != len(expRanks[0]) {
return true
}
if err = massert.Equal(expRanks[0], currRank).Assert(); err != nil {
return false
}
expRanks = expRanks[1:]
currRank = map[string]bool{}
return true
})
if err != nil {
return nil, err
}
err = massert.All(
massert.Len(expRanks, 0),
massert.Len(currRank, 0),
).Assert()
return s, err
},
DontMinimize: true,
}
if err := chk.RunCase(); err != nil {
t.Fatal(err)
}
if err := chk.RunFor(5 * time.Second); err != nil {
t.Fatal(err)
}
}


@ -1,118 +0,0 @@
package lang
import (
"fmt"
"reflect"
"strings"
)
// Commonly used Terms
var (
// Language structure types
AAtom = Atom("atom")
AConst = Atom("const")
ATuple = Atom("tup")
AList = Atom("list")
// Match shortcuts
AUnder = Atom("_")
TDblUnder = Tuple{AUnder, AUnder}
)
// Term is a unit of language which carries some meaning. Some Terms are
// actually comprised of multiple sub-Terms.
type Term interface {
fmt.Stringer // for debugging
// Type returns a Term which describes the type of this Term, i.e. the
// components this Term is comprised of.
Type() Term
}
// Equal returns whether or not two Terms are of equal value
func Equal(t1, t2 Term) bool {
return reflect.DeepEqual(t1, t2)
}
////////////////////////////////////////////////////////////////////////////////
// Atom is a constant with no other meaning than that it can be equal or not
// equal to another Atom.
type Atom string
func (a Atom) String() string {
return string(a)
}
// Type implements the method for Term
func (a Atom) Type() Term {
return AAtom
}
////////////////////////////////////////////////////////////////////////////////
// Const is a constant whose meaning depends on the context in which it is used
type Const string
func (a Const) String() string {
return string(a)
}
// Type implements the method for Term
func (a Const) Type() Term {
return AConst
}
////////////////////////////////////////////////////////////////////////////////
// Tuple is a compound Term of zero or more sub-Terms, each of which may have a
// different Type. Both the length of the Tuple and the Type of each of its
// sub-Terms are components in the Tuple's Type.
type Tuple []Term
func (t Tuple) String() string {
ss := make([]string, len(t))
for i := range t {
ss[i] = t[i].String()
}
return "(" + strings.Join(ss, " ") + ")"
}
// Type implements the method for Term
func (t Tuple) Type() Term {
tt := make(Tuple, len(t))
for i := range t {
tt[i] = t[i].Type()
}
return Tuple{ATuple, tt}
}
////////////////////////////////////////////////////////////////////////////////
type list struct {
typ Term
ll []Term
}
// List is a compound Term of zero or more sub-Terms, each of which must have
// the same Type (the one given as the first argument to this function). Only
// the Type of the sub-Terms is a component in the List's Type.
func List(typ Term, elems ...Term) Term {
return list{
typ: typ,
ll: elems,
}
}
func (l list) String() string {
ss := make([]string, len(l.ll))
for i := range l.ll {
ss[i] = l.ll[i].String()
}
return "[" + strings.Join(ss, " ") + "]"
}
// Type implements the method for Term
func (l list) Type() Term {
return Tuple{AList, l.typ}
}


@ -1,54 +0,0 @@
package lang
import "fmt"
// Match is used to pattern match an arbitrary Term against a pattern. A pattern
// is a 2-tuple of the type (as an atom, e.g. AAtom, AConst) and a matching
// value.
//
// If the value is AUnder the pattern will match all Terms of the type,
// regardless of their value. If the pattern's type and value are both AUnder
// the pattern will match all Terms.
//
// If the pattern's value is a Tuple or a List, each of its elements will be
// used as a sub-pattern to match against the corresponding sub-Term of the
// Term.
func Match(pat Tuple, t Term) bool {
if len(pat) != 2 {
return false
}
pt, pv := pat[0], pat[1]
switch pt {
case AAtom:
a, ok := t.(Atom)
return ok && (Equal(pv, AUnder) || Equal(pv, a))
case AConst:
c, ok := t.(Const)
return ok && (Equal(pv, AUnder) || Equal(pv, c))
case ATuple:
tt, ok := t.(Tuple)
if !ok {
return false
} else if Equal(pv, AUnder) {
return true
}
pvt := pv.(Tuple)
if len(tt) != len(pvt) {
return false
}
for i := range tt {
pvti, ok := pvt[i].(Tuple)
if !ok || !Match(pvti, tt[i]) {
return false
}
}
return true
case AList:
panic("TODO")
case AUnder:
return true
default:
panic(fmt.Sprintf("unknown type %T", pt))
}
}
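The core of the pattern idea above is a (type, value) pair where an underscore matches anything at that position. A deliberately simplified standalone sketch, using flat strings instead of Terms (the recursive Tuple case is omitted):

```go
package main

import "fmt"

// term is a hypothetical flat stand-in for lang.Term: a type tag plus a value.
type term struct {
	typ, val string
}

// match checks t against a (patTyp, patVal) pattern. "_" as the type matches
// any term; "_" as the value matches any term of the given type.
func match(patTyp, patVal string, t term) bool {
	if patTyp == "_" {
		return true
	}
	if t.typ != patTyp {
		return false
	}
	return patVal == "_" || patVal == t.val
}

func main() {
	fmt.Println(match("atom", "foo", term{"atom", "foo"})) // exact match
	fmt.Println(match("atom", "_", term{"atom", "bar"}))   // wildcard value
	fmt.Println(match("atom", "_", term{"const", "bar"}))  // wrong type
	fmt.Println(match("_", "_", term{"const", "x"}))       // full wildcard
}
```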


@ -1,66 +0,0 @@
package lang
import (
. "testing"
"github.com/stretchr/testify/assert"
)
func TestMatch(t *T) {
pat := func(typ, val Term) Tuple {
return Tuple{typ, val}
}
tests := []struct {
pattern Tuple
t Term
exp bool
}{
{pat(AAtom, Atom("foo")), Atom("foo"), true},
{pat(AAtom, Atom("foo")), Atom("bar"), false},
{pat(AAtom, Atom("foo")), Const("foo"), false},
{pat(AAtom, Atom("foo")), Tuple{Atom("a"), Atom("b")}, false},
{pat(AAtom, Atom("_")), Atom("bar"), true},
{pat(AAtom, Atom("_")), Const("bar"), false},
{pat(AConst, Const("foo")), Const("foo"), true},
{pat(AConst, Const("foo")), Atom("foo"), false},
{pat(AConst, Const("foo")), Const("bar"), false},
{pat(AConst, Atom("_")), Const("bar"), true},
{pat(AConst, Atom("_")), Atom("foo"), false},
{
pat(ATuple, Tuple{
pat(AAtom, Atom("foo")),
pat(AAtom, Atom("bar")),
}),
Tuple{Atom("foo"), Atom("bar")},
true,
},
{
pat(ATuple, Tuple{
pat(AAtom, Atom("_")),
pat(AAtom, Atom("bar")),
}),
Tuple{Atom("foo"), Atom("bar")},
true,
},
{
pat(ATuple, Tuple{
pat(AAtom, Atom("_")),
pat(AAtom, Atom("_")),
pat(AAtom, Atom("_")),
}),
Tuple{Atom("foo"), Atom("bar")},
false,
},
{pat(AUnder, AUnder), Atom("foo"), true},
{pat(AUnder, AUnder), Const("foo"), true},
{pat(AUnder, AUnder), Tuple{Atom("a"), Atom("b")}, true},
}
for _, testCase := range tests {
assert.Equal(t, testCase.exp, Match(testCase.pattern, testCase.t), "%#v", testCase)
}
}

main.go (47 lines)

@ -1,47 +0,0 @@
package main
import (
"fmt"
"github.com/mediocregopher/ginger/lang"
"github.com/mediocregopher/ginger/vm"
)
func main() {
mkcmd := func(a lang.Atom, args ...lang.Term) lang.Tuple {
if len(args) == 1 {
return lang.Tuple{a, args[0]}
}
return lang.Tuple{a, lang.Tuple(args)}
}
mkint := func(i string) lang.Tuple {
return lang.Tuple{vm.Int, lang.Const(i)}
}
//foo := lang.Atom("foo")
//tt := []lang.Term{
// mkcmd(vm.Assign, foo, mkint("1")),
// mkcmd(vm.Add, mkcmd(vm.Tuple, mkcmd(vm.Var, foo), mkint("2"))),
//}
foo := lang.Atom("foo")
bar := lang.Atom("bar")
baz := lang.Atom("baz")
tt := []lang.Term{
mkcmd(vm.Assign, foo, mkcmd(vm.Tuple, mkint("1"), mkint("2"))),
mkcmd(vm.Assign, bar, mkcmd(vm.Add, mkcmd(vm.Var, foo))),
mkcmd(vm.Assign, baz, mkcmd(vm.Add, mkcmd(vm.Var, foo))),
mkcmd(vm.Add, mkcmd(vm.Tuple, mkcmd(vm.Var, bar), mkcmd(vm.Var, baz))),
}
mod, err := vm.Build(tt...)
if err != nil {
panic(err)
}
defer mod.Dispose()
mod.Dump()
out, err := mod.Run()
fmt.Printf("\n\n########\nout: %v %v\n", out, err)
}


@ -1,39 +0,0 @@
package list
import "fmt"
/*
Notes:
+ the size field isn't strictly necessary unless O(1) Len is wanted
+ append doesn't play well with stack allocation
*/
type List struct {
// in practice this would be a constant size, with the compiler knowing the
// size
underlying []int
head, size int
}
func New(ii ...int) List {
l := List{
		underlying: make([]int, len(ii)),
size: len(ii),
}
copy(l.underlying, ii)
return l
}
func (l List) Len() int {
return l.size
}
func (l List) HeadTail() (int, List) {
if l.size == 0 {
panic(fmt.Sprintf("can't take HeadTail of empty list"))
}
return l.underlying[l.head], List{
underlying: l.underlying,
head: l.head + 1,
size: l.size - 1,
}
}


@ -1,280 +0,0 @@
package vm
import (
"errors"
"fmt"
"strconv"
"github.com/mediocregopher/ginger/lang"
"llvm.org/llvm/bindings/go/llvm"
)
type op interface {
inType() valType
outType() valType
build(*Module) (llvm.Value, error)
}
type valType struct {
term lang.Term
llvm llvm.Type
}
func (vt valType) isInt() bool {
return lang.Equal(Int, vt.term)
}
func (vt valType) eq(vt2 valType) bool {
return lang.Equal(vt.term, vt2.term) && vt.llvm == vt2.llvm
}
// primitive valTypes
var (
valTypeVoid = valType{term: lang.Tuple{}, llvm: llvm.VoidType()}
valTypeInt = valType{term: Int, llvm: llvm.Int64Type()}
)
////////////////////////////////////////////////////////////////////////////////
// most types don't have an input, so we use this as a shortcut
type voidIn struct{}
func (voidIn) inType() valType {
return valTypeVoid
}
////////////////////////////////////////////////////////////////////////////////
type intOp struct {
voidIn
c lang.Const
}
func (io intOp) outType() valType {
return valTypeInt
}
func (io intOp) build(mod *Module) (llvm.Value, error) {
ci, err := strconv.ParseInt(string(io.c), 10, 64)
if err != nil {
return llvm.Value{}, err
}
return llvm.ConstInt(llvm.Int64Type(), uint64(ci), false), nil
}
////////////////////////////////////////////////////////////////////////////////
type tupOp struct {
voidIn
els []op
}
func (to tupOp) outType() valType {
termTypes := make(lang.Tuple, len(to.els))
llvmTypes := make([]llvm.Type, len(to.els))
for i := range to.els {
elValType := to.els[i].outType()
termTypes[i] = elValType.term
llvmTypes[i] = elValType.llvm
}
vt := valType{term: lang.Tuple{Tuple, termTypes}}
if len(llvmTypes) == 0 {
vt.llvm = llvm.VoidType()
} else {
vt.llvm = llvm.StructType(llvmTypes, false)
}
return vt
}
func (to tupOp) build(mod *Module) (llvm.Value, error) {
str := llvm.Undef(to.outType().llvm)
var val llvm.Value
var err error
for i := range to.els {
if val, err = to.els[i].build(mod); err != nil {
return llvm.Value{}, err
}
str = mod.b.CreateInsertValue(str, val, i, "")
}
return str, err
}
////////////////////////////////////////////////////////////////////////////////
type tupElOp struct {
voidIn
tup op
i int
}
func (teo tupElOp) outType() valType {
tupType := teo.tup.outType()
return valType{
llvm: tupType.llvm.StructElementTypes()[teo.i],
		term: tupType.term.(lang.Tuple)[1].(lang.Tuple)[teo.i],
}
}
func (teo tupElOp) build(mod *Module) (llvm.Value, error) {
if to, ok := teo.tup.(tupOp); ok {
return to.els[teo.i].build(mod)
}
tv, err := teo.tup.build(mod)
if err != nil {
return llvm.Value{}, err
}
return mod.b.CreateExtractValue(tv, teo.i, ""), nil
}
////////////////////////////////////////////////////////////////////////////////
type varOp struct {
op
v llvm.Value
built bool
}
func (vo *varOp) build(mod *Module) (llvm.Value, error) {
if !vo.built {
var err error
if vo.v, err = vo.op.build(mod); err != nil {
return llvm.Value{}, err
}
vo.built = true
}
return vo.v, nil
}
type varCtx map[string]*varOp
func (c varCtx) assign(name string, vo *varOp) error {
if _, ok := c[name]; ok {
return fmt.Errorf("var %q already assigned", name)
}
c[name] = vo
return nil
}
func (c varCtx) get(name string) (*varOp, error) {
if o, ok := c[name]; ok {
return o, nil
}
return nil, fmt.Errorf("var %q referenced before assignment", name)
}
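varCtx enforces single assignment: assigning a name twice or reading a name before it is assigned are both errors. The same shape stands alone (int values and error strings are illustrative assumptions):

```go
package main

import "fmt"

// ctx is a single-assignment variable table, like varCtx above.
type ctx map[string]int

func (c ctx) assign(name string, v int) error {
	if _, ok := c[name]; ok {
		return fmt.Errorf("var %q already assigned", name)
	}
	c[name] = v
	return nil
}

func (c ctx) get(name string) (int, error) {
	if v, ok := c[name]; ok {
		return v, nil
	}
	return 0, fmt.Errorf("var %q referenced before assignment", name)
}

func main() {
	c := ctx{}
	fmt.Println(c.assign("foo", 1)) // nil: first assignment succeeds
	fmt.Println(c.assign("foo", 2)) // error: already assigned
	fmt.Println(c.get("bar"))       // error: referenced before assignment
}
```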
////////////////////////////////////////////////////////////////////////////////
type addOp struct {
voidIn
a, b op
}
func (ao addOp) outType() valType {
return ao.a.outType()
}
func (ao addOp) build(mod *Module) (llvm.Value, error) {
av, err := ao.a.build(mod)
if err != nil {
return llvm.Value{}, err
}
bv, err := ao.b.build(mod)
if err != nil {
return llvm.Value{}, err
}
return mod.b.CreateAdd(av, bv, ""), nil
}
////////////////////////////////////////////////////////////////////////////////
func termToOp(ctx varCtx, t lang.Term) (op, error) {
aPat := func(a lang.Atom) lang.Tuple {
return lang.Tuple{lang.AAtom, a}
}
cPat := func(t lang.Term) lang.Tuple {
return lang.Tuple{lang.AConst, t}
}
tPat := func(el ...lang.Term) lang.Tuple {
return lang.Tuple{Tuple, lang.Tuple(el)}
}
if !lang.Match(tPat(aPat(lang.AUnder), lang.TDblUnder), t) {
return nil, fmt.Errorf("term %v does not look like a vm command", t)
}
k := t.(lang.Tuple)[0].(lang.Atom)
v := t.(lang.Tuple)[1]
// for when v is a Tuple argument, convenience function for casting
vAsTup := func(n int) ([]op, error) {
vop, err := termToOp(ctx, v)
if err != nil {
return nil, err
}
ops := make([]op, n)
for i := range ops {
ops[i] = tupElOp{tup: vop, i: i}
}
return ops, nil
}
switch k {
case Int:
if !lang.Match(cPat(lang.AUnder), v) {
return nil, errors.New("int requires constant arg")
}
return intOp{c: v.(lang.Const)}, nil
case Tuple:
if !lang.Match(lang.Tuple{Tuple, lang.AUnder}, v) {
return nil, errors.New("tup requires tuple arg")
}
tup := v.(lang.Tuple)
tc := tupOp{els: make([]op, len(tup))}
var err error
for i := range tup {
if tc.els[i], err = termToOp(ctx, tup[i]); err != nil {
return nil, err
}
}
return tc, nil
case Var:
if !lang.Match(aPat(lang.AUnder), v) {
return nil, errors.New("var requires atom arg")
}
name := v.(lang.Atom).String()
return ctx.get(name)
case Assign:
if !lang.Match(tPat(tPat(aPat(Var), aPat(lang.AUnder)), lang.TDblUnder), v) {
return nil, errors.New("assign requires 2-tuple arg, the first being a var")
}
tup := v.(lang.Tuple)
name := tup[0].(lang.Tuple)[1].String()
o, err := termToOp(ctx, tup[1])
if err != nil {
return nil, err
}
vo := &varOp{op: o}
if err := ctx.assign(name, vo); err != nil {
return nil, err
}
return vo, nil
// Add is special in some way, I think it's a function not a compiler op,
// not sure yet though
case Add:
els, err := vAsTup(2)
if err != nil {
return nil, err
} else if !els[0].outType().eq(valTypeInt) {
return nil, errors.New("add args must be numbers of the same type")
} else if !els[1].outType().eq(valTypeInt) {
return nil, errors.New("add args must be numbers of the same type")
}
return addOp{a: els[0], b: els[1]}, nil
default:
return nil, fmt.Errorf("op %v unknown, or its args are malformed", t)
}
}

vm/vm.go (129 lines)

@ -1,129 +0,0 @@
package vm
import (
"errors"
"fmt"
"sync"
"github.com/mediocregopher/ginger/lang"
"llvm.org/llvm/bindings/go/llvm"
)
// Types supported by the vm in addition to those which are part of lang
var (
Atom = lang.AAtom
Tuple = lang.ATuple
Int = lang.Atom("int")
)
// Ops supported by the vm
var (
Add = lang.Atom("add")
Assign = lang.Atom("assign")
Var = lang.Atom("var")
)
////////////////////////////////////////////////////////////////////////////////
// Module contains a compiled set of code which can be run, dumped in IR form,
// or compiled. A Module should be Dispose()'d of once it's no longer being
// used.
type Module struct {
b llvm.Builder
m llvm.Module
ctx varCtx
mainFn llvm.Value
}
var initOnce sync.Once
// Build creates a new Module by compiling the given Terms as code
// TODO only take in a single Term, implement List and use that with a do op
func Build(tt ...lang.Term) (*Module, error) {
initOnce.Do(func() {
llvm.LinkInMCJIT()
llvm.InitializeNativeTarget()
llvm.InitializeNativeAsmPrinter()
})
mod := &Module{
b: llvm.NewBuilder(),
m: llvm.NewModule(""),
ctx: varCtx{},
}
var err error
if mod.mainFn, err = mod.buildFn(tt...); err != nil {
mod.Dispose()
return nil, err
}
if err := llvm.VerifyModule(mod.m, llvm.ReturnStatusAction); err != nil {
mod.Dispose()
return nil, fmt.Errorf("could not verify module: %s", err)
}
return mod, nil
}
// Dispose cleans up all resources held by the Module
func (mod *Module) Dispose() {
// TODO this panics for some reason...
//mod.m.Dispose()
//mod.b.Dispose()
}
// TODO make this return a val once we get function types
func (mod *Module) buildFn(tt ...lang.Term) (llvm.Value, error) {
if len(tt) == 0 {
return llvm.Value{}, errors.New("function cannot be empty")
}
ops := make([]op, len(tt))
var err error
for i := range tt {
if ops[i], err = termToOp(mod.ctx, tt[i]); err != nil {
return llvm.Value{}, err
}
}
var llvmIns []llvm.Type
if in := ops[0].inType(); in.llvm.TypeKind() == llvm.VoidTypeKind {
llvmIns = []llvm.Type{}
} else {
llvmIns = []llvm.Type{in.llvm}
}
llvmOut := ops[len(ops)-1].outType().llvm
fn := llvm.AddFunction(mod.m, "", llvm.FunctionType(llvmOut, llvmIns, false))
block := llvm.AddBasicBlock(fn, "")
mod.b.SetInsertPoint(block, block.FirstInstruction())
var out llvm.Value
for i := range ops {
if out, err = ops[i].build(mod); err != nil {
return llvm.Value{}, err
}
}
mod.b.CreateRet(out)
return fn, nil
}
// Dump dumps the Module's IR to stdout
func (mod *Module) Dump() {
mod.m.Dump()
}
// Run executes the Module
// TODO input and output?
func (mod *Module) Run() (interface{}, error) {
engine, err := llvm.NewExecutionEngine(mod.m)
if err != nil {
return nil, err
}
defer engine.Dispose()
funcResult := engine.RunFunction(mod.mainFn, []llvm.GenericValue{})
defer funcResult.Dispose()
return funcResult.Int(false), nil
}


@ -1,84 +0,0 @@
package vm
import (
. "testing"
"github.com/mediocregopher/ginger/lang"
)
func TestCompiler(t *T) {
mkcmd := func(a lang.Atom, args ...lang.Term) lang.Tuple {
// TODO a 1-tuple should be the same as its element?
if len(args) == 1 {
return lang.Tuple{a, args[0]}
}
return lang.Tuple{a, lang.Tuple(args)}
}
mkint := func(i string) lang.Tuple {
return lang.Tuple{Int, lang.Const(i)}
}
type test struct {
in []lang.Term
exp uint64
}
one := mkint("1")
two := mkint("2")
foo := mkcmd(Var, lang.Atom("foo"))
bar := mkcmd(Var, lang.Atom("bar"))
baz := mkcmd(Var, lang.Atom("baz"))
tests := []test{
{
in: []lang.Term{one},
exp: 1,
},
{
in: []lang.Term{
mkcmd(Add, mkcmd(Tuple, one, two)),
},
exp: 3,
},
{
in: []lang.Term{
mkcmd(Assign, foo, one),
mkcmd(Add, mkcmd(Tuple, foo, two)),
},
exp: 3,
},
{
in: []lang.Term{
mkcmd(Assign, foo, mkcmd(Tuple, one, two)),
mkcmd(Add, foo),
},
exp: 3,
},
{
in: []lang.Term{
mkcmd(Assign, foo, mkcmd(Tuple, one, two)),
mkcmd(Assign, bar, mkcmd(Add, foo)),
mkcmd(Assign, baz, mkcmd(Add, foo)),
mkcmd(Add, mkcmd(Tuple, bar, baz)),
},
exp: 6,
},
}
for _, test := range tests {
t.Logf("testing program: %v", test.in)
mod, err := Build(test.in...)
if err != nil {
t.Fatalf("building failed: %s", err)
}
out, err := mod.Run()
if err != nil {
mod.Dump()
t.Fatalf("running failed: %s", err)
} else if out != test.exp {
mod.Dump()
t.Fatalf("expected result %T:%v, got %T:%v", test.exp, test.exp, out, out)
}
}
}