
Ginger - I'll get it right this time

A note on compile-time vs runtime

Ginger is a language whose primary purpose is to be able to describe and compile itself. A consequence of this is that it's difficult to describe the actual process by which compiling is done without first describing the built-in types, but it's also hard to describe the built-in types without first describing the process by which compiling is done. So I'm going to do one, then the other, and I ask you to please bear with me.

The primitive types

Ginger is a language which encompasses itself. That means that, alongside the "normal" primitives a language is expected to have, it also has a couple which are used for macros (which other languages would not normally expose outside of the compiler's implementation).

// These are numbers
0
1
2
3

// These are strings
"hello"
"world"
"how are you?"

// These are identifiers. Values are bound to identifiers at runtime, such that
// whenever an identifier is used in a non-macro statement the identifier will
// be replaced with its value
foo
barBaz
biz_buz

// These are macro identifiers. They are like identifiers, except they start
// with percent signs, and they represent operations or values which only exist
// at compile-time. There are a number of built-in macros, but they can also be
// user-defined. We'll see more of them later
%foo
%barBaz
%biz_buz
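
To make the four primitives a bit more concrete, here is a minimal Go sketch of how a compiler could tell them apart. This is not how Ginger itself does it: the names Kind, Term and ParseIdent are made up for illustration, and the only rule taken from the text above is that a macro identifier starts with a percent sign and only exists at compile-time.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind enumerates the four primitive kinds described above.
// (Hypothetical names, not taken from the Ginger source.)
type Kind int

const (
	Number Kind = iota
	String
	Identifier
	MacroIdentifier
)

// Term is an assumed representation of a single primitive value.
type Term struct {
	Kind Kind
	Num  int64  // set when Kind == Number
	Str  string // string contents, or the identifier name without any '%'
}

// CompileTimeOnly reports whether the term can only exist at compile-time.
func (t Term) CompileTimeOnly() bool { return t.Kind == MacroIdentifier }

// ParseIdent classifies a bare word as an identifier or a macro identifier,
// based only on the leading percent sign.
func ParseIdent(word string) Term {
	if strings.HasPrefix(word, "%") {
		return Term{Kind: MacroIdentifier, Str: word[1:]}
	}
	return Term{Kind: Identifier, Str: word}
}

func main() {
	for _, w := range []string{"foo", "%foo"} {
		t := ParseIdent(w)
		fmt.Printf("%s compile-time only: %v\n", w, t.CompileTimeOnly())
	}
}
```

Running it reports false for foo and true for %foo.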

The data structures

Like the primitives, Ginger has few built-in data structures, and the ones it does have are primarily used to implement itself.

// These are tuples. Each is a unique and different type, based on its number of
// elements, the type of each element, and the order those types are in. The
// type of a tuple must be known at compile-time
1, 2
4, "foo", 5

// These are arrays. Their elements must be of the same type, but their length
// can be dynamically determined at runtime. The type of an array is only
// determined by the type of its elements, which must be known at compile-time
[1, 2, 3]
["a", "b", "c"]

// These are statements. A statement is a pair of things, the first being a
// macro identifier, the second being an argument (usually a tuple).
%add 1,2
%incr 1

There is a final data structure, called a block, which doesn't have a special syntax yet. It will be discussed later.
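
The typing rules for tuples and arrays can be sketched in Go as follows. This is only an illustration under my own assumptions: Type, Prim, TupleType, ArrayType and Same are hypothetical names, and comparing canonical strings is just a simple way to check type identity in a sketch. It shows that a tuple's type depends on the number, type and order of its elements, while an array's type depends only on its element type.

```go
package main

import (
	"fmt"
	"strings"
)

// Type is an assumed compile-time type descriptor.
type Type interface{ String() string }

// Prim is a primitive type such as "number" or "string".
type Prim string

func (p Prim) String() string { return string(p) }

// TupleType is defined by the number, type and order of its elements.
type TupleType []Type

func (t TupleType) String() string {
	parts := make([]string, len(t))
	for i, el := range t {
		parts[i] = el.String()
	}
	return "(" + strings.Join(parts, ", ") + ")"
}

// ArrayType is defined only by its element type; length is a runtime matter.
type ArrayType struct{ Elem Type }

func (a ArrayType) String() string { return "[" + a.Elem.String() + "]" }

// Same reports whether two types are identical by comparing their
// canonical string forms (a shortcut that is good enough for a sketch).
func Same(a, b Type) bool { return a.String() == b.String() }

func main() {
	num, str := Prim("number"), Prim("string")

	pair := TupleType{num, num}        // type of: 1, 2
	triple := TupleType{num, str, num} // type of: 4, "foo", 5
	swapped := TupleType{str, num, num}

	fmt.Println(pair, triple)
	fmt.Println(Same(triple, swapped))                // order matters
	fmt.Println(Same(ArrayType{num}, ArrayType{num})) // [1, 2, 3] vs any other number array
}
```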

Parentheses

A pair of parentheses can be used to enclose any type for clarity. For example:

// No parentheses
%add 1, 2

// Parentheses around the argument (the tuple)
%add (1, 2)

// Parentheses around the statement
(%add 1, 2)

// Parentheses around everything
(%add (1, 2))

Compilation

Ginger programs are taken as a list of statements (as in, the statement data structure we've defined already).

During compilation each statement is looked at in turn: first its arguments, then its operator. The arguments are "resolved" first, so that they have only primitive types that aren't macros, statements or blocks. The resolved arguments are then passed into the macro operator, which may output a further statement, or may change something in the context of compilation (e.g. the set of identifier bindings), or both. This is repeated until the statement contains no more macros to run, at which point the process repeats at the next statement.

Example

It's difficult to see this without examples, imo. So here's some example code, with explanatory comments:

// %bind is a macro. It takes in a tuple of an identifier and some value, and
// assigns that value to the identifier at runtime
%bind a, 1

// %add takes in a tuple of numbers or identifiers and will return the sum of
// them. Here we're then binding that sum (3) to the identifier b.
%bind b, (%add a, 2)

// The previous two examples are fairly simple, but do something subtle. A Ginger
// program starts as a list of statements, and must continue to be a list of
// statements after all macros are run. Each of the above is a macro statement
// which returns a "runtime statement", i.e. a construct representing something
// which will happen at runtime. But they are of type `statement` nonetheless,
// so running these macros does not change the overall type of the program (a
// list of statements)

// Creates an identifier c and returns it. This can't be included at this point,
// because it doesn't return a statement of any sort.
// %ident "c"

// This first creates an identifier a, which is then part of a tuple (a, 2).
// This tuple is used in a further tuple, now (%add, (a, 2)). Remember, %add is
// simply a macro identifier at this point, it's not actually "run" because it's
// part of a tuple, not a statement, and as such can be passed around like any
// other primitive type.
//
// Finally, the tuple (%add, (a, 2)) is passed into %stmt, which creates a new
// statement from a tuple of an operation and an argument. So the statement
// (%add a, 2) is returned. Since this statement still has a macro, %add, that
// is then called, and it finally returns a runtime statement which adds 2 to
// the value a is bound to.
%stmt %add, (%ident "a", 2)

// This is exactly equivalent to the above statement, except that it skips some
// redundant macro processing. They can be used interchangeably in all cases and
// situations.
%add a, 2
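
The equivalence of those last two statements can be checked with a small Go model. As before, this is a sketch under my own assumptions and not Ginger's actual code: the Term types, the macros table and the string used for the runtime statement are all invented for illustration. The point it shows is that a macro identifier sitting inside a tuple is just data, and only gets run once %stmt turns the tuple into a statement.

```go
package main

import "fmt"

// All of these types are hypothetical stand-ins for Ginger's terms.
type Term interface{}

type (
	Num   int
	Str   string
	Ident string
	Macro string // a macro identifier being passed around as plain data
	Tuple []Term
	Stmt  struct {
		Op  Macro
		Arg Term
	}
	Runtime string // stand-in for a "runtime statement"
)

var macros = map[Macro]func(arg Term) Term{
	// %ident turns a string into an identifier.
	"ident": func(arg Term) Term { return Ident(arg.(Str)) },
	// %stmt builds a statement from a tuple of (operation, argument).
	"stmt": func(arg Term) Term {
		t := arg.(Tuple)
		return Stmt{Op: t[0].(Macro), Arg: t[1]}
	},
	"add": func(arg Term) Term {
		t := arg.(Tuple)
		return Runtime(fmt.Sprintf("add(%v, %v)", t[0], t[1]))
	},
}

// resolve runs nested statements; a bare Macro inside a tuple is left alone.
func resolve(t Term) Term {
	switch v := t.(type) {
	case Tuple:
		out := make(Tuple, len(v))
		for i, el := range v {
			out[i] = resolve(el)
		}
		return out
	case Stmt:
		return run(v)
	}
	return t
}

// run keeps applying macros while the result is still a statement.
func run(s Stmt) Term {
	for {
		fn, ok := macros[s.Op]
		if !ok {
			panic("unknown macro %" + string(s.Op))
		}
		out := fn(resolve(s.Arg))
		next, ok := out.(Stmt)
		if !ok {
			return out
		}
		s = next
	}
}

// longForm models: %stmt %add, (%ident "a", 2)
func longForm() Term {
	return run(Stmt{Op: "stmt", Arg: Tuple{
		Macro("add"),
		Tuple{Stmt{Op: "ident", Arg: Str("a")}, Num(2)},
	}})
}

// shortForm models: %add a, 2
func shortForm() Term {
	return run(Stmt{Op: "add", Arg: Tuple{Ident("a"), Num(2)}})
}

func main() {
	fmt.Println(longForm())
	fmt.Println(shortForm())
}
```

Both lines print the same runtime statement; the long form just takes two extra macro steps to get there.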

Blocks

Thus far we've only been able to create code linearly, without much way to do code-reuse or scoping or anything like that.

Blocks fix this. A block is composed of three lists:

  • A list of identifiers which will be "imported" from the parent block (the top level list of statements is itself a block, psych!).

  • A list of statements

  • A list of identifiers which will be "exported" from the block into the parent

There is not yet a special syntax for blocks, but there is a macro operator to make them, much like the ones for statements and identifiers:

%bind a, 2

%do (%block [a], [
    %bind b, (%add a, 3)
], [b])

%println b // prints 5

In the above we create a block which imports the a identifier, and exports the b identifier that it creates internally. Note that we have to use %do in order to actually "run" the block, since %block merely returns the block structure, which is not a statement.
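
The import/export behavior can be modeled in a few lines of Go. This is a deliberately simplified sketch: it evaluates values directly instead of compiling anything, statements are modeled as plain Go functions, and the names Scope, Block and Do are my own, not Ginger's. It only tries to show the scoping rule: the block sees just what it imports, and the parent sees just what the block exports.

```go
package main

import "fmt"

// Scope maps identifier names to values (numbers only, to keep it short).
type Scope map[string]int

// Stmt is modeled as a function that reads and writes a scope.
// This is a simplification; real statements are data, not Go functions.
type Stmt func(s Scope) error

// Block mirrors the three lists described above.
type Block struct {
	Imports []string
	Body    []Stmt
	Exports []string
}

// Do runs a block against its parent scope, like the %do macro in the example.
func Do(parent Scope, b Block) error {
	inner := Scope{}
	for _, name := range b.Imports {
		v, ok := parent[name]
		if !ok {
			return fmt.Errorf("cannot import unbound identifier %q", name)
		}
		inner[name] = v
	}
	for _, stmt := range b.Body {
		if err := stmt(inner); err != nil {
			return err
		}
	}
	for _, name := range b.Exports {
		v, ok := inner[name]
		if !ok {
			return fmt.Errorf("cannot export unbound identifier %q", name)
		}
		parent[name] = v
	}
	return nil
}

// example models: %bind a, 2 followed by a block that imports a,
// binds b to a + 3 (via a temporary), and exports only b.
func example() (Scope, error) {
	top := Scope{"a": 2}
	blk := Block{
		Imports: []string{"a"},
		Body: []Stmt{
			func(s Scope) error { s["tmp"] = 3; return nil },
			func(s Scope) error { s["b"] = s["a"] + s["tmp"]; return nil },
		},
		Exports: []string{"b"},
	}
	return top, Do(top, blk)
}

func main() {
	top, err := example()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	_, leaked := top["tmp"]
	fmt.Println(top["b"], leaked) // prints: 5 false
}
```

The temporary tmp never reaches the parent scope because it isn't in the export list, which is the scoping that plain top-level statements can't give you.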

This seems kind of like a pain, and not much like a function. But combined with other macros, blocks can be used to implement your own function dispatch, so you can add in variadic arguments, default values, and named parameters, as well as implement closures, type methods, and so forth, as needed and in the style desired.

Final note

Keep in mind: blocks, statements, etc. are themselves data structures, and given appropriate built-in macros they can be manipulated like any other data structure. These are merely the building blocks for all other language features (hopefully).