ginger/README.md
2016-08-05 11:42:07 -06:00

194 lines
6.6 KiB
Markdown

# Ginger - I'll get it right this time
## A note on compile-time vs runtime
Ginger is a language whose primary purpose is to be able to describe and compile
itself. A consequence of this is that it's difficult to describe the actual
process by which compiling is done without first describing the built-in types,
but it's also hard to describe the built-in types without first describing the
process by which compiling is done. So I'm going to do one, then the other, and
I ask you to please bear with me.
## The primitive types
Ginger is a language which encompasses itself. That means amongst the "normal"
primitives a language is expected to have it also has a couple which are used
for macros (which other languages would not normally expose outside of the
compiler's implementation).
```
// These are numbers
0
1
2
3
// These are strings
"hello"
"world"
"how are you?"
// These are identifiers. Values at runtime are bound to
// identifiers, such that whenever an identifier is used in a non-macro
// statement that value will be replaced with it
foo
barBaz
biz_buz
// These are macro identifiers. They are like identifiers, except they start
// with percent signs, and they represent operations or values which only exist
// at compile-time There are a number of builtin macros, but they can also be
// user-defined. We'll see more of them later
%foo
%barBaz
%biz_buz
```
## The data structures
Like the primitives, ginger has few built-in data structures, and the ones it
does have are primarily used to implement itself.
```
// These are tuples. Each is a unique and different type, based on its number of
// elements, the type of each element, and the order those types are in. The
// type of a tuple must be known at compile-time
1, 2
4, "foo", 5
// These are arrays. Their elements must be of the same type, but their length
// can be dynamically determined at runtime. The type of an array is only
// determined by the type of its elements, which must be known at compile-time
[1, 2, 3]
["a", "b", "c"]
// These are statements. A statement is a pair of things, the first being a
// macro identifier, the second being an argument (usually a tuple).
%add 1,2
%incr 1
```
There is a final data structure, called a block, which I haven't come up with a
special sytax for yet, and will be discussed later.
## Parenthesis
A pair of parenthesis can be used to enclose any type for clarity. For example:
```
// No parenthesis
%add 1, 2
// Parenthesis around the argument (the tuple)
%add (1, 2)
// Parenthsis around the statement
(%add 1, 2)
// Parenthesis around everything
(%add (1, 2))
```
## Compilation
Ginger programs are taken as a list of statements (as in, the primitive types
we've defined already).
During compilation each statement is looked at, first its arguments then its
operator. The arguments are "resolved" first, so that they have only primitive
types that aren't macros, statements or blocks. Then that is passed into the
macro operator which may output a further statement, or may change something in
the context of compilation (e.g. the set of identifier bindings), or both. This
is done until the statement contains no more macros to run, at which point the
process repeats at the next statement.
### Example
It's difficult to see this without examples, imo. So here's some example code,
with explanatory comments:
```
// %bind is a macro, it takes in a tuple of an identifier and some value, and
// assigns that value to the identifier at runtime
%bind a, 1
// %add takes in a tuple of numbers or identifiers and will return the sum of
// them. Here we're then binding that sum (3) to the identifier b.
%bind b, (%add a, 2)
// The previous two example are fairly simple, but do something subtle. A ginger
// program starts as a list of statements, and must continue to be a list of
// statements after all macros are run. Each of the above is a macro statement
// which returns a "runtime statement", i.e. a construct representing something
// which will happen at runtime. But they are of type `statement` nonetheless,
// so running these macros does not change the overall type of the program (a
// list of statements)
// Creates an identifier c and returns it. This can't be included at this point,
// because it doesn't return a statement of any sort.
// %ident "c"
// This first creates an identifier a, which is then part of a tuple (a, 2).
// This tuple is used in a further tuple, now (%add, (a, 2)). Remember, %add is
// simply a macro identifier at this point, it's not actually "run" because it's
// part of a tuple, not a statement, and as such can be passed around like any
// other primitive type.
//
// Finally, the tuple (%add, (a, 2)) is passed into %stmt, which creates a new
// statement from a tuple of an operation and an argument. So the statement
// (%add a, 2) is returned. Since this statement still has a macro, %add, that
// is then called, and it finally returns a runtime statement which adds the
// value a is bound to to 2>
%stmt %add, (%ident "a", 2)
// This is exactly equivalent to the above statement, except that it skips some
// redundant macro processing. They can be used interchangeably in all cases and
// situations.
%add a, 2
```
## Blocks
Thus far we've only been able to create code linearly, without much way to do
code-reuse or scoping or anything like that.
Blocks fix this. A block is composed of three lists:
- A list of identifiers which will be "imported" from the parent block (the top
level list of list of statements is itself a block, psych!).
- A list of statements
- A list of identifiers which will be "exported" from the block into the parent
There is not yet a special syntax for blocks, but there is a macro operator to
make them, much like the ones for statements and identifiers:
```
%bind a, 2
%do (%block [a], [
%bind b, (%add a, 3)
], [b])
%println b // prints 5
```
In the above we create a block which imports the `a` identifier, and exports the
`b` identifier that it creates internally. Note that we have to use `%do`
in order to actually "run" the block, since `%block` merely returns the block
structure, which is not a statement.
This seems kind of like a pain, and not much like a function. But combined with
other macros blocks can be used to implement your own function dispatch, so you
can add in variadic, defaults, named parameters, as well as implement closures,
type methods, and so forth, as needed and in the style desired.
## Final note
Keep in mind: blocks, statements, etc... are themselves data structures, and
given appropriate built-in macros they can be manipulated like any other data
structure. These are merely the building blocks for all other language features
(hopefully).