# SPEC
This document describes the dehub protocol. It assumes that the reader is
familiar with git, both conceptually
and in practical use of the git tool. All references to a git-specific concept
retain their meaning; dehub concepts build upon git concepts, but do not
override them.
## Project {#project}
A dehub project consists of:
* A collection of files and directories.
* Meta actions related to those files, e.g. discussion, proposed changes, etc.
* Configuration defining which meta actions are allowed under which
circumstances.
All of these components are housed in a git repository. A dehub project does not
require a central repository location (a "remote"), though it may use one if
desired.
## Commit Payload {#payload}
All commits in a dehub [project](#project) contain a payload. The payload is
encoded into the commit message as a YAML object. Here is the general structure
of a commit message containing a payload:
```
Human readable message head
---
# Three dashes indicate the start of the yaml body.
type: type of the payload # Always required
fingerprint: std-base-64 string # Always required
credentials: [...] # Not required but usually present
type_specific_field_a: valueA
type_specific_field_b: valueB
```
The message head is a human readable description of what is being committed, and
is terminated at the first newline. Everything after the message head must be
valid YAML which encodes the payload.
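As a rough illustration of the layout above, the head and payload can be separated at the first newline. This is a minimal sketch, not part of the protocol: the function name `split_commit_message` and the sample message are hypothetical, and the returned payload text still begins with the `---` line, which a YAML parser treats as a document start marker.

```python
from typing import Tuple


def split_commit_message(message: str) -> Tuple[str, str]:
    """Split a commit message into (message head, YAML payload text).

    The head ends at the first newline; everything after it is the payload.
    """
    head, newline, payload_text = message.partition("\n")
    if not newline or not payload_text.strip():
        raise ValueError("commit message has no payload after the message head")
    return head, payload_text


# Hypothetical commit message, used only for illustration.
head, payload_text = split_commit_message(
    "Say hello\n---\ntype: comment\nfingerprint: AAAA\ncomment: hello\n"
)
```

The payload text can then be handed to any YAML parser to obtain the payload object.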
### Fingerprint {#fingerprint}
Each [payload](#payload) object contains a `fingerprint` field. The fingerprint
is an opaque byte string encoded using standard base-64. The algorithm used to
generate the fingerprint will depend on the payload type, and can be found in
each type's sub-section in this document.
### Credential {#credential}
The `credentials` field is not required, but in practice will be found on almost
every [payload](#payload). The field's value will be an array of credential
objects. Only one credential type is currently supported, `pgp_signature`:
```yaml
type: pgp_signature
# One of these fields is required. If account_id is present, it relates the
# signature to a pgp_public_key signifier defined for that account in the config
# (see the Signifier sub-section). Otherwise, the public key will be included in
# the credential itself as the value of pub_key_body.
account_id: some_user_id # Optional
pub_key_body: inlined ASCII-armored pgp public key
# the ID (pgp fingerprint) of the key used to generate the signature
pub_key_id: XXX
# a signature of the payload's unencoded fingerprint, encoded using standard
# base-64
body: std-base-64 signature
```
### Payload Types {#payload-types}
#### Change Payload {#change-payload}
A change [payload](#payload) encompasses a set of changes to the files in the
project. To construct the change payload one must reference the file tree of the
commit which houses the payload as well as the file tree of its parent commit;
specifically one must take the difference between them.
A change payload looks like this:
```yaml
type: change
fingerprint: std-base-64 string
credentials: [...]
description: |-
  The description will generally start with a single line, followed by a long-form body
  The description corresponds to the body of a commit message in a "normal"
  git repo. It gives a more-or-less long-form explanation of the changes being
  made to the project's files.
```
##### Change Payload Fingerprint {#change-payload-fingerprint}
The unencoded [fingerprint](#fingerprint) of a [change payload](#change-payload)
is calculated as follows:
* Concatenate the following:
    * A uvarint indicating the number of bytes in the description string.
    * The description string.
    * A uvarint indicating the number of files changed between this commit and
      its parent.
    * For each file changed, ordered lexicographically ascending based on its
      full relative path within the git repo:
        * A uvarint indicating the length of the full relative path of the file
          within the repo, as a string.
        * The full relative path of the file within the repo, as a string.
        * A little-endian uint32 representing the previous file mode of the
          file (or 0 if the file is not present in the parent commit's tree).
        * The 20-byte SHA1 hash of the contents of the previous version of the
          file (or 20 zero bytes if the file is not present in the parent
          commit's tree).
        * A little-endian uint32 representing the new file mode of the file (or
          0 if the file is not present in the current commit's tree).
        * The 20-byte SHA1 hash of the contents of the new version of the file
          (or 20 zero bytes if the file is not present in the current commit's
          tree).
* Calculate the SHA-256 hash of the concatenation result.
* Prepend a 0 byte to the result of the SHA-256 hash.
This unencoded fingerprint is then standard base-64 encoded, and that is used as
the value of the `fingerprint` field.
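The steps above can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not a reference implementation: "uvarint" is taken to mean the unsigned base-128 varint used by Go's `encoding/binary` package, strings are assumed to be UTF-8 with lengths counted in bytes, paths are sorted by their encoded bytes, and the per-file SHA1 hashes are passed in by the caller rather than computed here. The names `ChangedFile`, `uvarint`, and `change_fingerprint` are hypothetical.

```python
import base64
import hashlib
import struct
from typing import NamedTuple, Sequence

ZERO_HASH = bytes(20)  # stands in for a file absent from a tree


class ChangedFile(NamedTuple):
    path: str         # full path relative to the repo root
    prev_mode: int    # 0 if absent from the parent commit's tree
    prev_hash: bytes  # 20 bytes, ZERO_HASH if absent from the parent's tree
    new_mode: int     # 0 if absent from the current commit's tree
    new_hash: bytes   # 20 bytes, ZERO_HASH if absent from the current tree


def uvarint(n: int) -> bytes:
    """Encode n as an unsigned varint: 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("uvarint cannot encode a negative number")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # set the continuation bit
        n >>= 7
    out.append(n)
    return bytes(out)


def change_fingerprint(description: str, files: Sequence[ChangedFile]) -> bytes:
    """Return the unencoded (33-byte) fingerprint of a change payload."""
    desc = description.encode("utf-8")
    buf = bytearray(uvarint(len(desc)) + desc)
    buf += uvarint(len(files))
    for f in sorted(files, key=lambda f: f.path.encode("utf-8")):
        if len(f.prev_hash) != 20 or len(f.new_hash) != 20:
            raise ValueError("file hashes must be exactly 20 bytes")
        path = f.path.encode("utf-8")
        buf += uvarint(len(path)) + path
        buf += struct.pack("<I", f.prev_mode) + f.prev_hash
        buf += struct.pack("<I", f.new_mode) + f.new_hash
    return b"\x00" + hashlib.sha256(bytes(buf)).digest()


# Hypothetical change: a single new file added to the project.
files = [ChangedFile("README.md", 0, ZERO_HASH, 0o100644,
                     hashlib.sha1(b"hello\n").digest())]
fingerprint = change_fingerprint("Add README", files)
encoded = base64.b64encode(fingerprint).decode("ascii")
```

Because the files are sorted before hashing, the order in which the caller lists them does not affect the result.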
#### Comment Payload {#comment-payload}
A comment [payload](#payload) encompasses no file changes, and is used only to
contain a comment made by a single user.
A comment payload looks like this:
```yaml
type: comment
fingerprint: std-base-64 string
credentials: [...]
comment: |-
  Hey all, how's it going?
  Just wanted to pop by and say howdy.
```
The message head of a comment payload will generally be a truncated form of the
comment itself.
##### Comment Payload Fingerprint {#comment-payload-fingerprint}
The unencoded [fingerprint](#fingerprint) of a [comment
payload](#comment-payload) is calculated as follows:
* Concatenate the following:
    * A uvarint indicating the number of bytes in the comment string.
    * The comment string.
* Calculate the SHA-256 hash of the concatenation result.
* Prepend a 0 byte to the result of the SHA-256 hash.
This unencoded fingerprint is then standard base-64 encoded, and that is used as
the value of the `fingerprint` field.
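A minimal Python sketch of the comment fingerprint follows. It assumes that "uvarint" means the unsigned base-128 varint used by Go's `encoding/binary` package and that the comment is UTF-8 encoded; the function names are hypothetical.

```python
import base64
import hashlib


def uvarint(n: int) -> bytes:
    """Encode n as an unsigned varint: 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("uvarint cannot encode a negative number")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # set the continuation bit
        n >>= 7
    out.append(n)
    return bytes(out)


def comment_fingerprint(comment: str) -> bytes:
    """Return the unencoded (33-byte) fingerprint of a comment payload."""
    body = comment.encode("utf-8")
    digest = hashlib.sha256(uvarint(len(body)) + body).digest()
    return b"\x00" + digest


# The value that would be placed in the payload's `fingerprint` field.
encoded = base64.b64encode(comment_fingerprint("hi")).decode("ascii")
```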
#### Credential Payload {#credential-payload}
A credential [payload](#payload) contains only one or more credentials for an
arbitrary [fingerprint](#fingerprint). Credential payloads can be combined with
other payloads of the same fingerprint to create a new payload with many
credentials.
A credential payload looks like this:
```yaml
type: credential
fingerprint: std-base-64 string
credentials: [...]
# This field is not required, but can be helpful in situations where the
# fingerprint was generated based on multiple change payloads
commits:
- commit hash
- commit hash
- commit hash
# This field is not required, but can be helpful to clarify which description
# was used when generating a change fingerprint.
change_description: blah blah blah
```
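One way to picture combining a payload with credential payloads of the same fingerprint is sketched below. The spec does not define a merge procedure, so this is only an assumption about how it might be done: payloads are modelled as plain dictionaries, and `merge_credentials` is a hypothetical helper.

```python
from typing import Any, Dict, List

Payload = Dict[str, Any]


def merge_credentials(payload: Payload, credential_payloads: List[Payload]) -> Payload:
    """Return a copy of payload carrying the credentials of all given payloads."""
    credentials = list(payload.get("credentials", []))
    for cp in credential_payloads:
        if cp.get("type") != "credential":
            raise ValueError("expected a payload of type 'credential'")
        if cp.get("fingerprint") != payload.get("fingerprint"):
            raise ValueError("fingerprints differ, credentials do not apply")
        for cred in cp.get("credentials", []):
            if cred not in credentials:  # skip exact duplicates
                credentials.append(cred)
    merged = dict(payload)
    merged["credentials"] = credentials
    return merged


# Hypothetical payloads; the fingerprint and signature bodies are placeholders.
change = {"type": "change", "fingerprint": "AAAA", "description": "Fix bug",
          "credentials": [{"type": "pgp_signature", "body": "sig-1"}]}
extra = {"type": "credential", "fingerprint": "AAAA",
         "credentials": [{"type": "pgp_signature", "body": "sig-2"}]}
combined = merge_credentials(change, [extra])
```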
## Project Configuration {#project-configuration}
The `.dehub` directory contains all meta information related to the dehub
[project](#project). All files within `.dehub` are tracked by the git repo like
any other files in the project.
### config.yml {#config-yml}
The `.dehub/config.yml` file contains a YAML-encoded configuration object:
```yaml
accounts: [...]
access_controls: [...]
```
Both fields are described in their own sub-section below.
#### Account {#account}
An account defines a specific user of a [project](#project). Every account has
an ID; no two accounts within a project may share the same ID.
An account looks like this:
```yaml
id: some_string
signifiers: [...]
```
##### Signifier {#signifier}
A signifier is used to signify that an [account](#account) has taken some
action. The most common use-case is to prove that an account created a
particular [credential](#credential). An account may have more than one
signifier.
Currently there is only one signifier type, `pgp_public_key`:
```yaml
type: pgp_public_key
# Path to ASCII-armored pgp public key, relative to repo root.
path: .dehub/account.asc
```
or
```yaml
type: pgp_public_key
body: inlined ASCII-armored pgp public key
```
#### Access Control {#access-control}
An access control determines whether a particular commit is allowed to become
part of a [project](#project). Each access control has an action (allow or deny)
and a set of filters (filters are described in the next section):
```yaml
action: allow # or deny
filters: [...]
```
When verifying a commit against a project's access controls, each access
control's filters are applied to the commit, in the order the access controls
appear in the configuration. The action of the first access control for which
all filters match is taken.
An access control with no filters matches all commits.
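The first-match rule can be sketched as below. This is an illustration only: filters are modelled as plain functions from a commit to a boolean, commits as dictionaries, and the names `AccessControl` and `decide` are hypothetical.

```python
from typing import Any, Callable, Dict, NamedTuple, Optional, Sequence

Commit = Dict[str, Any]
Filter = Callable[[Commit], bool]


class AccessControl(NamedTuple):
    action: str                    # "allow" or "deny"
    filters: Sequence[Filter] = ()


def decide(commit: Commit, access_controls: Sequence[AccessControl]) -> Optional[str]:
    """Return the action of the first access control whose filters all match.

    all() over an empty sequence is True, so an access control with no
    filters matches every commit. None is returned if nothing matches.
    """
    for ac in access_controls:
        if all(f(commit) for f in ac.filters):
            return ac.action
    return None


# Hypothetical configuration: allow anything off main, deny everything else.
controls = [
    AccessControl("allow", [lambda c: c["branch"] != "main"]),
    AccessControl("deny"),
]
on_feature = decide({"branch": "amy/feature"}, controls)
on_main = decide({"branch": "main"}, controls)
```

Because evaluation stops at the first match, the order of access controls in the configuration is significant.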
##### Filter {#filter}
There are many kinds of [access control](#access-control) filters. Any filter
can be applied to a commit, with no other input, and produce a boolean value.
All filters have a `type` field which indicates their type.
###### Signature Filter {#signature-filter}
A [filter](#filter) of type `signature` asserts that a commit's
[payload](#payload) contains [signature credentials](#credential) with certain
properties. A signature filter must have one of these fields, which define the
set of users or [accounts](#account) whose signatures are applicable.
* `account_ids: [...]` - an array of account IDs, each having been defined in
the accounts section of the [configuration](#config-yml).
* `any_account: true` - matches any account defined in the accounts section of
the configuration.
* `any: true` - matches any signature, whether or not its signifier has been
defined in the configuration.
A `count` field may also be included. Its value may be an absolute number (e.g.
`5`) or it may be a string indicating a percent (e.g. `"50%"`). If not included
it will be assumed to be `1`.
The count indicates how many accounts from the specified set must have a
signature included. If a percent is given then that will be multiplied against
the size of the set (rounded up) to determine the necessary number.
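The arithmetic for `count` can be sketched as follows. The helper name `required_signatures` is hypothetical, and the sketch assumes percentages are whole numbers such as `"50%"`.

```python
from typing import Optional, Union


def required_signatures(count: Optional[Union[int, str]], set_size: int) -> int:
    """Return how many accounts from the set must have signed the commit."""
    if count is None:
        return 1  # the default when no count is given
    if isinstance(count, str):
        if not count.endswith("%"):
            raise ValueError("string counts must be percentages like '50%'")
        percent = int(count[:-1])
        # Integer ceiling of percent * set_size / 100, avoiding float rounding.
        return (percent * set_size + 99) // 100
    return count


half_of_three = required_signatures("50%", 3)   # 1.5 rounds up to 2
all_of_four = required_signatures("100%", 4)
absolute = required_signatures(5, 10)
default = required_signatures(None, 10)
```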
Here are some example signature filters, and explanations for each:
```yaml
# requires that 2 of the 3 specified accounts have a signature credential on
# the commit.
type: signature
account_ids:
- amy
- bill
- colleen
count: 2
```
```yaml
# requires that every account defined in the configuration has a signature
# credential on the commit.
type: signature
any_account: true
count: 100%
```
```yaml
# requires at least one signature credential, not necessarily from an account.
type: signature
any: true
```
###### Branch Filter {#branch-filter}
A [filter](#filter) of type `branch` matches the commit based on which branch in
the repo it is being or has been committed to. Matching is performed on the
short name of the branch, using globstar pattern matching.
A branch filter can have one or multiple patterns defined. The filter will match
if at least one defined pattern matches the short form of the branch name.
A branch filter with only one pattern can be defined like this:
```yaml
type: branch
pattern: some_branch
```
A branch filter with multiple patterns can be defined like this:
```yaml
type: branch
patterns:
- some_branch
- branch*glob
- amy/**
```
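The spec does not spell out the globstar rules, so the following sketch rests on a common convention, stated here as an assumption: `*` matches any run of characters except `/`, and `**` matches any run of characters including `/`. Only those two wildcards are handled, and the function names are hypothetical.

```python
import re
from typing import Iterable


def globstar_to_regex(pattern: str) -> "re.Pattern":
    """Translate a pattern containing * and ** into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # ** may cross '/' boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # * stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def branch_filter_matches(branch: str, patterns: Iterable[str]) -> bool:
    """Return True if at least one pattern matches the whole short branch name."""
    return any(globstar_to_regex(p).fullmatch(branch) for p in patterns)


patterns = ["some_branch", "branch*glob", "amy/**"]
nested = branch_filter_matches("amy/feature/login", patterns)
main = branch_filter_matches("main", patterns)
```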
###### Files Changed Filter {#files-changed-filter}
A [filter](#filter) of type `files_changed` matches the commit based on which
files were changed between the tree of the commit's parent and the commit's
tree. Matching is performed on the paths of the changed files, relative to the
repo root.
A files changed filter can have one or multiple patterns defined. The filter
will match if any of the changed files matches at least one defined pattern.
A files changed filter with only one pattern can be defined like this:
```yaml
type: files_changed
pattern: .dehub/*
```
A files changed filter with multiple patterns can be defined like this:
```yaml
type: files_changed
patterns:
- some/dir/*
- foo_files_*
- "**.jpg"
```
###### Payload Type Filter {#payload-type-filter}
A [filter](#filter) of type `payload_type` matches a commit based on the type of
its [payload](#payload). A payload type filter can have one or more types
defined. The filter will match if the commit's payload type matches at least one
of the defined types.
A payload type filter with only one matching type can be defined like this:
```yaml
type: payload_type
payload_type: comment
```
A payload type filter with multiple matching types can be defined like this:
```yaml
type: payload_type
payload_types:
- comment
- change
```
###### Commit Attributes Filter {#commit-attributes-filter}
A [filter](#filter) of type `commit_attributes` matches a commit based on
certain attributes it has. A commit attributes filter may have one or more
fields defined, each corresponding to a different attribute the commit may have.
If more than one field is defined then all corresponding attributes on the
commit must match for the filter to match.
Currently the only possible attribute is `non_fast_forward: true`, which matches
a commit which is not a descendant of the HEAD of the branch it's being pushed
onto. This attribute only makes sense in the context of a pre-receive git hook.
A commit attributes filter looks like this:
```yaml
type: commit_attributes
non_fast_forward: true
```
###### Not Filter {#not-filter}
A [filter](#filter) of type `not` matches a commit using the negation of a
sub-filter, defined within the not filter. If the sub-filter returns true for
the commit, then the not filter returns false, and vice-versa.
A not filter looks like this:
```yaml
type: not
filter:
  # a branch filter is used as the sub-filter in this example
  type: branch
  pattern: main
```
##### Default Access Controls {#default-access-controls}
These [access controls](#access-control) will be implicitly appended to the list
defined in the [configuration](#config-yml):
```yaml
# Any account may add any commit to any non-main branch, provided there is at
# least one signature credential. This includes non-fast-forwards.
- action: allow
  filters:
  - type: not
    filter:
      type: branch
      pattern: main
  - type: signature
    any_account: true
    count: 1
# Non-fast-forwards are denied in all other cases. In effect, one cannot
# force-push onto the main branch.
- action: deny
  filters:
  - type: commit_attributes
    non_fast_forward: true
# Any account may add any change commit to the main branch, provided there is
# at least one signature credential.
- action: allow
  filters:
  - type: branch
    pattern: main
  - type: payload_type
    payload_type: change
  - type: signature
    any_account: true
    count: 1
# All other actions are denied.
- action: deny
```
These default access controls provide a baseline of requirements that all
[projects](#project) will (hopefully) find useful in their infancy.
## Commit Verification {#commit-verification}
The dehub protocol is designed such that every commit is "verifiable". A
verifiable commit has the following properties:
* Its [fingerprint](#fingerprint) is correctly formed.
* All of its [credentials](#credential) are correctly formed.
    * If they are signatures, they are valid signatures of the commit's
      unencoded fingerprint.
* The project's [access controls](#access-control) allow the commit.
The [project's configuration](#config-yml) is referenced frequently when
verifying a commit, such as when determining which access controls to apply and
discovering [signifiers](#signifier) of [accounts](#account). In all cases the
configuration as defined in the commit's _parent_ is used when verifying that
commit. The exception is the [prime commit](#prime-commit), which uses its own
configuration.
### Prime Commit {#prime-commit}
The prime commit is the trusted seed of the [project](#project). When a user
clones and verifies a dehub project they must, implicitly or explicitly, trust
the contents of the prime commit. All other commits must be descendants of the
prime commit.
Manually specifying a prime commit is not yet covered by this spec, but it will
be.
By default the prime commit is the root commit of the `main` branch.