dehub/SPEC.md

272 lines
8.6 KiB
Markdown
Raw Normal View History

# .dehub
The `.dehub` directory contains all meta information related to
decentralized repository management and access control.
## config.yml
The `.dehub/config.yml` file takes the following structure:
```yaml
# accounts defines all accounts which are known to the repo.
accounts:
# Each account is an object with an id and at least one identifier. The id
# must be unique for each account.
- id: some_user_id:
# signifiers describes different methods the account might use to
# identify itself. Generally, these will be different public keys which
# commits will be signed with. At least one is required.
signifiers:
- type: "pgp_public_key"
body: "FULL PGP PUBLIC KEY STRING"
- type: "pgp_public_key_file"
path: ".dehub/some_user_id.asc"
- type: "keybase"
user: "some_keybase_user_id"
Refactor access controls to support multiple branches message: |- Refactor access controls to support multiple branches This was a big lift. It implements a backwards incompatible change to overhaul access control patterns to also encompass which branch is being interacted with, not only which files. The `accessctl` package was significantly rewritten to support this, as well as some of the code modifying it. The INTRODUCTION and SPEC were also appropriately updated. The change to the SPEC is _technically_ backwards incompatible, but it won't effect anything. The `access_control` previously being used will just be ignored, and the changes to `accessctl` include the definition of fallback access controls which will automatically be applied if nothing else matches, so when verifying the older history of this repo those will be used. change_hash: AIfNYLmOLGpuyTiVodT3hDe9lF4E+5DbOTgSdkbjJONb credentials: - type: pgp_signature pub_key_id: 95C46FA6A41148AC body: iQIzBAABAgAdFiEEJ6tQKp6olvZKJ0lwlcRvpqQRSKwFAl5aw0sACgkQlcRvpqQRSKy7kw//UMyS/waV/tE1vntZrMbmEtFmiXPcMVNal76cjhdiF3He50qXoWG6m0qWz+arD1tbjoZml6pvU+Xt45y/Uc54DZizzz0E9azoFW0/uvZiLApftFRArZbT9GhbDs2afalyoTJx/xvQu+a2FD/zfljEWE8Zix+bwHCLojiYHHVA65HFLEt8RsH33jFyzWvS9a2yYqZXL0qrU9tdV68hazdIm1LCp+lyVV74TjwxPAZDOmNAE9l4EjIk1pgr2Qo4u2KwJqCGdVCvka8TiFFYiP7C6319ZhDMyj4m9yZsd1xGtBd9zABVBDgmzCEjt0LI3Tv35lPd2tpFBkjQy0WGcMAhwSHWSP7lxukQMCEB7og/SwtKaExiBJhf1HRO6H9MlhNSW4X1xwUgP+739ixKKUY/RcyXgZ4pkzt6sewAMVbUOcmzXdUvuyDJQ0nhDFowgicpSh1m8tTkN1aLUx18NfnGZRgkgBeE6EpT5/+NBfFwvpiQkXZ3bcMiNhNTU/UnWMyqjKlog+8Ca/7CqgswYppMaw4iPaC54H8P6JTH+XnqDlLKSkvh7PiJJa5nFDG07fqc8lYKm1KGv6virAhEsz/AYKLoNGIsqXt+mYUDLvQpjlRsiN52euxyn5e41LcrH0RidIGMVeaS+7re1pWbkCkMMMtYlnCbC5L6mfrBu6doN8o= account: mediocregopher
2020-02-29 20:02:25 +00:00
# access_controls define who may do what in the repo. The value is a list of
# access control objects, each containing an action (allow or deny) and a set of
# filters. If a commit matches all filters (or if there are no filters) then the
# action is taken. If not, then the next access control is attempted.
#
# If no access controls match a commit, then the default list is used, which
# will definitely match. The following is the default set, which is enumerated
# here for informational purposes only; it does not normally need to be defined.
access_controls:
- action: allow
filters:
- type: not
filter:
type: branch
pattern: main
- type: signature
any_account: true
count: 1
- action: allow
filters:
- type: branch
pattern: main
- type: commit_type
commit_type: change
- type: signature
any_account: true
count: 1
- action: deny
```
# Change Hash
When a change commit (see Commits section) is being signed by a signifier there
is an expected data format for the data to be signed. The format is a SHA-256
hash of the following pieces of data concatenated together:
* A uvarint indicating the number of bytes in the commit message.
* The message.
* A uvarint indicating the number of files changed.
* For each file changed in the commit, ordered lexographically-ascending based
on its full relative path within the repo, the following is then written:
* A uvarint indicating the length of the full relative path of the file
within the repo.
* The full relative path of the file within the repo.
* A little-endian uint32 representing the previous file mode of the file (or 0
if the file is being inserted).
* The 20-byte SHA1 hash of the previous version of the file's contents (or 20
0 bytes if the file is being inserted).
* A little-endian uint32 representing the new file mode of the file (or 0
if the file is being deleted).
* The 20-byte SHA1 hash of the new version of the file's contents (or 20
0 bytes if the file is being deleted).
The raw output from the SHA-256 is then prepended with a `0` byte (for forward
compatibility). The result is the raw change hash.
# Comment Message Hash
When a comment commit (see Commits section) is being signed by the signifier of
the author there is an expected data format for the data to be signed, very
similar to how change hashes are signed. The format is a SHA-256 hash of the
following pieces of data communicated together:
* A uvarint indicating the number of bytes in the comment message.
* The message.
The raw output from the SHA-256 is then prepended with a `0` byte (for forward
compatibility). The result is the raw comment hash.
# Credentials
All file changes need to have some kind of credential to be accepted into the
`main` branch (see Main Branch section). Each credential is encoded as a yaml
object with a `type` field.
All credentials contain enough information to correspond them to a specific
signifier in the `config.yml`, so as to be able to verify them.
## PGP Signature Credential
Currently there is only a single credential type, the `pgp_signature`, which
signs a raw change hash (which is communicated out-of-band of the object):
```
type: pgp_signature
account_id: some_user_id
pub_key_id: XXX
body: "base-64 signature body"
```
# Commits
All commit messages in dehub repositories are expected to follow the following
template (newlines included, yaml comments start with `#` and are only for
informational purposes):
```
Human readable message head
---
# Three dashes indicate the start of the yaml body. Everything after must be
# valid yaml.
type: type of the commit # Always required
fieldA: valueA
fieldB: valueB
```
## Change Commits
Commits of type `change` correspond to the standard git commit; they encompass a
set of file changes as well as a message describing the changes which occurred.
They extend the standard git commit with a few dehub specific features, such as
the change hash and credentials.
`change` commits are, currently, the _only_ commit type which are allowed to
have file changes.
Example change commit message:
```
This is the message head. It will be re-iterated within the message field
---
type: change
message: >
This is the message head. It will be re-iterated within the message field
The rest of this field is for the message body, which corresponds to the
body of a normal commit message which might give a more long-form
explanation of the commit's changes.
Since the message is used in generating the signature it's necessary for it
to be encoded here fully formed, even though the message head is then
duplicated. Otherwise the exact bytes of the message would be ambiguous.
This situation is ugly, but not unbearable.
# The change_hash is able to be computed from the commit's message and changed
# files, but is reproduced in the commit message for forward compatibility, e.g.
# if the algorithm to compute the hash changes.
change_hash: XXX
# Credentials are the set of credentials which indicate approval of the change
credentials:
- type: pgp_signature
account_id: some_user_id
pub_key_id: XXX
body: "base-64 signature body"
```
## Credential Commits
Commits of type `credential` contain one or more credentials for some hash
(presumably a change hash, but in the future there may be other types). The
commit message head is not spec'd, but should be a human-readable description of
"who is crediting what, and how".
Example credential commit message:
```
some_user_id pgp sig of commits AAA..BBB with key CCC
---
type: credential
credentialed_hash: XXX
credentials:
- type: pgp_signature
account_id: some_user_id
pub_key_id: CCC
body: "base-64 signature body"
```
## Comment Commits
Commits of type `comment` contain a message for others to read. The commit
message head is not spec'd, but should be a human-readable description of "who
is commenting what".
Example credential commit message:
```
some_user_id has commented: Hey all, how's it going?
---
type: comment
# The message_hash is computed from the message, and reproduced here for
# forwards compatibility. See the Comment Message Hash section.
message_hash: XXX
message: >
Heay all, how's it going?
Just wanted to pop by and say howdy.
# credentials can contain a signature from the author of this comment's
# message_hash.
credentials:
- type: pgp_signature
account_id: some_user_id
pub_key_id: CCC
body: "base-64 signature body"
```
# Branches
dehub branches correspond 1-to-1 with branches in the underlying git repo. All
commits in a dehub branch should contain an encoded message as specified in the
Commits section of this document, and possibly file changes as appropriate.
## Main Branch
The "primary" branch of a dehub repo is the `main` branch. All new commits
being appended to the HEAD of the `main` branch are subject to the following
requirements:
* Must be `change` commits.
* Must conform to all requirements defined by the `access_controls` section of
the `config.yml`, as found in the current HEAD. If the commit is the initial
commit of the branch then it instead uses the `config.yml` found in itself.
* Must be a "fast-forward" commit (this may be amended later, but at present it
simplifies implementation).
## Thread Branches
Branches which are not the `main` branch are referred to as "threads", and have
much less stringent requirements than the `main` branch:
* They can contain commits of any type, as long as the commits come from those
with an account defined in the `config.yml`.
* `change` commits are not subject `access_controls` requirements.
# TODO
* access control patterns related to who may push to MR branches, and what types
of commits they can push.