dehub/SPEC.md
mediocregopher c87baa5192 Implement credential commit object, but don't use it anywhere yet
2020-03-05 16:43:40 -07:00

.dehub

The .dehub directory contains all meta information related to decentralized repository management and access control.

config.yml

The .dehub/config.yml file takes the following structure:

# accounts defines all accounts which are known to the repo.
accounts:

    # Each account is an object with an id and at least one signifier. The id
    # must be unique for each account.
    - id: some_user_id

      # signifiers describes different methods the account might use to
      # identify itself. Generally, these will be different public keys which
      # commits will be signed with. At least one is required.
      signifiers:
          - type: "pgp_public_key"
            body: "FULL PGP PUBLIC KEY STRING"

          - type: "pgp_public_key_file"
            path: ".dehub/some_user_id.asc"

          - type: "keybase"
            user: "some_keybase_user_id"

# access_controls define who may do what in the repo. The value is a list of
# access control objects, each applying to one or more potential branch names.
access_controls:

  # branch_pattern is a glob pattern describing what branch names this access
  # control applies to. The first matching branch_pattern for a branch name
  # defines which access controls are applied.
  - branch_pattern: main

    # change_access_controls is an array of possible access controls applied for
    # files being changed in the branch
    change_access_controls:

      # file_path_pattern is a glob pattern describing what files this access control
      # applies to. Single star matches all characters except path separators,
      # double star matches everything. The first matching file_path_pattern for a
      # file path (relative to the repo root) defines which access controls are
      # applied.
      - file_path_pattern: ".dehub/**"

        # signature conditions indicate that a commit must be signed by one or
        # more accounts in order to be allowed.
        condition:
          type: signature

          # account_ids lists all accounts whose signature will count towards
          # meeting the condition
          account_ids:
              - some_user_id

          # count describes how many signatures are required. It can be either a
          # concrete integer (e.g. 2, meaning any 2 accounts listed by
          # account_ids) or a percent.
          count: 100% # all accounts in account_ids must sign

      # This catch-all pattern for the rest of the repo requires that changes to
      # any files not under `.dehub/` are signed by at least one of the
      # defined accounts.
      - file_path_pattern: "**"
        condition:
          type: signature
          any_account: true # indicates any account defined in accounts is valid
          count: 1

  # If a branch is not matched by any access control object then the following
  # default object is implied:
  #
  # branch_pattern: "**"
  # change_access_controls:
  #   - file_path_pattern: "**"
  #     condition:
  #       type: signature
  #       any_account: true
  #       count: 1
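
The "first matching pattern wins" rule and the star/double-star semantics described in the comments above can be sketched as follows. This is only an illustration, not dehub's implementation: the function names `glob_to_regex` and `first_match` are hypothetical, and glob features other than `*` and `**` (such as `?` or character classes) are deliberately left out.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob where '*' stops at '/' and '**' matches anything."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # double star: any characters, including '/'
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # single star: anything except a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def first_match(patterns, name):
    """Return the index of the first pattern that matches name, or None."""
    for index, pattern in enumerate(patterns):
        if glob_to_regex(pattern).fullmatch(name):
            return index
    return None


# File patterns from the example config, in the order they are listed.
file_patterns = [".dehub/**", "**"]
print(first_match(file_patterns, ".dehub/config.yml"))
print(first_match(file_patterns, "src/main.go"))
```

The same helper would apply to `branch_pattern`, since both are described as glob patterns where the first match decides.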

Change Hash

When a signifier signs a change commit (see the Commits section), the data it signs has a fixed format: the change hash. The change hash is a SHA-256 hash of the following pieces of data, concatenated in order:

  • A uvarint indicating the number of bytes in the commit message.
  • The message.
  • A uvarint indicating the number of files changed.
  • For each file changed in the commit, ordered lexicographically ascending by its full relative path within the repo, the following is then written:
    • A uvarint indicating the length of the full relative path of the file within the repo.
    • The full relative path of the file within the repo.
    • A little-endian uint32 representing the previous file mode of the file (or 0 if the file is being inserted).
    • The 20-byte SHA1 hash of the previous version of the file's contents (or 20 zero bytes if the file is being inserted).
    • A little-endian uint32 representing the new file mode of the file (or 0 if the file is being deleted).
    • The 20-byte SHA1 hash of the new version of the file's contents (or 20 zero bytes if the file is being deleted).

The raw output from the SHA-256 is then prepended with a 0 byte (for forward compatibility). The result is the raw change hash.
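
The steps above can be sketched in Python as follows. Several details are assumptions rather than statements of the spec: "uvarint" is taken to mean the unsigned varint encoding used by Go's `encoding/binary` package (7 bits per byte, least-significant group first), the message and paths are encoded as UTF-8, path lengths are counted in bytes, and paths are sorted bytewise. The names `uvarint`, `FileChange` and `change_hash` are hypothetical.

```python
import hashlib
import struct
from typing import NamedTuple


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (assumed format)."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        n >>= 7
    out.append(n)
    return bytes(out)


class FileChange(NamedTuple):
    path: str        # full path relative to the repo root
    old_mode: int    # 0 if the file is being inserted
    old_sha1: bytes  # 20 zero bytes if the file is being inserted
    new_mode: int    # 0 if the file is being deleted
    new_sha1: bytes  # 20 zero bytes if the file is being deleted


def change_hash(message: str, changes) -> bytes:
    """Return the raw change hash: a 0 byte followed by the SHA-256 digest."""
    h = hashlib.sha256()
    msg = message.encode("utf-8")
    h.update(uvarint(len(msg)))
    h.update(msg)
    h.update(uvarint(len(changes)))
    for change in sorted(changes, key=lambda c: c.path.encode("utf-8")):
        path = change.path.encode("utf-8")
        h.update(uvarint(len(path)))
        h.update(path)
        h.update(struct.pack("<I", change.old_mode))  # little-endian uint32
        h.update(change.old_sha1)
        h.update(struct.pack("<I", change.new_mode))
        h.update(change.new_sha1)
    return b"\x00" + h.digest()
```

Because the file entries are sorted before hashing, the result does not depend on the order in which the changes are passed in.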

Credentials

All file changes need to have some kind of credential to be accepted into the main branch (see Main Branch section). Each credential is encoded as a yaml object with a type field.

Every credential contains enough information to match it to a specific signifier in the config.yml, so that it can be verified.

PGP Signature Credential

Currently there is only one credential type, pgp_signature, which is a PGP signature of a raw change hash. The hash itself is not part of the credential object; it is communicated out-of-band:

type: pgp_signature
account_id: some_user_id
pub_key_id: XXX
body: "base-64 signature body"

Commits

All commit messages in dehub repositories are expected to follow this template (newlines included; yaml comments start with # and are only for informational purposes):

Human readable message head

---
# Three dashes indicate the start of the yaml body. Everything after must be
# valid yaml.

type: type of the commit # Always required
fieldA: valueA
fieldB: valueB
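
A reader of such messages first has to separate the head from the yaml body. A minimal sketch of that step, assuming the separator is a line containing exactly three dashes and that the first such line is the one that counts (the function name `split_commit_message` is hypothetical):

```python
def split_commit_message(raw: str):
    """Split a commit message into (head, yaml_body) at the first '---' line."""
    head, separator, body = raw.partition("\n---\n")
    if not separator:
        raise ValueError("commit message has no '---' separator line")
    return head.strip(), body


raw = "Human readable message head\n\n---\ntype: change\nfieldA: valueA\n"
head, body = split_commit_message(raw)
print(head)
print(body)
```

The returned body would then be handed to a YAML parser of choice; that step is omitted here to keep the sketch free of third-party dependencies.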

Change Commits

Commits of type change correspond to the standard git commit; they encompass a set of file changes as well as a message describing the changes which occurred. They extend the standard git commit with a few dehub-specific features, such as the change hash and credentials.

change commits are currently the only commit type which is allowed to have file changes.

Example change commit message:

This is the message head. It will be re-iterated within the message field

---
type: change
message: >
    This is the message head. It will be re-iterated within the message field

    The rest of this field is for the message body, which corresponds to the
    body of a normal commit message which might give a more long-form
    explanation of the commit's changes.

    Since the message is used in generating the signature it's necessary for it
    to be encoded here fully formed, even though the message head is then
    duplicated. Otherwise the exact bytes of the message would be ambiguous.
    This situation is ugly, but not unbearable.

# The change_hash is able to be computed from the commit's message and changed
# files, but is reproduced in the commit message for forward compatibility, e.g.
# if the algorithm to compute the hash changes.
change_hash: XXX

# Credentials are the set of credentials which indicate approval of the change
credentials:
    - type: pgp_signature
      account_id: some_user_id
      pub_key_id: XXX
      body: "base-64 signature body"

Credential Commits

Commits of type credential contain one or more credentials for some hash (presumably a change hash, but in the future there may be other types). The commit message head is not spec'd, but should be a human-readable description of "who is crediting what, and how".

Example credential commit message:

some_user_id pgp sig of commits AAA..BBB with key CCC

---
type: credential
credentialed_hash: XXX
credentials:
    - type: pgp_signature
      account_id: some_user_id
      pub_key_id: CCC
      body: "base-64 signature body"

Branches

dehub branches correspond 1-to-1 with branches in the underlying git repo. All commits in a dehub branch should contain an encoded message as specified in the Commits section of this document, and possibly file changes as appropriate.

Main Branch

The "primary" branch of a dehub repo is the main branch. All new commits being appended to the HEAD of the main branch are subject to the following requirements:

  • Must be change commits.

  • Must conform to all requirements defined by the access_controls section of the config.yml, as found in the current HEAD. If the commit is the initial commit of the branch then it instead uses the config.yml found in itself.

  • Must be a "fast-forward" commit (this may be amended later, but at present it simplifies implementation).
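
The access-control requirement ultimately comes down to evaluating a signature condition against the accounts whose credentials on the commit were verified. The sketch below shows one way that check might look. It assumes the condition has already been parsed into a dict shaped like the config.yml example, that credential verification has happened elsewhere, and that a percent count is rounded up; the spec does not state how percents are rounded. The function name `condition_met` is hypothetical.

```python
import math


def condition_met(condition: dict, signer_ids: set, all_account_ids: set) -> bool:
    """Check a 'signature' condition against the set of verified signer ids."""
    if condition.get("type") != "signature":
        raise ValueError("only signature conditions are handled in this sketch")

    # any_account means every account defined in the config is eligible.
    if condition.get("any_account"):
        eligible = set(all_account_ids)
    else:
        eligible = set(condition.get("account_ids", []))

    count = condition["count"]
    if isinstance(count, str) and count.endswith("%"):
        # Assumption: a percent is taken of the eligible accounts, rounded up.
        needed = math.ceil(len(eligible) * float(count[:-1]) / 100)
    else:
        needed = int(count)

    return len(set(signer_ids) & eligible) >= needed


accounts = {"alice", "bob", "carol"}
strict = {"type": "signature", "account_ids": ["alice", "bob"], "count": "100%"}
print(condition_met(strict, {"alice", "bob"}, accounts))
print(condition_met(strict, {"alice", "carol"}, accounts))
```

Signatures from accounts outside the eligible set are ignored instead of being treated as errors, which is a design choice of this sketch and not something the spec prescribes.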

Thread Branches

Branches which are not the main branch are referred to as "threads", and have much less stringent requirements than the main branch:

  • They can contain commits of any type, as long as the commits come from those with an account defined in the config.yml.

  • change commits are not subject to access_controls requirements.

TODO

  • Access control patterns related to who may push to MR branches, and what types of commits they can push.