nix process composition

2021-04-22 11:07:31 -06:00 · 2021-04-22 11:07:31 -06:00 · 9ef363410f
commit 9ef363410f
parent 6ecd78dc62
5 changed files with 326 additions and 0 deletions
--- a/src/_posts/2021-04-22-composing-processes-into-a-static-binary-with-nix.md
+++ b/src/_posts/2021-04-22-composing-processes-into-a-static-binary-with-nix.md
@ -0,0 +1,248 @@
+---
+title: >-
+    Composing Processes Into a Static Binary With Nix
+description: >-
+    Goodbye, docker-compose!
+---
+
+One frequently wants to use a project that requires multiple processes to be
+running. For example, a small web api which uses some database to store data
+in, or a networking utility which has some monitoring process which can be run
+alongside it.
+
+In these cases it's extremely helpful to be able to compose these disparate
+processes together into a single process. From the user's perspective it's much
+nicer to only have to manage one process (even if it has hidden child
+processes). From a dev's perspective the alternatives are: finding libraries in
+the same language which do the disparate tasks and composing them into the same
+process via import, or (if such libraries don't exist, which is likely)
+rewriting the functionality of all processes into a new, monolithic project
+which does everything; a huge waste of effort!
+
+## docker-compose
+
+A tool I've used before for process composition is
+[docker-compose][docker-compose]. While it works well for composition, it
+suffers from the same issues docker in general suffers from: annoying networking
+quirks, a questionable security model, and the need to run the docker daemon.
+While these issues are generally surmountable for a developer or sysadmin, they
+are not suitable for a general-purpose project which will be shipped to average
+users.
+
+## nix-bundle
+
+Enter [nix-bundle][nix-bundle]. This tool will take any [nix][nix] derivation
+and construct a single static binary out of it, a la [AppImage][appimage].
+Combined with a process management tool like [circus][circus], nix-bundle
+becomes a very useful tool for composing processes together!
+
+To demonstrate this, we'll be looking at putting together a project I wrote
+called [markov][markov], a simple REST API for building [markov
+chains][markov-chain] which is written in [go][golang] and backed by
+[redis][redis].
+
+## Step 1: Building Individual Components
+
+Step one is to get [markov][markov] and its dependencies into a state where it
+can be run with [nix][nix]. Doing this is fairly simple: we merely use the
+`buildGoModule` function:
+
+```
+pkgs.buildGoModule {
+    pname = "markov";
+    version = "618b666484566de71f2d59114d011ff4621cf375";
+    src = pkgs.fetchFromGitHub {
+        owner = "mediocregopher";
+        repo = "markov";
+        rev = "618b666484566de71f2d59114d011ff4621cf375";
+        sha256 = "1sx9dr1q3vr3q8nyx3965x6259iyl85591vx815g1xacygv4i4fg";
+    };
+    vendorSha256 = "048wygrmv26fsnypsp6vxf89z3j0gs9f1w4i63khx7h134yxhbc6";
+}
+```
+
+This expression results in a derivation which places the markov binary at
+`bin/markov`.
+
+The other component we need to run markov is [redis][redis], which conveniently
+is already packaged in nixpkgs as `pkgs.redis`.
+
+## Step 2: Composing Using Circus
+
+[Circus][circus] can be configured to run multiple processes at the same time.
+It will collect the stdout/stderr logs of these processes and combine them into
+a single stream, or write them to log files. If any processes fail circus will
+automatically restart them. It has a simple configuration and is, overall, a
+great tool for a simple project like this.
+
+Circus also comes pre-packaged in nixpkgs, so we don't need to do anything to
+actually build it. We only need to configure it. To do this we'll write a bash
+script which generates the configuration on-the-fly, and then runs the process
+with that configuration.
+
+This script is going to act as the "frontend" for our eventual static binary;
+the user will pass in configuration parameters to this script, and this script
+will translate those into the appropriate configuration for all sub-processes
+(markov, redis, circus). For this demo we won't go nuts with the configuration;
+we'll just expose the following:
+
+* `MARKOV_LISTEN_ADDR`: Address REST API will listen on (defaults to
+  `localhost:8000`).
+
+* `MARKOV_TIMEOUT`: Expiration time of each link of the chain (defaults to 720
+  hours).
+
+* `MARKOV_DATA_DIR`: Directory where data will be stored (defaults to current
+  working directory).
+
+The bash script will take these params in as environment variables. The nix
+expression to generate the bash script, which we'll call our entrypoint script,
+will look like this (assumes that the expression to generate `bin/markov`,
+defined above, is set to the `markov` variable):
+
+```
+pkgs.writeScriptBin "markov" ''
+    #!${pkgs.stdenv.shell}
+
+    # On every run we create new, temporary, configuration files for redis and
+    # circus. To do this we create a new config directory.
+    markovCfgDir=$(${pkgs.coreutils}/bin/mktemp -d)
+    echo "generating configuration to $markovCfgDir"
+
+    cat >$markovCfgDir/redis.conf <<EOF
+    save ""
+    dir "''${MARKOV_DATA_DIR:-$(pwd)}"
+    appendonly yes
+    appendfilename "markov.data"
+    EOF
+
+    cat >$markovCfgDir/circus.ini <<EOF
+
+    [circus]
+
+    [watcher:markov]
+    cmd = ${markov}/bin/markov \
+        -listenAddr ''${MARKOV_LISTEN_ADDR:-localhost:8000} \
+        -timeout ''${MARKOV_TIMEOUT:-720}
+    numprocesses = 1
+
+    [watcher:redis]
+    cmd = ${pkgs.redis}/bin/redis-server $markovCfgDir/redis.conf
+    numprocesses = 1
+    EOF
+
+    exec ${pkgs.circus}/bin/circusd $markovCfgDir/circus.ini
+''
+```
+
+By `nix-build`ing this expression we end up with a derivation with
+`bin/markov`, and running that should result in the following output:
+
+```
+generating configuration to markov.VLMPwqY
+2021-04-22 09:27:56 circus[181906] [INFO] Starting master on pid 181906
+2021-04-22 09:27:56 circus[181906] [INFO] Arbiter now waiting for commands
+2021-04-22 09:27:56 circus[181906] [INFO] markov started
+2021-04-22 09:27:56 circus[181906] [INFO] redis started
+181923:C 22 Apr 2021 09:27:56.063 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
+181923:C 22 Apr 2021 09:27:56.063 # Redis version=6.0.6, bits=64, commit=00000000, modified=0, pid=181923, just started
+181923:C 22 Apr 2021 09:27:56.063 # Configuration loaded
+...
+```
+
+The `markov` server process doesn't have many logs, unfortunately, but redis'
+logs at least work well, and doing a `curl localhost:8000` results in the
+response from the `markov` server.
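+
+One detail of the entrypoint script worth calling out is the `''${...}`
+sequences. Inside a nix indented string `${...}` is interpolated by nix when the
+expression is evaluated, while `''${...}` is an escape which leaves a literal
+`${...}` in the output for bash to expand when the script runs. A minimal sketch
+of the difference (the `greeting` name is only for illustration and isn't part
+of the markov project):
+
+```nix
+let
+    # Hypothetical value, only here to show build-time interpolation.
+    greeting = "hello from nix";
+in ''
+    # Substituted by nix when the expression is evaluated.
+    echo "${greeting}"
+
+    # Left alone by nix; bash applies the default when the script runs.
+    echo "''${MARKOV_TIMEOUT:-720}"
+''
+```
+
+Evaluating this produces a string whose first `echo` already contains
+`hello from nix`, while the second still reads `${MARKOV_TIMEOUT:-720}`. This is
+why the store paths for markov and redis are fixed at build time, but the
+`MARKOV_*` parameters can still be changed on every run.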
+
+At this point our processes are composed using circus. Let's now bundle it all
+into a single static binary!
+
+## Step 3: nix-bundle
+
+The next step is to run [nix-bundle][nix-bundle] on the entrypoint expression.
+nix-bundle will pack all dependencies (including markov, redis, and circus)
+into a single archive file, and make that file executable. When the archive is
+executed it will run our entrypoint script directly.
+
+Getting nix-bundle is very easy: just use nix-shell!
+
+```
+nix-shell -p nix-bundle
+```
+
+This will open a shell where the `nix-bundle` binary is available on your path.
+From there just run the following to construct the binary (this assumes that the
+nix code described so far is stored in `markov.nix`, the full source of which
+will be linked to at the end of this post):
+
+```
+nix-bundle '((import ./markov.nix) {}).entrypoint' '/bin/markov'
+```
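+
+For that command to work, `markov.nix` has to evaluate to a function which
+returns an attribute set containing an `entrypoint` attribute. A rough sketch of
+that shape is below. Note the assumptions: `pkgs.hello` is only a stand-in for
+the `buildGoModule` expression from Step 1, the script body is a stand-in for
+the real entrypoint from Step 2, and `<nixpkgs>` is used for brevity where the
+real file pins a specific nixpkgs revision.
+
+```nix
+# Shape of markov.nix only; the stand-ins are not the real project code.
+{ pkgs ? import <nixpkgs> {} }:
+
+rec {
+    # Stand-in for the buildGoModule expression from Step 1.
+    markov = pkgs.hello;
+
+    # Stand-in for the writeScriptBin expression from Step 2. The first
+    # argument, "markov", is what places the script at bin/markov.
+    entrypoint = pkgs.writeScriptBin "markov" ''
+        #!${pkgs.stdenv.shell}
+        exec ${markov}/bin/hello
+    '';
+}
+```
+
+Using `rec` is what allows `entrypoint` to refer to `markov` from within the
+same attribute set. The second argument given to `nix-bundle`, `/bin/markov`,
+is the path within the `entrypoint` derivation which the bundle will execute.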
+
+The resulting binary is called `markov`, and is 89MB. The size is a bit jarring,
+considering the simplicity of the functionality, but it could probably be
+trimmed by using a different process manager than circus (which requires
+bundling an entire python runtime into the binary).
+
+Running the binary directly as `./markov` produces the same result as when we
+ran the entrypoint script earlier. Success! We have bundled multiple existing
+processes into a single, opaque, static binary. Installation of this binary is
+now as easy as copying it to any linux machine and running it.
+
+## Bonus Step: nix'ing nix-bundle
+
+Installing and running [nix-bundle][nix-bundle] manually is _fine_, but it'd be
+even better if that were defined as part of our nix setup as well. That way any
+new person wouldn't have to worry about that step, and would still get the same
+deterministic output from the build.
+
+Unfortunately, we can't actually run `nix-bundle` from within a nix build
+derivation, as it requires access to the nix store and that can't be done (or at
+least I'm not on that level yet). So instead we'll have to settle for defining
+the `nix-bundle` binary in nix and then using a `Makefile` to call it.
+
+Defining a `nix-bundle` expression is easy enough:
+
+```
+    nixBundleSrc = pkgs.fetchFromGitHub {
+        owner = "matthewbauer";
+        repo = "nix-bundle";
+        rev = "8e396533ef8f3e8a769037476824d668409b4a74";
+        sha256 = "1lrq0990p07av42xz203w64abv2rz9xd8jrzxyvzzwj7vjj7qwyw";
+    };
+
+    nixBundle = (import "${nixBundleSrc}/release.nix") {
+        nixpkgs' = pkgs;
+    };
+```
+
+Then the Makefile:
+
+```make
+bundle:
+	nix-build markov.nix -A nixBundle
+	./result/bin/nix-bundle '((import ./markov.nix) {}).entrypoint' '/bin/markov'
+```
+
+Now all a developer needs to do to rebuild the project is run `make` within the
+directory, provided nix is set up. The result will be a deterministically
+built, static binary, encompassing multiple processes which will all work
+together behind the scenes. This static binary can be copied to any linux
+machine and run there without any further installation steps.
+
+How neat is that!
+
+The final source files used for this project can be found below:
+
+* [markov.nix](/assets/markov/markov.nix.html)
+* [Makefile](/assets/markov/Makefile.html)
+
+[nix]: https://nixos.org/manual/nix/stable/
+[nix-bundle]: https://github.com/matthewbauer/nix-bundle
+[docker-compose]: https://docs.docker.com/compose/
+[appimage]: https://appimage.org/
+[circus]: https://circus.readthedocs.io/en/latest/
+[markov]: https://github.com/mediocregopher/markov
+[markov-chain]: https://en.wikipedia.org/wiki/Markov_chain
+[golang]: https://golang.org/
+[redis]: https://redis.io/
--- a/src/assets/markov/Makefile
+++ b/src/assets/markov/Makefile
@ -0,0 +1,3 @@
+bundle:
+	nix-build markov.nix -A nixBundle
+	./result/bin/nix-bundle '((import ./markov.nix) {}).entrypoint' '/bin/markov'
--- a/src/assets/markov/Makefile.md
+++ b/src/assets/markov/Makefile.md
@ -0,0 +1,6 @@
+---
+layout: code
+include: Makefile
+lang: make
+---
+
--- a/src/assets/markov/markov.nix
+++ b/src/assets/markov/markov.nix
@ -0,0 +1,63 @@
+{
+    pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/d50923ab2d308a1ddb21594ba6ae064cab65d8ae.tar.gz") {}
+}:
+
+rec {
+
+    markov = pkgs.buildGoModule {
+        pname = "markov";
+        version = "618b666484566de71f2d59114d011ff4621cf375";
+        src = pkgs.fetchFromGitHub {
+            owner = "mediocregopher";
+            repo = "markov";
+            rev = "618b666484566de71f2d59114d011ff4621cf375";
+            sha256 = "1sx9dr1q3vr3q8nyx3965x6259iyl85591vx815g1xacygv4i4fg";
+        };
+        vendorSha256 = "048wygrmv26fsnypsp6vxf89z3j0gs9f1w4i63khx7h134yxhbc6";
+    };
+
+    entrypoint = pkgs.writeScriptBin "markov" ''
+        #!${pkgs.stdenv.shell}
+
+        # On every run we create new, temporary, configuration files for redis and
+        # circus. To do this we create a new config directory.
+        markovCfgDir=$(${pkgs.coreutils}/bin/mktemp -d)
+        echo "generating configuration to $markovCfgDir"
+
+        ${pkgs.coreutils}/bin/cat >$markovCfgDir/redis.conf <<EOF
+        save ""
+        dir "''${MARKOV_DATA_DIR:-$(pwd)}"
+        appendonly yes
+        appendfilename "markov.data"
+        EOF
+
+        ${pkgs.coreutils}/bin/cat >$markovCfgDir/circus.ini <<EOF
+
+        [circus]
+
+        [watcher:markov]
+        cmd = ${markov}/bin/markov \
+            -listenAddr ''${MARKOV_LISTEN_ADDR:-localhost:8000} \
+            -timeout ''${MARKOV_TIMEOUT:-720}
+        numprocesses = 1
+
+        [watcher:redis]
+        cmd = ${pkgs.redis}/bin/redis-server $markovCfgDir/redis.conf
+        numprocesses = 1
+        EOF
+
+        exec ${pkgs.circus}/bin/circusd $markovCfgDir/circus.ini
+    '';
+
+    nixBundleSrc = pkgs.fetchFromGitHub {
+        owner = "matthewbauer";
+        repo = "nix-bundle";
+        rev = "8e396533ef8f3e8a769037476824d668409b4a74";
+        sha256 = "1lrq0990p07av42xz203w64abv2rz9xd8jrzxyvzzwj7vjj7qwyw";
+    };
+
+    nixBundle = (import "${nixBundleSrc}/release.nix") {
+        nixpkgs' = pkgs;
+    };
+}
+
--- a/src/assets/markov/markov.nix.md
+++ b/src/assets/markov/markov.nix.md
@ -0,0 +1,6 @@
+---
+layout: code
+include: markov.nix
+lang: plain
+---
+