|
|
|
@ -6,12 +6,6 @@ description: >- |
|
|
|
|
complex structures, and a pattern which helps in solving those problems. |
|
|
|
|
--- |
|
|
|
|
|
|
|
|
|
TODO: |
|
|
|
|
* Double check if I'm using "I" or "We" everywhere (probably should use "I") |
|
|
|
|
* Part 2: Full Example |
|
|
|
|
* Standardize on "programs", not "apps" or "services" |
|
|
|
|
* Prefix all relevant code examples with a package name |
|
|
|
|
|
|
|
|
|
## Part 0: Introduction |
|
|
|
|
|
|
|
|
|
This post is focused on a concept I call "program structure", which I will try |
|
|
|
@ -20,7 +14,7 @@ discussing why complex structures can be problematic to deal with, and finally |
|
|
|
|
discussing a pattern for dealing with those problems. |
|
|
|
|
|
|
|
|
|
My background is as a backend engineer working on large projects that have had |
|
|
|
|
many moving parts; most had multiple services interacting with each other, using |
|
|
|
|
many moving parts; most had multiple programs interacting with each other, using |
|
|
|
|
many different databases in various contexts, and facing large amounts of load |
|
|
|
|
from millions of users. Most of this post will be framed from my perspective, |
|
|
|
|
and will present problems in the way I have experienced them. I believe, |
|
|
|
@ -31,10 +25,10 @@ domain can help to translate the ideas between the two. |
|
|
|
|
Also note that I will be using Go as my example language, but none of the |
|
|
|
|
concepts discussed here are specific to Go. To that end, I've decided to favor |
|
|
|
|
readable code over "correct" code, and so have elided things that most gophers |
|
|
|
|
hold near-and-dear, such as error checking and comments on all public types, in |
|
|
|
|
order to make the code as accessible as possible to non-gophers as well. As with |
|
|
|
|
before, I trust someone with a foot in Go and another language can |
|
|
|
|
help me translate between the two. |
|
|
|
|
hold near-and-dear, such as error checking and proper documentation, in order to |
|
|
|
|
make the code as accessible as possible to non-gophers as well. As with before, |
|
|
|
|
I trust someone with a foot in Go and another language can help me |
|
|
|
|
translate between the two. |
|
|
|
|
|
|
|
|
|
## Part 1: Program Structure |
|
|
|
|
|
|
|
|
@ -62,7 +56,7 @@ src/ |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
What I grew to learn was that this conflation of "program structure" with |
|
|
|
|
"directory structure" is ultimately unhelpful. While I won't deny that every |
|
|
|
|
"directory structure" is ultimately unhelpful. While it can't be denied that every |
|
|
|
|
program has a directory structure (and if not, it ought to), this does not mean |
|
|
|
|
that the way the program looks in a filesystem in any way corresponds to how it |
|
|
|
|
looks in our mind's eye. |
|
|
|
@ -80,27 +74,34 @@ src/ |
|
|
|
|
|
|
|
|
|
If I were to ask you, based on that directory structure, what the program does, |
|
|
|
|
in the most abstract terms, you might say something like: "The program |
|
|
|
|
establishes an http server which listens for requests, as well as a connection |
|
|
|
|
to the redis server. The program then interacts with redis in different ways, |
|
|
|
|
based on the http requests which are received on the server." |
|
|
|
|
establishes an http server which listens for requests. It also establishes a |
|
|
|
|
connection to the redis server. The program then interacts with redis in |
|
|
|
|
different ways, based on the http requests which are received on the server." |
|
|
|
|
|
|
|
|
|
And that would be a good guess. Here's a diagram which depicts the program |
|
|
|
|
structure, wherein the root node, `main.go`, takes in requests from `http` and |
|
|
|
|
processes them using `redis`. |
|
|
|
|
|
|
|
|
|
TODO diagram |
|
|
|
|
{% include image.html |
|
|
|
|
dir="program-structure" file="diag1.jpg" width=519 |
|
|
|
|
descr="Example 1" |
|
|
|
|
%} |
|
|
|
|
|
|
|
|
|
This is certainly a viable guess for how a program with that directory structure |
|
|
|
|
operates, but consider another: "A component of the program called `server` |
|
|
|
|
establishes an http server which listens for requests, as well as a connection |
|
|
|
|
to a redis server. `server` then interacts with that redis connection in |
|
|
|
|
different ways, based on the http requests which are received on the http |
|
|
|
|
server. Additionally, `server` tracks statistics about these interactions and |
|
|
|
|
makes them available to other components. The root component of the program |
|
|
|
|
establishes a connection to a second redis server, and stores those statistics |
|
|
|
|
in that redis server." |
|
|
|
|
|
|
|
|
|
TODO diagram |
|
|
|
|
operates, but consider another answer: "A component of the program called |
|
|
|
|
`server` establishes an http server which listens for requests. `server` also |
|
|
|
|
establishes a connection to a redis server. `server` then interacts with that |
|
|
|
|
redis connection in different ways, based on the http requests which are |
|
|
|
|
received on the http server. Additionally, `server` tracks statistics about |
|
|
|
|
these interactions and makes them available to other components. The root |
|
|
|
|
component of the program establishes a connection to a second redis server, and |
|
|
|
|
stores those statistics in that redis server." Here's another diagram to depict |
|
|
|
|
_that_ program. |
|
|
|
|
|
|
|
|
|
{% include image.html |
|
|
|
|
dir="program-structure" file="diag2.jpg" width=712 |
|
|
|
|
descr="Example 2" |
|
|
|
|
%} |
|
|
|
|
|
|
|
|
|
The directory structure could apply to either description; `redis` is just a |
|
|
|
|
library which allows for interacting with a redis server, but it doesn't specify |
|
|
|
@ -112,9 +113,9 @@ program will use those components.** |
|
|
|
|
|
|
|
|
|
### Global State vs Compartmentalization |
|
|
|
|
|
|
|
|
|
The directory-centric approach to structure often leads to the use of global |
|
|
|
|
The directory-centric view of structure often leads to the use of global |
|
|
|
|
singletons to manage access to external resources like RPC servers and |
|
|
|
|
databases. In the above example the `redis` library might contain code which |
|
|
|
|
databases. In examples 1 and 2 the `redis` library might contain code which |
|
|
|
|
looks something like: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
@ -130,33 +131,34 @@ func Get(name string) *RedisConn { |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
Even though this pattern would work, it breaks with our conception of the |
|
|
|
|
program structure in the more complex case shown above. Rather than having the |
|
|
|
|
`server` component own the redis server it uses, the root component would be the |
|
|
|
|
owner of it, and `server` would be borrowing it. Compartmentalization has been |
|
|
|
|
broken, and can only be held together through sheer human discipline. |
|
|
|
|
|
|
|
|
|
This is the problem with all global state. It's shareable amongst all components |
|
|
|
|
of a program, and so is owned by none of them. One must look at an entire |
|
|
|
|
codebase to understand how a globally held component is used, which might not |
|
|
|
|
even be possible for a large codebase. And so the maintainers of these shared |
|
|
|
|
components rely entirely on the discipline of their fellow coders when making |
|
|
|
|
changes, usually discovering where that discipline broke down once the changes |
|
|
|
|
have been pushed live. |
|
|
|
|
|
|
|
|
|
Global state also makes it easier for disparate services/components to share |
|
|
|
|
datastores for completely unrelated tasks. In the above example, rather than |
|
|
|
|
creating a new redis instance for the root component's statistics storage, the |
|
|
|
|
coder might have instead said "well, there's already a redis instance available, |
|
|
|
|
I'll just use that." And so compartmentalization would have been broken further. |
|
|
|
|
Perhaps the two instances _could_ be coalesced into the same one, for the sake |
|
|
|
|
of resource efficiency, but that decision would be better made at runtime via |
|
|
|
|
the configuration of the program, rather than being hardcoded into the code. |
|
|
|
|
program structure in more complex cases like example 2. Rather than the |
|
|
|
|
`redis` component being owned by the `server` component, which actually uses it, |
|
|
|
|
it would be practically owned by _all_ components, since all are able to use it. |
|
|
|
|
Compartmentalization has been broken, and can only be held together through |
|
|
|
|
sheer human discipline. |
|
|
|
|
|
|
|
|
|
**This is the problem with all global state. It's shareable amongst all components |
|
|
|
|
of a program, and so is accountable to none of them.** One must look at an |
|
|
|
|
entire codebase to understand how a globally held component is used, which might |
|
|
|
|
not even be possible for a large codebase. And so the maintainers of these |
|
|
|
|
shared components rely entirely on the discipline of their fellow coders when |
|
|
|
|
making changes, usually discovering where that discipline broke down once the |
|
|
|
|
changes have been pushed live. |
|
|
|
|
|
|
|
|
|
Global state also makes it easier for disparate programs/components to share |
|
|
|
|
datastores for completely unrelated tasks. In example 2, rather than creating a |
|
|
|
|
new redis instance for the root component's statistics storage, the coder might |
|
|
|
|
have instead said "well, there's already a redis instance available, I'll just |
|
|
|
|
use that." And so compartmentalization would have been broken further. Perhaps |
|
|
|
|
the two instances _could_ be coalesced into the same one, for the sake of |
|
|
|
|
resource efficiency, but that decision would be better made at runtime via the |
|
|
|
|
configuration of the program, rather than being hardcoded into the code. |
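To make the contrast with the global singleton concrete, here is a minimal sketch of the compartmentalized alternative. All of the names in it (`RedisConn`, `Server`, `Root`, `NewServer`, `NewRoot`) are hypothetical stand-ins invented for illustration, not the API of any real library, and no actual connection is made. Each component creates and holds the connection it uses, and whether the two addresses point at the same redis instance is left to whoever supplies the arguments.

```go
package main

import "fmt"

// RedisConn is a hypothetical stand-in for a real redis connection; it only
// remembers the address it would connect to.
type RedisConn struct{ Addr string }

// Server is the component which serves requests. It creates, and therefore
// owns, the redis connection it uses.
type Server struct{ redis *RedisConn }

// NewServer creates a Server along with its own redis connection.
func NewServer(redisAddr string) *Server {
	return &Server{redis: &RedisConn{Addr: redisAddr}}
}

// Root is the root component. It owns a separate connection for statistics.
type Root struct {
	server     *Server
	statsRedis *RedisConn
}

// NewRoot wires the program together. Both addresses are plain arguments, so
// pointing them at the same redis instance is a configuration decision rather
// than something hardcoded into a shared package.
func NewRoot(serverRedisAddr, statsRedisAddr string) *Root {
	return &Root{
		server:     NewServer(serverRedisAddr),
		statsRedis: &RedisConn{Addr: statsRedisAddr},
	}
}

func main() {
	root := NewRoot("127.0.0.1:6379", "127.0.0.1:6380")
	fmt.Println(root.server.redis.Addr, root.statsRedis.Addr)
}
```

Nothing outside of `Server` can reach its connection here, so understanding how that connection is used only requires reading `Server`.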
|
|
|
|
|
|
|
|
|
From the perspective of team management, global state-based patterns do nothing |
|
|
|
|
except slow teams down. The person/team responsible for maintaining the central |
|
|
|
|
library which holds all the shared resources (`redis`, in the above example) |
|
|
|
|
becomes the bottleneck for creating new instances for new components, which will |
|
|
|
|
further lead to re-using existing instances rather than create new ones, further |
|
|
|
|
library in which shared components live (`redis`, in the above examples) becomes |
|
|
|
|
the bottleneck for creating new instances for new components, which will further |
|
|
|
|
lead to re-using existing instances rather than creating new ones, further |
|
|
|
|
breaking compartmentalization. The person/team responsible for the central |
|
|
|
|
library often finds themselves as the maintainers of the shared resource as |
|
|
|
|
well, rather than the team actually using it. |
|
|
|
@ -174,16 +176,10 @@ some IO functionality of their own. |
|
|
|
|
Let's look at an even more complex structure, still only using the `redis` and |
|
|
|
|
`http` component types: |
|
|
|
|
|
|
|
|
|
TODO diagram: |
|
|
|
|
``` |
|
|
|
|
root |
|
|
|
|
rest-api |
|
|
|
|
redis |
|
|
|
|
http |
|
|
|
|
redis // for stats keeping |
|
|
|
|
debug |
|
|
|
|
http |
|
|
|
|
``` |
|
|
|
|
{% include image.html |
|
|
|
|
dir="program-structure" file="diag3.jpg" width=729 |
|
|
|
|
descr="Example 3" |
|
|
|
|
%} |
|
|
|
|
|
|
|
|
|
This component structure contains the addition of the `debug` component. Clearly |
|
|
|
|
the `http` and `redis` components are reusable in different contexts, but for |
|
|
|
@ -212,8 +208,8 @@ func NewRestAPI() *RestAPI { |
|
|
|
|
|
|
|
|
|
// mux will route requests to different handlers based on their URL path. |
|
|
|
|
mux := http.NewServeMux() |
|
|
|
|
mux.Handle("/foo", http.HandlerFunc(r.fooHandler)) |
|
|
|
|
mux.Handle("/bar", http.HandlerFunc(r.barHandler)) |
|
|
|
|
mux.HandleFunc("/foo", r.fooHandler) |
|
|
|
|
mux.HandleFunc("/bar", r.barHandler) |
|
|
|
|
r.httpSrv := http.NewServer(mux) |
|
|
|
|
|
|
|
|
|
// Listen for requests and serve them in the background. |
|
|
|
@ -235,9 +231,9 @@ func (r *RestAPI) barHandler(rw http.ResponseWriter, r *http.Request) { |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
As can be seen, `rest-api` coalesces `http` and `redis` into a simple REST api, |
|
|
|
|
using pre-made library components. `main.go`, the root component, does much the |
|
|
|
|
same: |
|
|
|
|
As can be seen, `rest-api` coalesces `http` and `redis` into a simple REST-like |
|
|
|
|
api, using pre-made library components. `main.go`, the root component, does much |
|
|
|
|
the same: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
func main() { |
|
|
|
@ -262,14 +258,14 @@ One thing which is clearly missing in this program is proper configuration, |
|
|
|
|
whether from the command-line, environment variables, etc. As it stands, all |
|
|
|
|
configuration parameters, such as the redis addresses and http listen addresses, |
|
|
|
|
are hardcoded. Proper configuration actually ends up being somewhat difficult, |
|
|
|
|
as the ideal case would be for each component to set up the configuration |
|
|
|
|
variables of itself, without its parent needing to be aware. For example, |
|
|
|
|
`redis` could set up `addr` and `pool-size` parameters. The problem is that |
|
|
|
|
there are two `redis` components in the program, and their parameters would |
|
|
|
|
therefore conflict with each other. An elegant solution to this problem is |
|
|
|
|
discussed in the next section. |
|
|
|
|
as the ideal case would be for each component to set up its own configuration |
|
|
|
|
variables, without its parent needing to be aware. For example, `redis` could |
|
|
|
|
set up `addr` and `pool-size` parameters. The problem is that there are two |
|
|
|
|
`redis` components in the program, and their parameters would therefore conflict |
|
|
|
|
with each other. An elegant solution to this problem is discussed in the next |
|
|
|
|
section. |
|
|
|
|
|
|
|
|
|
## Part 2: Context, Configuration, and Runtime |
|
|
|
|
## Part 2: Components, Configuration, and Runtime |
|
|
|
|
|
|
|
|
|
The key to the configuration problem is to recognize that, even if there are two |
|
|
|
|
of the same component in a program, they can't occupy the same place in the |
|
|
|
@ -296,54 +292,45 @@ components in that program would set up: |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
So how can we enable each component to know its path in the component structure? |
|
|
|
|
To answer this we'll have to take a detour through go's `Context` type. |
|
|
|
|
To answer this we'll have to take a detour through a type called `Component`. |
|
|
|
|
|
|
|
|
|
### Context and Configuration |
|
|
|
|
### Component and Configuration |
|
|
|
|
|
|
|
|
|
As I mentioned in the Introduction, my example language in this post is Go, but |
|
|
|
|
there's nothing about the concepts I'm presenting which are specific to Go. To |
|
|
|
|
put it simply, Go's builtin `context` package implements a type called |
|
|
|
|
`context.Context` which is, for all intents and purposes, an immutable key/value |
|
|
|
|
store. This means that when you set a key to a value on a Context (using the |
|
|
|
|
`context.WithValue` function) a new Context is returned. The new Context |
|
|
|
|
contains all of the original's key/values, plus the one just set. The original |
|
|
|
|
remains untouched. |
|
|
|
|
The `Component` type is a made-up type (though you'll be able to find an |
|
|
|
|
implementation of it at the end of this post). It has a single primary purpose, |
|
|
|
|
and that is to convey the program's structure to new components. |
|
|
|
|
|
|
|
|
|
(Go's Context also has some behavior built into it surrounding deadlines and |
|
|
|
|
process cancellation, but those aren't relevant for this discussion.) |
|
|
|
|
|
|
|
|
|
Context makes sense to use for carrying information about the program's |
|
|
|
|
structure to its different components; it is informing each of what _context_ |
|
|
|
|
it exists in within the larger structure. To use Context effectively, however, |
|
|
|
|
it is necessary to implement some helper functions. Here are their function |
|
|
|
|
signatures: |
|
|
|
|
To see how this is done, let's look at a couple of `Component`'s methods: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package mctx |
|
|
|
|
// Package mcmp |
|
|
|
|
|
|
|
|
|
// New returns a new Component which has no parents or children. It is therefore |
|
|
|
|
// the root component of a component hierarchy. |
|
|
|
|
func New() *Component |
|
|
|
|
|
|
|
|
|
// NewChild creates and returns a new Context based off of the parent one. The |
|
|
|
|
// child will have a path which is the parent's path appended with the given |
|
|
|
|
// name. |
|
|
|
|
func NewChild(parent context.Context, name string) context.Context |
|
|
|
|
// Child returns a new child of the called upon Component. |
|
|
|
|
func (*Component) Child(name string) *Component |
|
|
|
|
|
|
|
|
|
// Path returns the sequence of names which were used to produce this Context |
|
|
|
|
// via calls to the NewChild function. |
|
|
|
|
func Path(ctx context.Context) []string |
|
|
|
|
// Path returns the Component's path in the component hierarchy. It will return |
|
|
|
|
// an empty slice if the Component is the root component. |
|
|
|
|
func (*Component) Path() []string |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
`NewChild` is used to create a new Context, corresponding to a new child node in |
|
|
|
|
the component structure, and `Path` is used to retrieve the path of any Context |
|
|
|
|
within that structure. For the sake of keeping the examples simple let's pretend |
|
|
|
|
these functions have been implemented in a package called `mctx`. Here's an |
|
|
|
|
example of how `mctx` might be used in the `redis` component's code: |
|
|
|
|
|
|
|
|
|
`Child` is used to create a new `Component`, corresponding to a new child node |
|
|
|
|
in the component structure, and `Path` is used to retrieve the path of any |
|
|
|
|
`Component` within that structure. For the sake of keeping the examples simple |
|
|
|
|
let's pretend these functions have been implemented in a package called `mcmp`. |
|
|
|
|
Here's an example of how `Component` might be used in the `redis` component's |
|
|
|
|
code: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package redis |
|
|
|
|
|
|
|
|
|
func NewConn(ctx context.Context, defaultAddr string) *RedisConn { |
|
|
|
|
ctx = mctx.NewChild(ctx, "redis") |
|
|
|
|
ctxPath := mctx.Path(ctx) |
|
|
|
|
paramPrefix := strings.Join(ctxPath, "-") |
|
|
|
|
func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { |
|
|
|
|
cmp = cmp.Child("redis") |
|
|
|
|
paramPrefix := strings.Join(cmp.Path(), "-") |
|
|
|
|
|
|
|
|
|
addrParam := flag.String(paramPrefix+"-addr", defaultAddr, "Address of redis instance to connect to") |
|
|
|
|
// finish setup |
|
|
|
@ -355,59 +342,69 @@ func NewConn(ctx context.Context, defaultAddr string) *RedisConn { |
|
|
|
|
In our above example, the two `redis` components' parameters would be: |
|
|
|
|
|
|
|
|
|
``` |
|
|
|
|
// This first parameter is for stats redis, whose parent is the root and |
|
|
|
|
// This first parameter is for the stats redis, whose parent is the root and |
|
|
|
|
// therefore doesn't have a prefix. Perhaps stats should be broken into its own |
|
|
|
|
// component in order to fix this. |
|
|
|
|
--redis-addr |
|
|
|
|
--rest-api-redis-addr |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
The prefix joining stuff will probably get annoying after a while though, so |
|
|
|
|
let's invent a new package, `mcfg`, which acts like `flag` but is aware of |
|
|
|
|
`mctx`. Then `redis.NewConn` is reduced to: |
|
|
|
|
`Component` definitely makes it easier to instantiate multiple redis components |
|
|
|
|
in our program, since it allows them to know their place in the component |
|
|
|
|
structure. |
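As a rough illustration of how a path turns into those parameter names, here is a minimal, self-contained sketch. The `Component` below is a simplified stand-in written only for this example, not the real implementation, and `paramName` is a hypothetical helper which does not belong to any package described in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// Component is a hypothetical, minimal stand-in for the mcmp.Component type
// described above. It only tracks its path in the component structure.
type Component struct {
	path []string
}

// New returns a root Component, whose path is empty.
func New() *Component { return &Component{} }

// Child returns a new Component whose path is the parent's path with name
// appended. The parent's path is copied so siblings never share memory.
func (c *Component) Child(name string) *Component {
	path := make([]string, 0, len(c.path)+1)
	path = append(path, c.path...)
	path = append(path, name)
	return &Component{path: path}
}

// Path returns the Component's path; it is empty for the root.
func (c *Component) Path() []string { return c.path }

// paramName is a hypothetical helper which builds the command-line parameter
// name for a Component by joining its path and the given name with dashes.
func paramName(c *Component, name string) string {
	parts := append(append([]string{}, c.Path()...), name)
	return "--" + strings.Join(parts, "-")
}

func main() {
	root := New()
	statsRedis := root.Child("redis")
	apiRedis := root.Child("rest-api").Child("redis")

	fmt.Println(paramName(statsRedis, "addr"))
	fmt.Println(paramName(apiRedis, "addr"))
}
```

The two `redis` components end up with different parameter names purely because they sit at different places in the structure; neither needs to know the other exists.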
|
|
|
|
|
|
|
|
|
Having to construct the prefix for the parameters ourselves is pretty annoying |
|
|
|
|
though, so let's introduce a new package, `mcfg`, which acts like `flag` but is |
|
|
|
|
aware of `Component`. Then `redis.NewConn` is reduced to: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package redis |
|
|
|
|
|
|
|
|
|
func NewConn(ctx context.Context, defaultAddr string) *RedisConn { |
|
|
|
|
ctx = mctx.NewChild(ctx, "redis") |
|
|
|
|
addrParam := flag.String(ctx, "-addr", defaultAddr, "Address of redis instance to connect to") |
|
|
|
|
func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { |
|
|
|
|
cmp = cmp.Child("redis") |
|
|
|
|
addrParam := mcfg.String(cmp, "-addr", defaultAddr, "Address of redis instance to connect to") |
|
|
|
|
// finish setup |
|
|
|
|
|
|
|
|
|
return redisConn |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
Easy-peasy. |
|
|
|
|
|
|
|
|
|
#### But What About Parse? |
|
|
|
|
|
|
|
|
|
Sharp-eyed gophers will notice that there's a key piece missing: When is |
|
|
|
|
`mcfg.Parse` called? When does `addrParam` actually get populated? Because you |
|
|
|
|
can't create the redis connection until that happens, but that can't happen |
|
|
|
|
inside `redis.NewConn` because there might be other things after `redis.NewConn` |
|
|
|
|
which want to set up parameters. To illustrate the problem, let's look at a |
|
|
|
|
simple program which wants to set up two `redis` components: |
|
|
|
|
`flag.Parse`, or its `mcfg` counterpart, called? When does `addrParam` actually |
|
|
|
|
get populated? You can't use the redis connection until that happens, but that |
|
|
|
|
can't happen inside `redis.NewConn` because there might be other components |
|
|
|
|
after `redis.NewConn` which want to set up parameters. To illustrate the |
|
|
|
|
problem, let's look at a simple program which wants to set up two `redis` |
|
|
|
|
components: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
func main() { |
|
|
|
|
// Create the root context, an empty Context. |
|
|
|
|
ctx := context.Background() |
|
|
|
|
// Create the root Component, an empty Component. |
|
|
|
|
cmp := mcmp.New() |
|
|
|
|
|
|
|
|
|
// Create the Contexts for two sub-components, foo and bar. |
|
|
|
|
ctxFoo := mctx.NewChild(ctx, "foo") |
|
|
|
|
ctxBar := mctx.NewChild(ctx, "bar") |
|
|
|
|
// Create the Components for two sub-components, foo and bar. |
|
|
|
|
cmpFoo := cmp.Child("foo") |
|
|
|
|
cmpBar := cmp.Child("bar") |
|
|
|
|
|
|
|
|
|
// Now we want to try to create a redis sub-component for each component. |
|
|
|
|
|
|
|
|
|
// This will set up the parameter "--foo-redis-addr", but bar hasn't had a |
|
|
|
|
// chance to set up its corresponding parameter, so the command-line can't |
|
|
|
|
// be parsed yet. |
|
|
|
|
fooRedis := redis.NewConn(ctxFoo, "127.0.0.1:6379") |
|
|
|
|
fooRedis := redis.NewConn(cmpFoo, "127.0.0.1:6379") |
|
|
|
|
|
|
|
|
|
// This will set up the parameter "--bar-redis-addr", but, as mentioned |
|
|
|
|
// before, redis.NewConn can't parse the command-line. |
|
|
|
|
barRedis := redis.NewConn(ctxBar, "127.0.0.1:6379") |
|
|
|
|
barRedis := redis.NewConn(cmpBar, "127.0.0.1:6379") |
|
|
|
|
|
|
|
|
|
// If the command-line is parsed here, then how can fooRedis and barRedis |
|
|
|
|
// have been created yet? Creating the redis connection depends on the addr |
|
|
|
|
// parameters having already been parsed and filled. |
|
|
|
|
// have been created yet? It's only _after_ this point that `fooRedis` and |
|
|
|
|
// `barRedis` could possibly be usable. |
|
|
|
|
mcfg.Parse() |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
@ -417,14 +414,14 @@ We will solve this problem in the next section. |
|
|
|
|
|
|
|
|
|
Let's break down `redis.NewConn` into two phases: instantiation and initialization. |
|
|
|
|
Instantiation refers to creating the component on the component structure and |
|
|
|
|
having it declare what it needs in order to initialize. After instantiation |
|
|
|
|
nothing external to the program has been done; no IO, no reading of the |
|
|
|
|
command-line, no logging, etc... All that's happened is that the empty shell of |
|
|
|
|
a `redis` component has been created. |
|
|
|
|
having it declare what it needs in order to initialize (e.g. configuration |
|
|
|
|
parameters). During instantiation nothing external to the program is performed; |
|
|
|
|
no IO, no reading of the command-line, no logging, etc. All that's happened is |
|
|
|
|
that the empty template of a `redis` component has been created. |
|
|
|
|
|
|
|
|
|
Initialization is the phase when that shell is filled. Configuration parameters |
|
|
|
|
are read, startup actions like the creation of database connections are |
|
|
|
|
performed, and logging is output for informational and debugging purposes. |
|
|
|
|
Initialization is the phase when that template is filled in. Configuration |
|
|
|
|
parameters are read, startup actions like the creation of database connections |
|
|
|
|
are performed, and logging is output for informational and debugging purposes. |
|
|
|
|
|
|
|
|
|
The key to making effective use of this dichotomy is to allow _all_ components |
|
|
|
|
to instantiate themselves before they initialize themselves. By doing this we |
|
|
|
@ -436,41 +433,52 @@ sense to leave instantiation related code where it is, but we need a mechanism |
|
|
|
|
by which we can declare initialization code before actually calling it. For |
|
|
|
|
this, I will introduce the idea of a "hook". |
|
|
|
|
|
|
|
|
|
A hook is simply a function which will run later. We will declare a new |
|
|
|
|
package, calling it `mrun`, and say that it has two new functions: |
|
|
|
|
#### But First: Augment Component |
|
|
|
|
|
|
|
|
|
In order to support hooks, however, `Component` will need to be augmented with |
|
|
|
|
a few new methods. Right now it can only carry with it information about the |
|
|
|
|
component structure, but here we will add the ability to carry arbitrary |
|
|
|
|
key/value information as well: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package mrun |
|
|
|
|
// Package mcmp |
|
|
|
|
|
|
|
|
|
// WithInitHook returns a new Context based off the passed in one, with the // |
|
|
|
|
given hook registered to it. |
|
|
|
|
func WithInitHook(ctx context.Context, hook func()) context.Context |
|
|
|
|
// SetValue sets the given key to the given value on the Component, overwriting |
|
|
|
|
// any previous value for that key. |
|
|
|
|
func (*Component) SetValue(key, value interface{}) |
|
|
|
|
|
|
|
|
|
// Init runs all hooks registered using WithInitHook. Hooks are run in the order |
|
|
|
|
// they were registered. |
|
|
|
|
func Init(ctx context.Context) |
|
|
|
|
// Value returns the value which has been set for the given key, or nil if the |
|
|
|
|
// key was never set. |
|
|
|
|
func (*Component) Value(key interface{}) interface{} |
|
|
|
|
|
|
|
|
|
// Children returns the Component's children in the order they were created. |
|
|
|
|
func (*Component) Children() []*Component |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
With these two functions we are able to defer the initialization phase of |
|
|
|
|
startup by using the same Contexts we were passing around for the purpose of |
|
|
|
|
denoting component structure. One thing to note is that, since hooks are being |
|
|
|
|
registered onto Contexts within the component instantiation code, the parent |
|
|
|
|
Context will not know about these hooks. Therefore it is necessary to add the |
|
|
|
|
child component's Context back into the parent. To do this we add two final |
|
|
|
|
functions to the `mctx` package: |
|
|
|
|
The final method allows us to, starting at the root `Component`, traverse the |
|
|
|
|
component structure, interacting with each `Component`'s key/value store. This |
|
|
|
|
will be useful for implementing hooks. |
|
|
|
|
|
|
|
|
|
#### Hooks |
|
|
|
|
|
|
|
|
|
A hook is simply a function which will run later. We will declare a new |
|
|
|
|
package, calling it `mrun`, and say that it has two new functions: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package mctx |
|
|
|
|
// Package mrun |
|
|
|
|
|
|
|
|
|
// WithChild returns a copy of the parent with the child added to it. Children |
|
|
|
|
// of a Context can be retrieved using the Children function. |
|
|
|
|
func WithChild(parent, child context.Context) context.Context |
|
|
|
|
// InitHook registers the given hook to the given Component. |
|
|
|
|
func InitHook(cmp *mcmp.Component, hook func()) |
|
|
|
|
|
|
|
|
|
// Children returns all child Contexts which have been added to the given one |
|
|
|
|
// using WithChild, in the order they were added. |
|
|
|
|
func Children(ctx context.Context) []context.Context |
|
|
|
|
// Init runs all hooks registered using InitHook. Hooks are run in the order |
|
|
|
|
// they were registered. |
|
|
|
|
func Init(cmp *mcmp.Component) |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
With these two functions we are able to defer the initialization phase of |
|
|
|
|
startup by using the same `Component`s we were passing around for the purpose of |
|
|
|
|
denoting component structure. |
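For the curious, here is one way these two functions might be built on top of `Component`'s key/value store and `Children` method. This is only a sketch under assumptions, not the real implementation: the `Component` here is a simplified stand-in, `hookKey` and `initOrder` are hypothetical names, and the traversal order (a component's own hooks first, then its children in creation order) is an assumption which only approximates "the order they were registered".

```go
package main

import "fmt"

// Component is a hypothetical, minimal stand-in for mcmp.Component. It holds
// its children and an arbitrary key/value store.
type Component struct {
	children []*Component
	values   map[interface{}]interface{}
}

// New returns a Component with no children and an empty key/value store.
func New() *Component {
	return &Component{values: map[interface{}]interface{}{}}
}

// Child creates a new child of the Component and returns it. The name is
// ignored here because this sketch doesn't track paths.
func (c *Component) Child(name string) *Component {
	child := New()
	c.children = append(c.children, child)
	return child
}

func (c *Component) SetValue(key, value interface{})   { c.values[key] = value }
func (c *Component) Value(key interface{}) interface{} { return c.values[key] }
func (c *Component) Children() []*Component            { return c.children }

// hookKey is the private key under which a Component's init hooks are stored.
type hookKey struct{}

// InitHook appends hook to the hooks already registered on cmp.
func InitHook(cmp *Component, hook func()) {
	hooks, _ := cmp.Value(hookKey{}).([]func())
	cmp.SetValue(hookKey{}, append(hooks, hook))
}

// Init runs cmp's own hooks, then descends into its children in the order
// they were created.
func Init(cmp *Component) {
	hooks, _ := cmp.Value(hookKey{}).([]func())
	for _, hook := range hooks {
		hook()
	}
	for _, child := range cmp.Children() {
		Init(child)
	}
}

// initOrder instantiates two redis-like components, registering a hook on
// each, and returns the order in which Init ran those hooks.
func initOrder() []string {
	var order []string
	root := New()
	for _, name := range []string{"foo", "bar"} {
		name := name
		redisCmp := root.Child(name).Child("redis")
		InitHook(redisCmp, func() { order = append(order, name+"-redis") })
	}
	// Up to this point nothing has run; hooks have only been declared.
	Init(root)
	return order
}

func main() {
	fmt.Println(initOrder())
}
```

Storing the hooks in the key/value store keeps `Component` itself unaware of hooks; any other package could attach its own information to the structure in the same way.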
|
|
|
|
|
|
|
|
|
Now, with these few extra pieces of functionality in place, let's reconsider the |
|
|
|
|
most recent example, and make a program which creates two redis components which |
|
|
|
|
exist independently of each other: |
|
|
|
@ -478,61 +486,54 @@ exist independently of each other: |
|
|
|
|
```go |
|
|
|
|
// Package redis |
|
|
|
|
|
|
|
|
|
// NOTE that NewConn has been renamed to WithConn, to reflect that the given |
|
|
|
|
// Context is being returned _with_ a redis component added to it. |
|
|
|
|
// NOTE that NewConn has been renamed to InstConn, to reflect that the returned |
|
|
|
|
// *RedisConn is merely instantiated, not initialized. |
|
|
|
|
|
|
|
|
|
func WithConn(parent context.Context, defaultAddr string) (context.Context, *RedisConn) { |
|
|
|
|
ctx = mctx.NewChild(parent, "redis") |
|
|
|
|
func InstConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { |
|
|
|
|
cmp = cmp.Child("redis") |
|
|
|
|
|
|
|
|
|
// we instantiate an empty RedisConn instance and parameters for it. Neither |
|
|
|
|
// has been initialized yet. They will remain empty until initialization has |
|
|
|
|
// occurred. |
|
|
|
|
redisConn := new(RedisConn) |
|
|
|
|
addrParam := flag.String(ctx, "-addr", defaultAddr, "Address of redis instance to connect to") |
|
|
|
|
addrParam := mcfg.String(cmp, "-addr", defaultAddr, "Address of redis instance to connect to") |
|
|
|
|
|
|
|
|
|
ctx = mrun.WithInitHook(ctx, func() { |
|
|
|
|
mrun.InitHook(cmp, func() { |
|
|
|
|
// This hook will run after parameter initialization has happened, and |
|
|
|
|
// so addrParam will be usable. redisConn will be usable after this hook |
|
|
|
|
// has run as well. |
|
|
|
|
// so addrParam will be usable. Once this hook has run, redisConn will be |
|
|
|
|
// usable as well. |
|
|
|
|
*redisConn = makeRedisConnection(*addrParam) |
|
|
|
|
}) |
|
|
|
|
|
|
|
|
|
// Now that ctx has had configuration parameters and initialization hooks |
|
|
|
|
// instantiated into it, return both it and the empty redisConn instance |
|
|
|
|
// back to the parent. |
|
|
|
|
return mctx.WithChild(parent, ctx), redisConn |
|
|
|
|
// Now that cmp has had configuration parameters and initialization hooks |
|
|
|
|
// set into it, return the empty redisConn instance back to the parent. |
|
|
|
|
return redisConn |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
//////////////////////////////////////////////////////////////////////////////// |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package main |
|
|
|
|
|
|
|
|
|
func main() { |
|
|
|
|
// Create the root context, an empty Context. |
|
|
|
|
ctx := context.Background() |
|
|
|
|
|
|
|
|
|
// Create the Contexts for two sub-components, foo and bar. |
|
|
|
|
ctxFoo := mctx.NewChild(ctx, "foo") |
|
|
|
|
ctxBar := mctx.NewChild(ctx, "bar") |
|
|
|
|
// Create the root Component, an empty Component. |
|
|
|
|
cmp := mcmp.New() |
|
|
|
|
|
|
|
|
|
// Add redis components to each of the foo and bar sub-components. The |
|
|
|
|
// returned Contexts will be used to initialize the redis components. |
|
|
|
|
ctxFoo, redisFoo := redis.WithConn(ctxFoo, "127.0.0.1:6379") |
|
|
|
|
ctxBar, redisBar := redis.WithConn(ctxBar, "127.0.0.1:6379") |
|
|
|
|
// Create the Components for two sub-components, foo and bar. |
|
|
|
|
cmpFoo := cmp.Child("foo") |
|
|
|
|
cmpBar := cmp.Child("bar") |
|
|
|
|
|
|
|
|
|
// Add the sub-component contexts back to the root, so they can all be |
|
|
|
|
// initialized at once. |
|
|
|
|
ctx = mctx.WithChild(ctx, ctxFoo) |
|
|
|
|
ctx = mctx.WithChild(ctx, ctxBar) |
|
|
|
|
// Add redis components to each of the foo and bar sub-components. |
|
|
|
|
redisFoo := redis.InstConn(cmpFoo, "127.0.0.1:6379") |
|
|
|
|
redisBar := redis.InstConn(cmpBar, "127.0.0.1:6379") |
|
|
|
|
|
|
|
|
|
// Parse will descend into the Context and all of its children, discovering |
|
|
|
|
// all registered configuration parameters and filling them from the |
|
|
|
|
// command-line. |
|
|
|
|
mcfg.Parse(ctx) |
|
|
|
|
// Parse will descend into the Component and all of its children, |
|
|
|
|
// discovering all registered configuration parameters and filling them from |
|
|
|
|
// the command-line. |
|
|
|
|
mcfg.Parse(cmp) |
|
|
|
|
|
|
|
|
|
// Now that configuration has been initialized, run the Init hooks for each |
|
|
|
|
// of the sub-components. |
|
|
|
|
mrun.Init(ctx) |
|
|
|
|
// Now that configuration parameters have been initialized, run the Init |
|
|
|
|
// hooks for all Components. |
|
|
|
|
mrun.Init(cmp) |
|
|
|
|
|
|
|
|
|
// At this point the redis components have been fully initialized and may be |
|
|
|
|
// used. For this example we'll copy all keys from one to the other. |
|
|
|
@ -544,117 +545,37 @@ func main() { |
|
|
|
|
} |
|
|
|
|
``` |
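
To make the `Parse` and `Init` comments above more concrete, here is a minimal sketch of how a tree of components might collect hooks and then run them all in one pass. The `component` type and the `child`, `onInit`, and `initAll` names are assumptions made up for this sketch; they are not the `mcmp` or `mrun` API. The ordering (a component's own hooks first, then its children in creation order) is likewise just one possible choice.

```go
package main

import "fmt"

// component is a toy stand-in for a node in the component tree. It is a
// hypothetical type for illustration only.
type component struct {
	name      string
	children  []*component
	initHooks []func()
}

// child creates a new sub-component and attaches it to c.
func (c *component) child(name string) *component {
	ch := &component{name: name}
	c.children = append(c.children, ch)
	return ch
}

// onInit registers a hook to be run later by initAll.
func (c *component) onInit(hook func()) {
	c.initHooks = append(c.initHooks, hook)
}

// initAll runs c's own hooks, then descends into each child in the order the
// children were created.
func initAll(c *component) {
	for _, hook := range c.initHooks {
		hook()
	}
	for _, ch := range c.children {
		initAll(ch)
	}
}

func main() {
	root := &component{}
	foo := root.child("foo")
	bar := root.child("bar")

	// Each sub-component registers its own hook; nothing runs yet.
	var order []string
	foo.onInit(func() { order = append(order, foo.name) })
	bar.onInit(func() { order = append(order, bar.name) })

	// A single call on the root initializes the whole tree.
	initAll(root)
	fmt.Println(order) // prints "[foo bar]"
}
```

The point of the sketch is that the parent never needs to know what its children registered; it only needs to hold on to the root and call one function.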
|
|
|
|
|
|
|
|
|
### Full example |
|
|
|
|
|
|
|
|
|
TODO |
|
|
|
|
|
|
|
|
|
## Part 3: Annotations, Logging, and Errors |
|
|
|
|
|
|
|
|
|
Let's shift gears away from the component structure for a bit, and talk about a |
|
|
|
|
separate, but related, set of issues: those related to logging and errors. |
|
|
|
|
|
|
|
|
|
Both logging and error creation share the same problem, that of collecting as |
|
|
|
|
much contextual information around an event as possible. This is often done |
|
|
|
|
through string formatting, like so: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// ServeHTTP implements the http.Handler method and is used to serve App's HTTP |
|
|
|
|
// endpoints. |
|
|
|
|
func (app *App) ServeHTTP(rw http.ResponseWriter, r *http.Request) { |
|
|
|
|
log.Printf("incoming request from remoteAddr:%s for url:%s", r.RemoteAddr, r.URL.String()) |
|
|
|
|
|
|
|
|
|
// begin actual request handling |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
In this example the code is logging an event, an incoming HTTP request, and |
|
|
|
|
including contextual information in that log about the remote address of the |
|
|
|
|
requester and the URL being requested. |
|
|
|
|
|
|
|
|
|
Similarly, an error might be created like this: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
func (app *App) GetUsername(userID int) (string, error) { |
|
|
|
|
userName, err := app.Redis.Command("GET", userID) |
|
|
|
|
if err != nil { |
|
|
|
|
return "", fmt.Errorf("could not get username for userID:%d: %s", userID, err) |
|
|
|
|
} |
|
|
|
|
return userName, nil |
|
|
|
|
} |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
In that example, when redis returns an error, the error is extended to include |
|
|
|
|
contextual information about what was being attempted (`could not get
|
|
|
|
username`) and the userID involved. With some Go error packages, and indeed in many
|
|
|
|
other programming languages, the error will also include information about where |
|
|
|
|
in the source code it occurred, such as file name and line number. |
|
|
|
|
|
|
|
|
|
It is my experience that both logging and error creation often take up an |
|
|
|
|
inordinate amount of space in many programs. This is due to a desire to |
|
|
|
|
contextualize as much as possible, since in a large program it can be difficult |
|
|
|
|
to tell exactly where something is happening, even if you're looking at the log |
|
|
|
|
entry or error. For example, if a program has a set of HTTP endpoints, each one |
|
|
|
|
performing a redis call, what good is it to see the log entry `redis command had |
|
|
|
|
an error: took too long` without also knowing which command is involved, and |
|
|
|
|
which endpoint is calling it? Very little. |
|
|
|
|
|
|
|
|
|
Many programs end up looking like this: |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
func (app *App) httpEndpointA(rw http.ResponseWriter, r *http.Request) { |
|
|
|
|
err := app.Redis.Command("SET", "foo", "bar") |
|
|
|
|
if err != nil { |
|
|
|
|
log.Printf("redis error occurred in EndpointA, calling SET: %s", err) |
|
|
|
|
} |
|
|
|
|
} |
|
|
|
|
|
|
|
|
|
func (app *App) httpEndpointB(rw http.ResponseWriter, r *http.Request) { |
|
|
|
|
err := app.Redis.Command("INCR", "baz") |
|
|
|
|
if err != nil { |
|
|
|
|
log.Printf("redis error occurred in EndpointB, calling INCR: %s", err)
|
|
|
|
} |
|
|
|
|
} |
|
|
|
|
|
|
|
|
|
// etc... |
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
Obviously logging is taking up the majority of the code-space in those examples, |
|
|
|
|
and that doesn't even include potentially pertinent information such as IP |
|
|
|
|
address, or log entries for non-error events. |
|
|
|
|
|
|
|
|
|
Another aspect of the logging/error dichotomy is that they are often dealing in
|
|
|
|
essentially the same data. This makes sense, as both are really dealing with the |
|
|
|
|
same thing: capturing context for the purpose of later debugging. So rather than |
|
|
|
|
formatting strings by hand for each use-case, let's instead use our friend, |
|
|
|
|
`context.Context`, to carry the data for us. |
|
|
|
|
## Conclusion |
|
|
|
|
|
|
|
|
|
While the examples given here are fairly simplistic, the pattern itself is quite |
|
|
|
|
powerful. Codebases naturally accumulate small, domain-specific behaviors and
|
|
|
|
optimizations over time, especially around the IO components of the program. |
|
|
|
|
Databases are used with specific options that an organization finds useful, |
|
|
|
|
logging is performed in particular places, metrics are counted around certain |
|
|
|
|
pieces of code, etc... |
|
|
|
|
|
|
|
|
|
### Annotations |
|
|
|
|
By programming with component structure in mind we are able to keep these |
|
|
|
|
optimizations while also keeping the clarity and compartmentalization of the |
|
|
|
|
code intact. We are able to keep our code flexible and configurable, while also
|
|
|
|
re-usable and testable. And the simplicity of the tools involved means it can be |
|
|
|
|
extended and retrofitted for nearly any situation or use-case. |
|
|
|
|
|
|
|
|
|
I will here introduce the idea of "annotations", which are essentially key/value |
|
|
|
|
pairs which can be attached to a Context and retrieved later. To implement |
|
|
|
|
annotations I will introduce two new functions to the `mctx` package: |
|
|
|
|
Overall, it's a powerful pattern that I've found myself unable to do without |
|
|
|
|
once I began using it. |
|
|
|
|
|
|
|
|
|
```go |
|
|
|
|
// Package mctx |
|
|
|
|
|
|
|
|
|
// Annotate returns a new Context with the given key/value pairs embedded into |
|
|
|
|
// it, which can be later retrieved using the Annotations method. If any keys |
|
|
|
|
// conflict with previous annotations, their values will overwrite the |
|
|
|
|
// previously annotated values for those keys. |
|
|
|
|
func Annotate(ctx context.Context, keyvals ...interface{}) context.Context |
|
|
|
|
|
|
|
|
|
// Annotations returns all annotations which have been set on the Context using |
|
|
|
|
// Annotate. |
|
|
|
|
func Annotations(ctx context.Context) map[interface{}]interface{} |
|
|
|
|
``` |
|
|
|
|
### Implementation |
|
|
|
|
|
|
|
|
|
### Aside: Structural vs Runtime Contexts |
|
|
|
|
As a final note, you can find an example implementation of the packages |
|
|
|
|
described in this post here: |
|
|
|
|
|
|
|
|
|
It may seem strange that we're about to use Contexts for a use-case that's |
|
|
|
|
completely different than the one discussed in Part 1, and I've been asked |
|
|
|
|
before if perhaps that doesn't indicate the two should be separated into |
|
|
|
|
separate entities: a structural context type which behaves as shown in Part 1, |
|
|
|
|
and a runtime context type whose behavior we've just looked at. |
|
|
|
|
* [mcmp](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcmp) |
|
|
|
|
* [mcfg](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcfg) |
|
|
|
|
* [mrun](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mrun) |
|
|
|
|
|
|
|
|
|
I think this is a compelling idea... |
|
|
|
|
The packages are not stable and are likely to change frequently. You'll also |
|
|
|
|
find that they have been extended quite a bit from the simple descriptions found |
|
|
|
|
here, based on what I've found useful as I've implemented programs using |
|
|
|
|
component structures. With these two points in mind, I would encourage you to |
|
|
|
|
look in and take whatever functionality you find useful for yourself, and not |
|
|
|
|
use the packages directly. The core pieces are not different from what has been |
|
|
|
|
described in this post. |
|
|
|
|