---
title: >-
    Program Structure and Composability
description: >-
    Discussing the nature of program structure, the problems presented by
    complex structures, and a pattern that helps in solving those problems.
---

## Part 0: Introduction

This post is focused on a concept I call “program structure,” which I will try
to shed some light on before discussing complex program structures. I will then
discuss why complex structures can be problematic to deal with, and will finally
discuss a pattern for dealing with those problems.

My background is as a backend engineer working on large projects that have had
many moving parts; most had multiple programs interacting with each other, used
many different databases in various contexts, and faced large amounts of load
from millions of users. Most of this post will be framed from my perspective,
and will present problems in the way I have experienced them. I believe,
however, that the concepts and problems I discuss here are applicable to many
other domains, and I hope those with a foot in both backend systems and a second
domain can help to translate the ideas between the two.

Also note that I will be using Go as my example language, but none of the
concepts discussed here are specific to Go. To that end, I’ve decided to favor
readable code over “correct” code, and so have elided things that most gophers
hold near-and-dear, such as error checking and proper documentation, in order to
make the code as accessible as possible to non-gophers as well. As before,
I trust that someone with a foot in Go and another language can help me
translate between the two.

## Part 1: Program Structure

In this section I will discuss the difference between directory and program
structure, show how global state is antithetical to compartmentalization (and
therefore good program structure), and finally discuss a more effective way to
think about program structure.

### Directory Structure

For a long time, I thought about program structure in terms of the hierarchy
present in the filesystem. In my mind, a program’s structure looked like this:

```
// The directory structure of a project called gobdns.
src/
    config/
    dns/
    http/
    ips/
    persist/
    repl/
    snapshot/
    main.go
```

What I grew to learn was that this conflation of “program structure” with
“directory structure” is ultimately unhelpful. While it can’t be denied that
every program has a directory structure (and if not, it ought to), this does not
mean that the way the program looks in a filesystem in any way corresponds to
how it looks in our mind’s eye.

The most notable way to show this is to consider a library package. Here is the
structure of a simple web-app which uses redis (my favorite database) as a
backend:

```
src/
    redis/
    http/
    main.go
```

If I were to ask you, based on that directory structure, what the program does
in the most abstract terms, you might say something like: “The program
establishes an http server that listens for requests. It also establishes a
connection to the redis server. The program then interacts with redis in
different ways based on the http requests that are received on the server.”

And that would be a good guess. Here’s a diagram that depicts the program
structure, wherein the root node, `main.go`, takes in requests from `http` and
processes them using `redis`.

{% include image.html
    dir="program-structure" file="diag1.jpg" width=519
    descr="Example 1"
    %}

This is certainly a viable guess for how a program with that directory
structure operates, but consider another answer: “A component of the program
called `server` establishes an http server that listens for requests. `server`
also establishes a connection to a redis server. `server` then interacts with
that redis connection in different ways based on the http requests that are
received on the http server. Additionally, `server` tracks statistics about
these interactions and makes them available to other components. The root
component of the program establishes a connection to a second redis server, and
stores those statistics in that redis server.” Here’s another diagram to depict
_that_ program.

{% include image.html
    dir="program-structure" file="diag2.jpg" width=712
    descr="Example 2"
    %}

The directory structure could apply to either description; `redis` is just a
library which allows for interaction with a redis server, but it doesn’t
specify _which_ or _how many_ servers. However, those are extremely important
factors that are definitely reflected in our concept of the program’s
structure, and not in the directory structure. **What the directory structure
reflects are the different _kinds_ of components available to use, but it does
not reflect how a program will use those components.**

### Global State vs Compartmentalization

The directory-centric view of structure often leads to the use of global
singletons to manage access to external resources like RPC servers and
databases. In examples 1 and 2 the `redis` library might contain code which
looks something like this:

```go
// A mapping of connection names to redis connections.
var globalConns = map[string]*RedisConn{}

func Get(name string) *RedisConn {
	if globalConns[name] == nil {
		globalConns[name] = makeRedisConnection(name)
	}
	return globalConns[name]
}
```

Even though this pattern would work, it breaks with our conception of the
program structure in more complex cases like example 2. Rather than the `redis`
component being owned by the `server` component, which actually uses it, it
would be practically owned by _all_ components, since all are able to use it.
Compartmentalization has been broken, and can only be held together through
sheer human discipline.

**This is the problem with all global state. It is shareable among all
components of a program, and so is accountable to none of them.** One must look
at an entire codebase to understand how a globally held component is used,
which might not even be possible for a large codebase. Therefore, the
maintainers of these shared components rely entirely on the discipline of their
fellow coders when making changes, usually discovering where that discipline
broke down once the changes have been pushed live.

Global state also makes it easier for disparate programs/components to share
datastores for completely unrelated tasks. In example 2, rather than creating a
new redis instance for the root component’s statistics storage, the coder might
have instead said, “well, there’s already a redis instance available, I’ll just
use that.” And so, compartmentalization would have been broken further. Perhaps
the two instances _could_ be coalesced into the same instance for the sake of
resource efficiency, but that decision would be better made at runtime via the
configuration of the program, rather than being hardcoded in the source.

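To make the alternative concrete, here is a minimal sketch of components that
each own their connection instead of reaching into a global map. The `Conn`,
`Server`, and `Stats` types are made up for illustration and stand in for a real
redis connection and its users; they are not part of any library.

```go
package main

import "fmt"

// Conn is a hypothetical stand-in for a redis connection; it only remembers
// the address it was created for.
type Conn struct{ Addr string }

// Server owns the connection it was constructed with. No other component can
// reach that connection unless Server chooses to hand it out.
type Server struct{ conn *Conn }

// NewServer creates a Server with its own, private connection.
func NewServer(addr string) *Server { return &Server{conn: &Conn{Addr: addr}} }

// Stats owns a separate connection used only for statistics storage.
type Stats struct{ conn *Conn }

// NewStats creates a Stats with its own, private connection.
func NewStats(addr string) *Stats { return &Stats{conn: &Conn{Addr: addr}} }

func main() {
	// Whether these two addresses are the same or different is decided here,
	// by whoever configures the program, not inside a shared library.
	srv := NewServer("127.0.0.1:6379")
	stats := NewStats("127.0.0.1:6380")
	fmt.Println(srv.conn.Addr, stats.conn.Addr)
}
```

Even if both addresses were pointed at the same redis instance, each component
would still hold its own connection, so ownership stays visible in the code.
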
From the perspective of team management, global state-based patterns do nothing
except slow teams down. The person/team responsible for maintaining the central
library in which shared components live (`redis`, in the above examples)
becomes the bottleneck for creating new instances for new components, which
will further lead to re-using existing instances rather than creating new ones,
further breaking compartmentalization. Additionally, the person/team responsible
for the central library, rather than the team using it, often ends up
maintaining the shared resource as well.

### Component Structure

So what does proper program structure look like? In my mind the structure of a
program is a hierarchy of components, or, in other words, a tree. The leaf
nodes of the tree are almost _always_ IO-related components, e.g., database
connections, RPC server frameworks or clients, message queue consumers, etc.
The non-leaf nodes will _generally_ be components that bring together the
functionalities of their children in some useful way, though they may also have
some IO functionality of their own.

Let's look at an even more complex structure, still only using the `redis` and
`http` component types:

{% include image.html
    dir="program-structure" file="diag3.jpg" width=729
    descr="Example 3"
    %}

This component structure contains the addition of the `debug` component.
Clearly the `http` and `redis` components are reusable in different contexts,
but for this example the `debug` component is as well. It creates a separate
http server that can be queried to perform runtime debugging of the program,
and can be tacked onto virtually any program. The `rest-api` component is
specific to this program and is therefore not reusable. Let’s dive into it a
bit to see how it might be implemented:

```go
// RestAPI is very much not thread-safe, hopefully it doesn't have to handle
// more than one request at once.
type RestAPI struct {
	redisConn *redis.RedisConn
	httpSrv   *http.Server

	// Statistics exported for other components to see
	RequestCount    int
	FooRequestCount int
	BarRequestCount int
}

func NewRestAPI() *RestAPI {
	r := new(RestAPI)
	r.redisConn = redis.NewConn("127.0.0.1:6379")

	// mux will route requests to different handlers based on their URL path.
	mux := http.NewServeMux()
	mux.HandleFunc("/foo", r.fooHandler)
	mux.HandleFunc("/bar", r.barHandler)
	r.httpSrv = http.NewServer(mux)

	// Listen for requests and serve them in the background.
	go r.httpSrv.Listen(":8000")

	return r
}

func (r *RestAPI) fooHandler(rw http.ResponseWriter, req *http.Request) {
	r.redisConn.Command("INCR", "fooKey")
	r.RequestCount++
	r.FooRequestCount++
}

func (r *RestAPI) barHandler(rw http.ResponseWriter, req *http.Request) {
	r.redisConn.Command("INCR", "barKey")
	r.RequestCount++
	r.BarRequestCount++
}
```

In that snippet `rest-api` coalesced `http` and `redis` into a simple REST-like
api using pre-made library components. `main.go`, the root component, does much
the same:

```go
func main() {
	// Create debug server and start listening in the background
	debugSrv := debug.NewServer()

	// Set up the RestAPI, this will automatically start listening
	restAPI := NewRestAPI()

	// Create another redis connection and use it to store statistics
	statsRedisConn := redis.NewConn("127.0.0.1:6380")
	for {
		time.Sleep(1 * time.Second)
		statsRedisConn.Command("SET", "numReqs", restAPI.RequestCount)
		statsRedisConn.Command("SET", "numFooReqs", restAPI.FooRequestCount)
		statsRedisConn.Command("SET", "numBarReqs", restAPI.BarRequestCount)
	}
}
```

One thing that is clearly missing in this program is proper configuration,
whether from the command line, environment variables, etc. As it stands, all
configuration parameters, such as the redis addresses and http listen
addresses, are hardcoded. Proper configuration actually ends up being somewhat
difficult, as the ideal case would be for each component to set up its own
configuration variables without its parent needing to be aware. For example,
`redis` could set up `addr` and `pool-size` parameters. The problem is that there
are two `redis` components in the program, and their parameters would therefore
conflict with each other. An elegant solution to this problem is discussed in
the next section.

## Part 2: Components, Configuration, and Runtime

The key to the configuration problem is to recognize that, even if there are
two of the same component in a program, they can’t occupy the same place in the
program’s structure. In the above example, there are two `http` components: one
under `rest-api` and the other under `debug`. Because the structure is
represented as a tree of components, the “path” of any node in the tree
uniquely represents it in the structure. For example, the two `http` components
in the previous example have these paths:

```
root -> rest-api -> http
root -> debug -> http
```

If each component were to know its place in the component tree, then it would
easily be able to ensure that its configuration and initialization didn’t
conflict with other components of the same type. If the `http` component sets
up a command-line parameter to know what address to listen on, the two `http`
components in that program would set up:

```
--rest-api-listen-addr
--debug-listen-addr
```

So how can we enable each component to know its path in the component structure?
To answer this, we’ll have to take a detour through a type called `Component`.

### Component and Configuration

The `Component` type is a made-up type (though you’ll be able to find an
implementation of it at the end of this post). It has a single primary purpose,
and that is to convey the program’s structure to new components.

To see how this is done, let's look at a couple of `Component`'s methods:

```go
// Package mcmp

// New returns a new Component which has no parents or children. It is therefore
// the root component of a component hierarchy.
func New() *Component

// Child returns a new child of the called upon Component.
func (*Component) Child(name string) *Component

// Path returns the Component's path in the component hierarchy. It will return
// an empty slice if the Component is the root component.
func (*Component) Path() []string
```

`Child` is used to create a new `Component`, corresponding to a new child node
in the component structure, and `Path` is used to retrieve the path of any
`Component` within that structure. For the sake of keeping the examples simple,
let’s pretend these functions have been implemented in a package called `mcmp`.
Here’s an example of how `Component` might be used in the `redis` component’s
code:

```go
// Package redis

func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn {
	cmp = cmp.Child("redis")
	paramPrefix := strings.Join(cmp.Path(), "-")

	addrParam := flag.String(paramPrefix+"-addr", defaultAddr, "Address of redis instance to connect to")
	// finish setup

	return redisConn
}
```

In our above example, the two `redis` components' parameters would be:

```
// This first parameter is for the stats redis, whose parent is the root and
// therefore doesn't have a prefix. Perhaps stats should be broken into its own
// component in order to fix this.
--redis-addr
--rest-api-redis-addr
```

`Component` definitely makes it easier to instantiate multiple redis components
in our program, since it allows them to know their place in the component
structure.

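For the curious, the three methods above need very little machinery. Here is a
minimal sketch, assuming a `Component` only has to remember its own path; the
`paramName` helper is hypothetical and exists only to show the prefixing, and
the real implementation linked at the end of this post does considerably more.

```go
package main

import (
	"fmt"
	"strings"
)

// Component is a minimal, hypothetical version of the type described above:
// a node that remembers the names leading from the root down to itself.
type Component struct {
	path []string
}

// New returns a root Component, whose path is empty.
func New() *Component { return &Component{} }

// Child returns a new Component whose path is the parent's path with name
// appended. The parent's slice is copied so siblings never share storage.
func (c *Component) Child(name string) *Component {
	path := make([]string, 0, len(c.path)+1)
	path = append(path, c.path...)
	path = append(path, name)
	return &Component{path: path}
}

// Path returns a copy of the Component's path; it is empty for the root.
func (c *Component) Path() []string {
	return append([]string{}, c.path...)
}

// paramName builds a parameter name such as "rest-api-redis-addr".
func paramName(c *Component, name string) string {
	return strings.Join(append(c.Path(), name), "-")
}

func main() {
	root := New()
	statsRedis := root.Child("redis")
	apiRedis := root.Child("rest-api").Child("redis")
	fmt.Println("--" + paramName(statsRedis, "addr")) // prints "--redis-addr"
	fmt.Println("--" + paramName(apiRedis, "addr"))   // prints "--rest-api-redis-addr"
}
```
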
Having to construct the prefix for the parameters ourselves is pretty annoying,
so let’s introduce a new package, `mcfg`, which acts like `flag` but is aware
of `Component`. Then `redis.NewConn` is reduced to:

```go
// Package redis

func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn {
	cmp = cmp.Child("redis")
	addrParam := mcfg.String(cmp, "addr", defaultAddr, "Address of redis instance to connect to")
	// finish setup

	return redisConn
}
```

Easy-peasy.

#### But What About Parse?

Sharp-eyed gophers will notice that there is a key piece missing: When is
`flag.Parse`, or its `mcfg` counterpart, called? When does `addrParam` actually
get populated? It can’t happen inside `redis.NewConn` because there might be
other components after `redis.NewConn` that want to set up parameters. To
illustrate the problem, let’s look at a simple program that wants to set up two
`redis` components:

```go
func main() {
	// Create the root Component, an empty Component.
	cmp := mcmp.New()

	// Create the Components for two sub-components, foo and bar.
	cmpFoo := cmp.Child("foo")
	cmpBar := cmp.Child("bar")

	// Now we want to try to create a redis sub-component for each component.

	// This will set up the parameter "--foo-redis-addr", but bar hasn't had a
	// chance to set up its corresponding parameter, so the command-line can't
	// be parsed yet.
	fooRedis := redis.NewConn(cmpFoo, "127.0.0.1:6379")

	// This will set up the parameter "--bar-redis-addr", but, as mentioned
	// before, redis.NewConn can't parse the command-line.
	barRedis := redis.NewConn(cmpBar, "127.0.0.1:6379")

	// It is only after all components have been instantiated that the
	// command-line arguments can be parsed
	mcfg.Parse(cmp)
}
```

While this solves our argument parsing problem, `fooRedis` and `barRedis` are
not usable yet because the actual connections have not been made. This is a
classic chicken-and-egg problem. The func `redis.NewConn` needs to make a
connection, which it cannot do until _after_ `mcfg.Parse` is called, but
`mcfg.Parse` cannot be called until after `redis.NewConn` has returned. We will
solve this problem in the next section.

### Instantiation vs Initialization

Let’s break down `redis.NewConn` into two phases: instantiation and
initialization. Instantiation refers to creating the component on the component
structure and having it declare what it needs in order to initialize (e.g.,
configuration parameters). During instantiation, nothing external to the
program happens; no IO, no reading of the command-line, no logging, etc.
All that’s happened is that the empty template of a `redis` component has been
created.

Initialization is the phase during which the template is filled in.
Configuration parameters are read, startup actions like the creation of database
connections are performed, and logging is output for informational and debugging
purposes.

The key to making effective use of this dichotomy is to allow _all_ components
to instantiate themselves before they initialize themselves. By doing this we
can ensure, for example, that all components have had the chance to declare
their configuration parameters before configuration parsing is done.

So let’s modify `redis.NewConn` so that it follows this dichotomy. It makes
sense to leave instantiation-related code where it is, but we need a mechanism
by which we can declare initialization code before actually calling it. For
this, I will introduce the idea of a “hook.”

#### But First: Augment Component

In order to support hooks, however, `Component` will need to be augmented with
a few new methods. Right now, it can only carry with it information about the
component structure, but here we will add the ability to carry arbitrary
key/value information as well:

```go
// Package mcmp

// SetValue sets the given key to the given value on the Component, overwriting
// any previous value for that key.
func (*Component) SetValue(key, value interface{})

// Value returns the value which has been set for the given key, or nil if the
// key was never set.
func (*Component) Value(key interface{}) interface{}

// Children returns the Component's children in the order they were created.
func (*Component) Children() []*Component
```

The final method allows us to, starting at the root `Component`, traverse the
component structure and interact with each `Component`’s key/value store. This
will be useful for implementing hooks.

#### Hooks

A hook is simply a function that will run later. We will declare a new package,
calling it `mrun`, and say that it has two new functions:

```go
// Package mrun

// InitHook registers the given hook to the given Component.
func InitHook(cmp *mcmp.Component, hook func())

// Init runs all hooks registered using InitHook. Hooks are run in the order
// they were registered.
func Init(cmp *mcmp.Component)
```

With these two functions, we are able to defer the initialization phase of
startup by using the same `Component`s we were passing around for the purpose
of denoting component structure.

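To show that nothing magical is going on, here is a rough sketch of how the two
`mrun` functions could be built from `SetValue`, `Value`, and `Children`. It is a
simplification under stated assumptions: the `Component` here is a cut-down
stand-in, `hookKey` is a made-up key type, and `Init` runs a component’s own
hooks before descending into its children, which is not necessarily the exact
ordering a fuller implementation would guarantee.

```go
package main

import "fmt"

// Component is a pared-down, hypothetical version of mcmp.Component that only
// tracks children and key/value pairs.
type Component struct {
	children []*Component
	values   map[interface{}]interface{}
}

// New returns an empty root Component.
func New() *Component { return &Component{values: map[interface{}]interface{}{}} }

// Child creates a new child; the name is ignored in this cut-down version.
func (c *Component) Child(name string) *Component {
	child := New()
	c.children = append(c.children, child)
	return child
}

func (c *Component) SetValue(key, value interface{})   { c.values[key] = value }
func (c *Component) Value(key interface{}) interface{} { return c.values[key] }
func (c *Component) Children() []*Component            { return c.children }

// hookKey is an unexported type, so no other package can collide with it.
type hookKey struct{}

// InitHook appends hook to the list of hooks stored on cmp.
func InitHook(cmp *Component, hook func()) {
	hooks, _ := cmp.Value(hookKey{}).([]func())
	cmp.SetValue(hookKey{}, append(hooks, hook))
}

// Init runs cmp's own hooks, then descends into its children in the order
// they were created.
func Init(cmp *Component) {
	hooks, _ := cmp.Value(hookKey{}).([]func())
	for _, hook := range hooks {
		hook()
	}
	for _, child := range cmp.Children() {
		Init(child)
	}
}

func main() {
	root := New()
	InitHook(root.Child("foo"), func() { fmt.Println("init foo") })
	InitHook(root.Child("bar"), func() { fmt.Println("init bar") })
	fmt.Println("instantiated; nothing initialized yet")
	Init(root)
}
```

Running it prints the "instantiated" line first, then `init foo` and `init bar`:
registering a hook does no work until `Init` walks the tree.
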
Now, with these few extra pieces of functionality in place, let’s reconsider the
most recent example, and make a program that creates two redis components which
exist independently of each other:

```go
// Package redis

// NOTE that NewConn has been renamed to InstConn, to reflect that the returned
// *RedisConn is merely instantiated, not initialized.

func InstConn(cmp *mcmp.Component, defaultAddr string) *RedisConn {
	cmp = cmp.Child("redis")

	// we instantiate an empty RedisConn instance and parameters for it. Neither
	// has been initialized yet. They will remain empty until initialization has
	// occurred.
	redisConn := new(RedisConn)
	addrParam := mcfg.String(cmp, "addr", defaultAddr, "Address of redis instance to connect to")

	mrun.InitHook(cmp, func() {
		// This hook will run after parameter initialization has happened, and
		// so addrParam will be usable. Once this hook has run, redisConn will be
		// usable as well.
		*redisConn = *makeRedisConnection(*addrParam)
	})

	// Now that cmp has had configuration parameters and initialization hooks
	// set into it, return the empty redisConn instance back to the parent.
	return redisConn
}
```

```go
// Package main

func main() {
	// Create the root Component, an empty Component.
	cmp := mcmp.New()

	// Create the Components for two sub-components, foo and bar.
	cmpFoo := cmp.Child("foo")
	cmpBar := cmp.Child("bar")

	// Add redis components to each of the foo and bar sub-components.
	redisFoo := redis.InstConn(cmpFoo, "127.0.0.1:6379")
	redisBar := redis.InstConn(cmpBar, "127.0.0.1:6379")

	// Parse will descend into the Component and all of its children,
	// discovering all registered configuration parameters and filling them from
	// the command-line.
	mcfg.Parse(cmp)

	// Now that configuration parameters have been initialized, run the Init
	// hooks for all Components.
	mrun.Init(cmp)

	// At this point the redis components have been fully initialized and may be
	// used. For this example we'll copy all keys from one to the other.
	keys := redisFoo.Command("KEYS", "*")
	for i := range keys {
		val := redisFoo.Command("GET", keys[i])
		redisBar.Command("SET", keys[i], val)
	}
}
```

## Conclusion

While the examples given here are fairly simplistic, the pattern itself is quite
powerful. Codebases naturally accumulate small, domain-specific behaviors and
optimizations over time, especially around the IO components of the program.
Databases are used with specific options that an organization finds useful,
logging is performed in particular places, metrics are counted around certain
pieces of code, etc.

By programming with component structure in mind, we are able to keep these
optimizations while also keeping the clarity and compartmentalization of the
code intact. We can keep our code flexible and configurable, while also keeping
it re-usable and testable. Also, the simplicity of the tools involved means they
can be extended and retrofitted for nearly any situation or use-case.

Overall, this is a powerful pattern that I’ve found myself unable to do without
once I began using it.

### Implementation

As a final note, you can find an example implementation of the packages
described in this post here:

* [mcmp](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcmp)
* [mcfg](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcfg)
* [mrun](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mrun)

The packages are not stable and are likely to change frequently. You’ll also
find that they have been extended quite a bit from the simple descriptions found
here, based on what I’ve found useful as I’ve implemented programs using
component structures. With these two points in mind, I would encourage you to
look through them and take whatever functionality you find useful for yourself,
rather than using the packages directly. The core pieces are not different from
what has been described in this post.