---
title: >-
    Component Oriented Programming
description: >-
    A concise description of component oriented programming.
---
[A previous post in this
blog](2019-08-02-program-structure-and-composability.html) focused on a
framework developed to make designing component-based programs easier. In
retrospect, the pattern/framework proposed was over-engineered. This post
attempts to present the same ideas in a more distilled form, as a simple
programming pattern and without the unnecessary framework.
## Components
Many languages, libraries, and patterns make use of a concept called
"component", but in each case the meaning of "component" might be slightly
different. Therefore to begin talking about components we must first describe
specifically what is meant by "component" in this post.
For the purposes of this post, properties of components include:
 1... **Abstract**: A component is an interface consisting of one or more
methods. Being an interface, a component may have one or more implementations,
but generally will have a primary implementation, which is used during a
program's runtime, and secondary "mock" implementations, which are only used
when testing other components.
   1a... A function might be considered a single-method
component if the language supports first-class functions.
 2... **Creatable**: An instance of a component, given some defined set of
parameters, can be created independently of any other instance of that or any
other component.
 3... **Composable**: A component may be used as a parameter of another
component's instantiation. This would make it a child component of the one being
instantiated (i.e. the parent).
 4... **Isolated**: A component may not use mutable global variables (i.e.
singletons) or impure global functions (e.g. system calls). It may only use
constants and variables/components given to it during instantiation.
 5... **Ephemeral**: A component may have a specific method used to clean
up all resources that it's holding (e.g. network connections, file handles,
language-specific lightweight threads, etc).
   5a... This cleanup method should _not_ clean up any child
components given as instantiation parameters.
   5b... This cleanup method should not return until the
component's cleanup is complete.
Components are composed together to create programs. This is done by passing
components as parameters to other components during instantiation. The `main`
process of the program is responsible for instantiating and composing most, if
not all, components in the program.
A component oriented program is one which primarily, if not entirely, uses
components for its functionality. Components generally have the quality of being
able to interact with code written in other patterns without any toes being
stepped on.
## Example
Let's start with an example: suppose a program is desired which accepts a string
over stdin, hashes it, then writes the string to a file whose name is the hash.
A naive implementation of this program in Go might look like:
```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"io"
	"io/ioutil"
	"os"
)

func hashFileWriter() error {
	h := sha1.New()
	r := io.TeeReader(os.Stdin, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return err
	}
	fileName := hex.EncodeToString(h.Sum(nil))
	if err := ioutil.WriteFile(fileName, body, 0644); err != nil {
		return err
	}
	return nil
}

func main() {
	if err := hashFileWriter(); err != nil {
		panic(err) // consider the error handled
	}
}
```
Notice that there's not a clear separation here between different components;
`hashFileWriter` _might_ be considered a one method component, except that it
breaks component property 4, which says that a component may not use mutable
global variables (`os.Stdin`) or impure global functions (`ioutil.WriteFile`).
Notice also that testing the program would require integration tests, and could
not be unit tested (because there are no units, i.e. components). For a trivial
program like this one, writing unit and integration tests would be redundant, but
for larger programs it may not be. Unit tests are important because they are
fast to run, (usually) easy to formulate, and yield consistent results.
This program could instead be written as being composed of three components:
* `stdin`: a construct given by the runtime which outputs a stream of bytes.
* `disk`: accepts a file name and file contents as input, writes the file
contents to a file of the given name, and potentially returns an error back.
* `hashFileWriter`: reads a stream of bytes from `stdin`, collects the stream
into a string, hashes that string to generate a file name, and uses `disk` to
create a corresponding file with the string as its contents. If `disk` returns
an error then `hashFileWriter` returns that error.
Sprucing up our previous example to use these more clearly defined components
might look like:
```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"io/ioutil"
	"os"
)

// Disk defines the methods of the disk component.
type Disk interface {
	WriteFile(fileName string, fileContents []byte) error
}

// disk is the primary implementation of Disk. It implements the methods of
// Disk (WriteFile) by performing actual system calls.
type disk struct{}

func NewDisk() Disk { return disk{} }

func (disk) WriteFile(fileName string, fileContents []byte) error {
	return ioutil.WriteFile(fileName, fileContents, 0644)
}

func hashFileWriter(stdin io.Reader, disk Disk) error {
	h := sha1.New()
	r := io.TeeReader(stdin, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return fmt.Errorf("reading input: %w", err)
	}
	fileName := hex.EncodeToString(h.Sum(nil))
	if err := disk.WriteFile(fileName, body); err != nil {
		return fmt.Errorf("writing to file %q: %w", fileName, err)
	}
	return nil
}

func main() {
	if err := hashFileWriter(os.Stdin, NewDisk()); err != nil {
		panic(err) // consider the error handled
	}
}
```
`hashFileWriter` no longer directly uses `os.Stdin` and `ioutil.WriteFile`, but
instead takes in components wrapping them; `io.Reader` is a standard library
interface which `os.Stdin` already implements, and `Disk` is a simple interface
defined just for this program.
At first glance this would seem to have doubled the line-count for very little
gain. This is because we have not yet written tests.
## Testing
Testing is important. This post won't attempt to defend that statement; that's
for another time. Let's just accept it as true for now.
In the second form of the program we can test the core functionality of the
`hashFileWriter` component without resorting to using the actual `stdin` and
`disk` components. Instead we use mocks of those components. A mock component
implements the same input/outputs that the "real" component does, but in a way
which makes it possible to write tests of another component which don't reach
outside the process. These are unit tests.
Tests for the latest form of the program might look like this:
```go
package main

import (
	"strings"
	"testing"
)

// mockDisk implements the Disk interface. When WriteFile is called mockDisk
// will pretend to write the file, but instead will simply store what arguments
// WriteFile was called with.
type mockDisk struct {
	fileName     string
	fileContents []byte
}

func (d *mockDisk) WriteFile(fileName string, fileContents []byte) error {
	d.fileName = fileName
	d.fileContents = fileContents
	return nil
}

func TestHashFileWriter(t *testing.T) {
	type test struct {
		in          string
		expFileName string
		// expFileContents can be inferred from in
	}

	tests := []test{
		{
			in:          "",
			expFileName: "da39a3ee5e6b4b0d3255bfef95601890afd80709",
		},
		{
			in:          "hello",
			expFileName: "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d",
		},
		{
			in:          "hello\nworld", // make sure newlines don't break things
			expFileName: "7db827c10afc1719863502cf95397731b23b8bae",
		},
	}

	for _, test := range tests {
		// stdin is mocked via a strings.Reader, which outputs the string it was
		// initialized with as a stream of bytes.
		in := strings.NewReader(test.in)

		// Disk is mocked by mockDisk, go figure.
		disk := new(mockDisk)

		if err := hashFileWriter(in, disk); err != nil {
			t.Errorf("in:%q got err:%v", test.in, err)
		} else if string(disk.fileContents) != test.in {
			t.Errorf("in:%q got contents:%q", test.in, disk.fileContents)
		} else if disk.fileName != test.expFileName {
			t.Errorf("in:%q got fileName:%q", test.in, disk.fileName)
		}
	}
}
```
Notice that these tests do not _completely_ cover the desired functionality of
the program: if `disk` returns an error that error should be returned from
`hashFileWriter`, but this functionality is not tested. Whether or not this must
be tested as well, and indeed the pedantry level of tests overall, is a matter
of taste. I believe these tests to be sufficient.
## Configuration
Practically all programs require some level of runtime configuration. This may
take the form of command-line arguments, environment variables, configuration
files, etc. Almost all configuration methods will require some system call, and
so any component accessing configuration directly would likely break component
property 4.
Instead each component should take in whatever configuration parameters it needs
during instantiation, and let `main` handle collecting all configuration from
outside of the process and instantiating the components appropriately.
Let's take our previous program, but add in two new desired behaviors: first,
there should be a command-line parameter which allows for specifying the string
on the command-line, rather than reading from stdin, and second, there should be
a command-line parameter declaring which directory to write files into. The new
implementation looks like:
```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"flag"
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"path/filepath"
	"strings"
)

// Disk defines the methods of the disk component.
type Disk interface {
	WriteFile(fileName string, fileContents []byte) error
}

// disk is the concrete implementation of Disk. It implements the methods of
// Disk (WriteFile) by performing actual OS calls.
type disk struct {
	dir string
}

func NewDisk(dir string) Disk { return disk{dir: dir} }

func (d disk) WriteFile(fileName string, fileContents []byte) error {
	fileName = filepath.Join(d.dir, fileName)
	return ioutil.WriteFile(fileName, fileContents, 0644)
}

func hashFileWriter(in io.Reader, disk Disk) error {
	h := sha1.New()
	r := io.TeeReader(in, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return fmt.Errorf("reading input: %w", err)
	}
	fileName := hex.EncodeToString(h.Sum(nil))
	if err := disk.WriteFile(fileName, body); err != nil {
		return fmt.Errorf("writing to file %q: %w", fileName, err)
	}
	return nil
}

func main() {
	str := flag.String("str", "", "If set, hash and write this string instead of stdin")
	dir := flag.String("dir", ".", "Directory which files should be written to")
	flag.Parse()

	var in io.Reader
	if *str == "" {
		in = os.Stdin
	} else {
		in = strings.NewReader(*str)
	}

	disk := NewDisk(*dir)

	if err := hashFileWriter(in, disk); err != nil {
		panic(err) // consider the error handled
	}
}
```
Very little has changed, and in fact `hashFileWriter` was not touched at all,
meaning all unit tests remained valid.
## Setup/Runtime/Cleanup
A program can be split into three stages: setup, runtime, and cleanup. Setup
is the stage during which internal state is assembled in order to make runtime
possible. Runtime is the stage during which a program's actual function is being
performed. Cleanup is the stage during which runtime stops and internal state is
disassembled.
A graceful (i.e. reliably correct) setup is quite natural to accomplish, but
unfortunately a graceful cleanup is not a programmer's first concern (and
frequently is not a concern at all). However, when building reliable and correct
programs, a graceful cleanup is as important as a graceful setup and runtime. A
program is still running while it is being cleaned up, and it's possibly even
acting on the outside world still. Shouldn't it behave correctly during that
time?
Achieving a graceful setup and cleanup with components is quite simple:
During setup a single-threaded process (usually `main`) will construct the
"leaf" components (those which have no child components of their own) first,
then the components which take those leaves as parameters, then the components
which take _those_ as parameters, and so on, until all are constructed. The
components end up assembled into a directed acyclic graph (DAG).
In the previous examples our DAG looked like this:
```
           ---> stdin
          /
hashFileWriter
          \
           ---> disk
```
At this point the program will begin runtime.
Once runtime is over and it is time for the program to exit it's only necessary
to call each component's cleanup method(s) in the reverse of the order the
components were instantiated in. A component's cleanup method should not be
called until all of its parent components have been cleaned up.
Inherent to the pattern is the fact that each component will certainly be
cleaned up before any of its child components, since its child components must
have been instantiated first and a component will not clean up child components
given as parameters (as per component property 5a). Therefore the pattern avoids
use-after-cleanup situations.
With Go this pattern can be achieved easily using `defer`, but writing it out
manually is not so hard, as in this toy example:
```go
package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

// sleeper is a component which prints its children and sleeps when it's time to
// cleanup.
type sleeper struct {
	children []*sleeper
	toSleep  time.Duration

	// The builtin time.Sleep is an impure global function, a component can't
	// use it, so the component must be instantiated with it as a parameter.
	sleep func(time.Duration)

	// likewise os.Stdout is a global singleton, and so must also be a
	// parameter.
	stdout io.Writer
}

func (s *sleeper) print() {
	fmt.Fprintf(s.stdout, "I will sleep for %v\n", s.toSleep)
	for _, child := range s.children {
		child.print()
	}
}

func (s *sleeper) cleanup() {
	s.sleep(s.toSleep)
	fmt.Fprintf(s.stdout, "I slept for %v\n", s.toSleep)
}

func main() {
	// Within main we make a helper function to easily construct sleepers. For a
	// toy like this it's not worth the effort of giving sleeper a real
	// initialization function.
	newSleeper := func(toSleep time.Duration, children ...*sleeper) *sleeper {
		return &sleeper{
			children: children,
			toSleep:  toSleep,
			sleep:    time.Sleep,
			stdout:   os.Stdout,
		}
	}

	aa := newSleeper(250 * time.Millisecond)
	ab := newSleeper(250 * time.Millisecond)

	// A's children are AA and AB
	a := newSleeper(500*time.Millisecond, aa, ab)

	b := newSleeper(750 * time.Millisecond)

	// root's children are A and B
	root := newSleeper(1*time.Second, a, b)

	// All components are now instantiated and runtime begins.
	root.print()

	// ... and just like that, runtime ends.
	fmt.Println("--- Alright, fun is over, time for bed ---")

	// Now to clean up, cleanup methods are called in the reverse order of the
	// component's instantiation. Each cleanup is called exactly once, so no
	// defer statements are used here.
	root.cleanup()
	b.cleanup()
	a.cleanup()
	ab.cleanup()
	aa.cleanup()

	// Expected output is:
	//
	// I will sleep for 1s
	// I will sleep for 500ms
	// I will sleep for 250ms
	// I will sleep for 250ms
	// I will sleep for 750ms
	// --- Alright, fun is over, time for bed ---
	// I slept for 1s
	// I slept for 750ms
	// I slept for 500ms
	// I slept for 250ms
	// I slept for 250ms
}
```
## Criticisms
In lieu of a FAQ I will attempt to preempt criticisms of the component
oriented pattern laid out in this post:
**This seems like a lot of extra work.**
Building reliable programs is a lot of work, just as building a
reliable-anything is a lot of work. Many of us work in an industry which likes
to balance reliability (sometimes referred to by the more specious "quality")
with malleability and deliverability, which naturally leads to skepticism of any
suggestions requiring more time spent on reliability. This is not necessarily a
bad thing, it's just how the industry functions.
All that said, a pattern need not be followed perfectly to be worthwhile, and
the amount of extra work incurred by it can be decided based on practical
considerations. I merely maintain that when it comes time to revisit some
existing code, either to fix or augment it, that the job will be notably easier
if the code _mostly_ follows this pattern.
**My language makes this difficult.**
I don't know of any language which makes this pattern significantly easier than
others, so unfortunately we're all in the same boat to some extent (though I
recognize that some languages, or their ecosystems, make it more difficult than
others). It seems to me that this pattern shouldn't be unbearably difficult for
anyone to implement in any language either, however, as the only language
feature needed is abstract typing.
It would be nice to one day see a language which explicitly supported this
pattern by baking the component properties in as compiler checked rules.
**This will result in over-abstraction.**
Abstraction is a necessary tool in a programmer's toolkit; there is simply no
way around it. The only questions are "how much?" and "where?".
The use of this pattern does not affect how those questions are answered, but
instead aims to more clearly delineate the relationships and interactions
between the different abstracted types once they've been established using other
methods. Over-abstraction is the fault of the programmer, not the language or
pattern or framework.
**The acronym is CoP.**
Why do you think I've just been awkwardly using "this pattern" instead of the
acronym for the whole post? Better names are welcome.
## Conclusion
The component oriented pattern helps make our code more reliable with only a
small amount of extra effort incurred. In fact most of the pattern has to do with
establishing sensible abstractions around global functionality and remembering
certain idioms for how those abstractions should be composed together, something
most of us do to some extent already anyway.
While beneficial in many ways, component oriented programming is merely a tool
which can be applied in many cases. It is certain that there are cases where it
is not the right tool for the job. I've found these cases to be
few-and-far-between, however. It's a solid pattern that I've gotten good use out
of, and hopefully you'll find it, or some parts of it, to be useful as well.