better errors

2021-03-20 09:54:22 -06:00 · 2021-03-20 09:54:22 -06:00 · 7ab3b7ef36
commit 7ab3b7ef36
parent 78a5df1684
1 changed files with 227 additions and 0 deletions
--- a/src/_posts/2021-03-20-a-simple-rule-for-better-errors.md
+++ b/src/_posts/2021-03-20-a-simple-rule-for-better-errors.md
@@ -0,0 +1,227 @@
+---
+title: >-
+    A Simple Rule for Better Errors
+description: >-
+    ...and some examples of the rule in action.
+tags: tech
+---
+
+This post will describe a simple rule for writing error messages that I've
+been using for some time and have found to be worthwhile. Using this rule I can
+be sure that my errors are propagated upwards with everything needed to debug
+problems, while not containing tons of extraneous or duplicate information.
+
+This rule is not specific to any particular language, pattern of error
+propagation (e.g. exceptions, signals, simple strings), or method of embedding
+information in errors (e.g. key/value pairs, formatted strings).
+
+I do not claim to have invented this system; I'm just describing it.
+
+## The Rule
+
+Without further ado, here's the rule:
+
+> A function sending back an error should not include information the caller
+> could already know.
+
+Pretty simple, really, but the best rules are. Keeping to this rule will result
+in error messages which, once propagated up to their final destination (usually
+some kind of logger), will contain only the information relevant to the error
+itself, with minimal duplication.
+
+This rule works in tandem with good encapsulation of function
+behavior. The caller of a function knows only the inputs to the function and, in
+general terms, what the function is going to do with those inputs. If the
+returned error only includes information outside of those two things then the
+caller knows everything it needs to know about the error, and can continue on to
+propagate that error up the stack (with more information tacked on if necessary)
+or handle it in some other way.
+
+## Examples
+
+(For examples I'll use Go, but as previously mentioned this rule will be useful
+in any other language as well.)
+
+Let's go through a few examples, to show the various ways that this rule can
+manifest in actual code.
+
+**Example 1: Nothing to add**
+
+In this example we have a function which merely wraps a call to `io.Copy` for
+two files:
+
+```go
+func copyFile(dst, src *os.File) error {
+	_, err := io.Copy(dst, src)
+	return err
+}
+```
+
+In this example there's no need to modify the error from `io.Copy` before
+returning it to the caller. What would we even add? The caller already knows
+which files were involved in the error, and that the error was encountered
+during some kind of copy operation (since that's what the function says it
+does), so there's nothing more to say about it.
+
+**Example 2: Annotating which step an error occurs at**
+
+In this example we will open a file, read its contents, and return them as a
+string:
+
+```go
+func readFile(path string) (string, error) {
+	f, err := os.Open(path)
+	if err != nil {
+		return "", fmt.Errorf("opening file: %w", err)
+	}
+	defer f.Close()
+
+	contents, err := io.ReadAll(f)
+	if err != nil {
+		return "", fmt.Errorf("reading contents: %w", err)
+	}
+
+	return string(contents), nil
+}
+```
+
+In this example there are two different steps which could result in an error:
+opening the file and reading its contents. If an error is returned then our
+imaginary caller doesn't know which step the error occurred at. Using our rule
+we can infer that it would be good to annotate _which_ step the error came
+from, so the caller has a fuller picture of what went wrong.
+
+Note that each annotation does _not_ include the file path which was passed into
+the function. The caller already knows this path, so there is no need for the
+returned error to repeat it.
+
+**Example 3: Annotating which argument was involved**
+
+In this example we will read two files using our function from example 2, and
+return the concatenation of their contents as a string.
+
+```go
+func concatFiles(pathA, pathB string) (string, error) {
+	contentsA, err := readFile(pathA)
+	if err != nil {
+		return "", fmt.Errorf("reading contents of %q: %w", pathA, err)
+	}
+
+	contentsB, err := readFile(pathB)
+	if err != nil {
+		return "", fmt.Errorf("reading contents of %q: %w", pathB, err)
+	}
+
+	return contentsA + contentsB, nil
+}
+```
+
+As in example 2 we annotate each error, but in addition to the action we
+annotate which file path was involved in each error. This is because if we
+simply annotated with the string `reading contents`, as before, it wouldn't be
+clear to the caller _which_ file's contents couldn't be read. Therefore we
+include which path the error is relevant to.
+
+**Example 4: Layering**
+
+In this example we will show how using this rule habitually results in
+easy-to-read errors which contain all relevant information about the error. Our
+example reads one file, the "full" file, using our `readFile` function from
+example 2. It then reads the concatenation of two files, the "split" files,
+using our `concatFiles` function from example 3. It finally determines if the
+two strings are equal:
+
+```go
+func verifySplits(fullFilePath, splitFilePathA, splitFilePathB string) error {
+	fullContents, err := readFile(fullFilePath)
+	if err != nil {
+		return fmt.Errorf("reading contents of full file: %w", err)
+	}
+
+	splitContents, err := concatFiles(splitFilePathA, splitFilePathB)
+	if err != nil {
+		return fmt.Errorf("reading concatenation of split files: %w", err)
+	}
+
+	if fullContents != splitContents {
+		return errors.New("full file's contents do not match the split files' contents")
+	}
+
+	return nil
+}
+```
+
+As before, we don't annotate the file paths for the different possible
+errors, but instead say _which_ files were involved. The caller already knows
+the paths, so there's no need to reiterate them if there's another way of
+referring to them.
+
+Let's see what our errors actually look like! We run our new function using the
+following:
+
+```go
+	err := verifySplits("full.txt", "splitA.txt", "splitB.txt")
+	fmt.Println(err)
+```
+
+If `full.txt` doesn't exist, we'll get the following error:
+
+```
+reading contents of full file: opening file: open full.txt: no such file or directory
+```
+
+The error is simple, and gives you everything you need to understand what went
+wrong: while attempting to read the full file, during the opening of that file,
+our code found that there was no such file. In fact, the error returned by
+`os.Open` contains the name of the file, which goes against our rule, but it's
+the standard library so what can ya do?
+
+Now, if `splitA.txt` doesn't exist instead, we'll get this error:
+
+```
+reading concatenation of split files: reading contents of "splitA.txt": opening file: open splitA.txt: no such file or directory
+```
+
+Now we did include the file path here, and so the standard library's failure to
+follow our rule is causing us some repetition. But overall, within the parts of
+the error we have control over, the error is concise and gives you everything
+you need to know what happened.
+
+## Exceptions
+
+As with all rules, there are certainly exceptions. The primary one I've found is
+that certain helper functions can benefit from bending this rule a bit. For
+example, if there is a helper function which is called to verify some kind of
+user input in many places, it can be helpful to include that input value within
+the error returned from the helper function:
+
+```go
+func verifyInput(str string) error {
+	if err := check(str); err != nil {
+		return fmt.Errorf("input %q was bad: %w", str, err)
+	}
+	return nil
+}
+```
+
+`str` is known to the caller so, according to our rule, we don't need to include
+it in the error. But if you're going to end up wrapping the error returned from
+`verifyInput` with `str` at every call site anyway it can be convenient to save
+some energy and break the rule. It's a trade-off: convenience in exchange for
+consistency.
+
+Another exception might be made with regard to stack traces.
+
+In the set of examples given above I tended to annotate each error being
+returned with a description of where in the function the error was being
+returned from. If your language automatically includes some kind of stack trace
+with every error, and if you find that you are generally able to reconcile that
+stack trace with actual code, then it may be that annotating each error site is
+unnecessary, except when annotating actual runtime values (e.g. an input
+string).
+
+As in all things with programming, there are no hard rules; everything is up to
+interpretation and the specific use-case being worked on. That said, I hope what
+I've laid out here will prove generally useful to you, in whatever way you might
+try to use it.
+