copy over blog posts from cryptic.io blog
_posts/2014-10-29-erlang-pitfalls.md
@@ -0,0 +1,190 @@
---
layout: post
title: Erlang Pitfalls
---

I've been involved with a large-ish scale erlang project at Grooveshark since
sometime around 2011. I started this project knowing absolutely nothing about
erlang, but now I feel I have accumulated enough knowledge over time that I
could conceivably give some back. Specifically, common pitfalls that people may
run into when designing and writing a large-scale erlang application. Some of
these may show up when searching for them, but some of them you may not even
know you need to search for.

## now() vs timestamp()

The canonical way of getting the current timestamp in erlang is to use
`erlang:now()`. This works great at small loads, but if you find your
application slowing down greatly at highly parallel loads and you're calling
`erlang:now()` a lot, it may be the culprit.

A property of this method you may not realize is that it is monotonically
increasing, meaning even if two processes call it at the *exact* same time they
will both receive different output. This is done through some locking at the
low level, as well as a bit of math to balance out the time getting out of sync
in that scenario.

There are situations where fetching always-unique timestamps is useful, such as
seeding RNGs and generating unique identifiers for things, but usually when
people fetch a timestamp they just want a timestamp. For these cases,
`os:timestamp()` can be used. It is not blocked by any locks, it simply returns
the time.

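Both functions return the same `{MegaSecs, Secs, MicroSecs}` tuple, so swapping
one in for the other is mechanical. A quick sketch:

```erlang
%% erlang:now() is unique and monotonic, but can become a bottleneck
%% under heavy parallel load.
Unique = erlang:now(),

%% os:timestamp() returns the same {MegaSecs, Secs, MicroSecs} shape
%% with no locking, so it drops straight into existing code:
Micros = timer:now_diff(os:timestamp(), Unique).
```
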
## The rpc module is slow

The built-in `rpc` module is slower than you'd think. This mostly stems from it
doing a lot of extra work for every `call` and `cast` that you do, ensuring that
certain conditions are accounted for. If, however, it's sufficient for the
calling side to know that a call timed out on them and not worry about it any
further, you may benefit from simply writing your own rpc module.
Alternatively, use [one which already exists](https://github.com/cloudant/rexi).

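For instance, if a fire-and-forget cast is all you need, the core of it can be
as small as this (a bare-bones sketch of the idea, nowhere near the full `rexi`
feature set):

```erlang
-module(simple_rpc).
-export([cast/4]).

%% Fire-and-forget: start {Mod, Fun, Args} on Node and return
%% immediately, without any of the monitoring and bookkeeping work the
%% stock rpc module does for every call.
cast(Node, Mod, Fun, Args) ->
    spawn(Node, Mod, Fun, Args),
    ok.
```
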
## Don't send anonymous functions between nodes

One of erlang's niceties is transparent message sending between two physical
erlang nodes. Once nodes are connected, a process on one can send any message to
a process on the other exactly as if they existed on the same node. This is fine
for many data-types, but for anonymous functions it should be avoided.

For example:

```erlang
RemotePid ! {fn, fun(I) -> I + 1 end}.
```

Would be better written as

```erlang
incr(I) ->
    I + 1.

RemotePid ! {fn, ?MODULE, incr}.
```

and then using an `apply` in the receiving process to actually execute the
function.

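On the receiving side that looks something like the following (a sketch; the
`{fn, Mod, Fun}` shape and the argument `1` are just carried over from the
example above):

```erlang
receive
    {fn, Mod, Fun} ->
        %% The function is looked up by name at call time, so whatever
        %% version of the code is currently loaded on this node is what
        %% runs, regardless of what the sender has loaded.
        Result = apply(Mod, Fun, [1]),
        io:format("incremented to ~p~n", [Result])
end.
```
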
This is because hot-swapping code messes with anonymous functions quite a bit.
Erlang isn't actually sending a function definition across the wire; it's simply
sending a reference to a function. If you've changed the code within the
anonymous function on a node, that reference changes. The sending node is
sending a reference to a function which may not exist anymore on the receiving
node, and you'll get a weird error which Google doesn't return many results for.

Alternatively, if you simply send atoms across the wire and use `apply` on the
other side, only atoms are sent and the two nodes involved can have totally
different ideas of what the function itself does without any problems.

## Hot-swapping code is a convenience, not a crutch

Hot swapping code is the bees-knees. It lets you not have to worry about
rolling restarts for trivial code changes, and so adds stability to your
cluster. My warning is that you should not rely on it. If your cluster can't
survive a node being restarted for a code change, then it can't survive if that
node fails completely, or fails and comes back up. Design your system pretending
that hot-swapping does not exist, and only once you've done that allow yourself
to use it.

## GC sometimes needs a boost

Erlang garbage collection (GC) acts on a per-erlang-process basis, meaning that
each process decides on its own to garbage collect itself. This is nice because
it means stop-the-world isn't a problem, but it does have some interesting
effects.

We had a problem with our node memory graphs looking like an upwards-facing
line, instead of a nice sinusoid relative to the number of connections during
the day. We couldn't find a memory leak *anywhere*, and so started profiling. We
found that the memory seemed to be comprised of mostly binary data in process
heaps. On a hunch my coworker Mike Cugini (who gets all the credit for this) ran
the following on a node:

```erlang
lists:foreach(fun erlang:garbage_collect/1, erlang:processes()).
```

and saw memory drop in a huge way. We made that code run every 10 minutes or so
and suddenly our memory problem went away.

The problem is that we had a lot of processes which individually didn't have
much heap data, but all together were crushing the box. Each didn't think it had
enough to garbage collect very often, so memory just kept going up. Calling the
above forces all processes to garbage collect, and thus throw away all those
little binary bits they were hoarding.

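A periodic sweep like the one we ran can be as simple as a looping process (a
minimal sketch, not our exact code):

```erlang
%% Force a collection in every process on the node, then sleep for ten
%% minutes and do it all again.
gc_loop() ->
    lists:foreach(fun erlang:garbage_collect/1, erlang:processes()),
    timer:sleep(timer:minutes(10)),
    gc_loop().
```
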
## These aren't the solutions you are looking for

The `erl` process has tons of command-line options which allow you to tweak all
kinds of knobs. We've had tons of performance problems with our application; as
of yet not a single one has been solved by turning one of these knobs. They've
all been design issues or just run-of-the-mill bugs. I'm not saying the knobs
are *never* useful, but I haven't seen it yet.

## Erlang processes are great, except when they're not

The erlang model of allowing processes to manage global state works really well
in many cases. Possibly even most cases. There are, however, times when it
becomes a performance problem. This became apparent in the project I was working
on for Grooveshark, which was, at its heart, a pubsub server.

The architecture was very simple: each channel was managed by a process, and
client connection processes subscribed to that channel and received publishes
from it. Easy, right? The problem was that extremely high volume channels were
simply not able to keep up with the load. The channel process could do certain
things very fast, but there were some operations which simply took time and
slowed everything down. For example, channels could have arbitrary properties
set on them by their owners. Retrieving an arbitrary property from a channel was
a fairly fast operation: the client `call`s the channel process, and the channel
process immediately responds with the property value. No blocking involved.

But as soon as there was any kind of call which required the channel process to
talk to yet *another* process (unfortunately necessary), things got hairy. On
high volume channels publish/get/set operations would get massively backed up
in the message queue while the process was blocked on another process. We tried
many things, but ultimately gave up on the process-per-channel approach.

We instead decided on keeping *all* channel state in a transactional database.
When client processes "call" operations on a channel, they really are just
acting on the database data inline, no message passing involved. This means that
read-only operations are super fast because there is minimal blocking, and if
some random other process is being slow it only affects the one client making
the slow call, rather than holding up a whole host of other clients.

## Mnesia might not be what you want

This one is probably a bit controversial, and definitely subject to use-cases.
Do your own testing and profiling, find out what's right for you.

Mnesia is erlang's solution for global state. It's an in-memory transactional
database which can scale to N nodes and persist to disk. It is hosted directly
in the erlang processes' memory so you interact with it in erlang directly in
your code; no calling out to database drivers and such. Sounds great, right?

Unfortunately mnesia is not a very full-featured database. It is essentially a
key-value store which can hold arbitrary erlang data-types, albeit in a set
schema which you lay out for it during startup. This means that more complex
types like sorted sets and hash maps (although the latter was addressed with the
introduction of the map data-type in R17) are difficult to work with within
mnesia. Additionally, erlang's data model of immutability, while usually
awesome, can bite you here because it's difficult (impossible?) to pull out
chunks of data within a record without accessing the whole record.

For example, when retrieving the list of processes subscribed to a channel our
application doesn't simply pull the full list and iterate over it. This is too
slow, and in some cases the subscriber list was so large it wasn't actually
feasible. The channel process wasn't cleaning up its heap fast enough, so
multiple publishes would end up with multiple copies of the giant list in
memory. This became a problem. Instead we chain spawned processes, each of which
pulls a set chunk of the subscriber list and iterates over that. This is very
difficult to implement in mnesia without pulling the full subscriber list into
the process' memory at some point.

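The chunked iteration looks roughly like this, with each worker handing off to
the next (a sketch; `fetch_chunk/2` stands in for whatever actually pulls a
slice of the subscriber list out of storage):

```erlang
%% Publish Msg to one chunk of a channel's subscribers, then spawn the
%% next worker, so no single process ever holds the whole list.
publish_chunk(Channel, Offset, Msg) ->
    case fetch_chunk(Channel, Offset) of
        [] ->
            done;
        Subscribers ->
            [Pid ! Msg || Pid <- Subscribers],
            NextOffset = Offset + length(Subscribers),
            spawn(fun() -> publish_chunk(Channel, NextOffset, Msg) end)
    end.
```
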
It is, however, fairly trivial to implement in redis using sorted sets. For this
case, and many other cases after, the motto for performance improvements became
"stick it in redis". The application is at the point where *all* state which
isn't directly tied to a specific connection is kept in redis, encoded using
`term_to_binary`. The performance hit of going to an outside process for data
was actually much less than we'd originally thought, and it ended up being a
plus since we had much more freedom to do interesting hacks to speed up our
accesses.

_posts/2015-07-15-go-http.md
@@ -0,0 +1,545 @@
---
layout: post
title: Go's http package by example
---

Go's [http](http://golang.org/pkg/net/http/) package has turned into one of my
favorite things about the Go programming language. Initially it appears to be
somewhat complex, but in reality it can be broken down into a couple of simple
components that are extremely flexible in how they can be used. This guide will
cover the basic ideas behind the http package, as well as examples of using,
testing, and composing apps built with it.

This guide assumes you have some basic knowledge of what an interface in Go is,
and some idea of how HTTP works and what it can do.

## Handler

The building block of the entire http package is the `http.Handler` interface,
which is defined as follows:

```go
type Handler interface {
    ServeHTTP(ResponseWriter, *Request)
}
```

Once implemented, the `http.Handler` can be passed to `http.ListenAndServe`,
which will call the `ServeHTTP` method on every incoming request.

`http.Request` contains all relevant information about an incoming http request
which is being served by your `http.Handler`.

The `http.ResponseWriter` is the interface through which you can respond to the
request. It implements the `io.Writer` interface, so you can use methods like
`fmt.Fprintf` to write a formatted string as the response body, or ones like
`io.Copy` to write out the contents of a file (or any other `io.Reader`). The
response code can be set before you begin writing data using the `WriteHeader`
method.

Here's an example of an extremely simple http server:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

type helloHandler struct{}

func (h helloHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path)
}

func main() {
    err := http.ListenAndServe(":9999", helloHandler{})
    log.Fatal(err)
}
```

`http.ListenAndServe` serves requests using the handler, listening on the given
address:port. It will block until it encounters an error, at which point we
`log.Fatal`.

Here's an example of using this handler with curl:

```
~ $ curl localhost:9999/foo/bar
hello, you've hit /foo/bar
```

## HandlerFunc

Often defining a full type to implement the `http.Handler` interface is a bit of
overkill, especially for extremely simple `ServeHTTP` functions like the one
above. The `http` package provides a helper type, `http.HandlerFunc`, which
wraps a function with the signature
`func(w http.ResponseWriter, r *http.Request)`, turning it into an
`http.Handler` which calls the wrapped function on every request.

The following behaves exactly like the previous example, but uses
`http.HandlerFunc` instead of defining a new type.

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path)
    })

    err := http.ListenAndServe(":9999", h)
    log.Fatal(err)
}
```

## ServeMux

On their own, the previous examples don't seem all that useful. If we wanted to
have different behavior for different endpoints we would end up having to parse
path strings and write numerous `if` or `switch` statements. Luckily we're
provided with `http.ServeMux`, which does all of that for us. Here's an example
of it being used:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    h := http.NewServeMux()

    h.HandleFunc("/foo", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "Hello, you hit foo!")
    })

    h.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "Hello, you hit bar!")
    })

    h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(404)
        fmt.Fprintln(w, "You're lost, go home")
    })

    err := http.ListenAndServe(":9999", h)
    log.Fatal(err)
}
```

The `http.ServeMux` is itself an `http.Handler`, so it can be passed into
`http.ListenAndServe`. When it receives a request it checks whether the
request's path is prefixed by any of its known paths, choosing the longest
prefix match it can find. We use the `/` endpoint as a catch-all for requests to
unknown endpoints. Here are some examples of it being used:

```
~ $ curl localhost:9999/foo
Hello, you hit foo!

~ $ curl localhost:9999/bar
Hello, you hit bar!

~ $ curl localhost:9999/baz
You're lost, go home
```

`http.ServeMux` has both `Handle` and `HandleFunc` methods. These do the same
thing, except that `Handle` takes in an `http.Handler` while `HandleFunc` merely
takes in a function, implicitly wrapping it just as `http.HandlerFunc` does.

### Other muxes

There are numerous replacements for `http.ServeMux`, like
[gorilla/mux](http://www.gorillatoolkit.org/pkg/mux), which give you things like
automatically pulling variables out of paths, easily asserting what http methods
are allowed on an endpoint, and more. Most of these replacements implement
`http.Handler` like `http.ServeMux` does, and accept `http.Handler`s as
arguments, so they are easy to use in conjunction with the rest of the things
I'm going to talk about in this post.

## Composability

When I say that the `http` package is composable I mean that it is very easy to
create re-usable pieces of code and glue them together into a new working
application. The `http.Handler` interface is the way all pieces communicate with
each other. Here's an example where we use the same `http.Handler` to handle
multiple endpoints, each slightly differently:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

type numberDumper int

func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Here's your number: %d\n", n)
}

func main() {
    h := http.NewServeMux()

    h.Handle("/one", numberDumper(1))
    h.Handle("/two", numberDumper(2))
    h.Handle("/three", numberDumper(3))
    h.Handle("/four", numberDumper(4))
    h.Handle("/five", numberDumper(5))

    h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(404)
        fmt.Fprintln(w, "That's not a supported number!")
    })

    err := http.ListenAndServe(":9999", h)
    log.Fatal(err)
}
```

`numberDumper` implements `http.Handler`, and can be passed into the
`http.ServeMux` multiple times to serve multiple endpoints. Here it is in
action:

```
~ $ curl localhost:9999/one
Here's your number: 1
~ $ curl localhost:9999/five
Here's your number: 5
~ $ curl localhost:9999/bazillion
That's not a supported number!
```

## Testing

Testing http endpoints is extremely easy in Go, and doesn't even require you to
actually listen on any ports! The `httptest` package provides a few handy
utilities, including `NewRecorder`, which returns an implementation of
`http.ResponseWriter` and allows you to effectively make an http request by
calling `ServeHTTP` directly. Here's an example of a test for our previously
implemented `numberDumper`, commented with what exactly is happening:

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
    . "testing"
)

func TestNumberDumper(t *T) {
    // We first create the http.Handler we wish to test
    n := numberDumper(1)

    // We create an http.Request object to test with. The http.Request is
    // totally customizable in every way that a real-life http request is, so
    // even the most intricate behavior can be tested
    r, _ := http.NewRequest("GET", "/one", nil)

    // httptest.ResponseRecorder implements the http.ResponseWriter interface,
    // and as such can be passed into ServeHTTP to receive the response. It
    // will act as if all data being given to it is being sent to a real
    // client, when in reality it's being buffered for later observation
    w := httptest.NewRecorder()

    // Pass in our httptest.ResponseRecorder and http.Request to our
    // numberDumper. At this point the numberDumper will act just as if it was
    // responding to a real request
    n.ServeHTTP(w, r)

    // httptest.ResponseRecorder gives a number of fields and methods which can
    // be used to observe the response made to our request. Here we check the
    // response code
    if w.Code != 200 {
        t.Fatalf("wrong code returned: %d", w.Code)
    }

    // We can also get the full body out of the httptest.ResponseRecorder, and
    // check that its contents are what we expect
    body := w.Body.String()
    if body != fmt.Sprintf("Here's your number: 1\n") {
        t.Fatalf("wrong body returned: %s", body)
    }
}
```

In this way it's easy to create tests for the individual components you are
using to build your application, keeping the tests near the functionality
they're testing.

Note: if you ever do need to spin up a test server in your tests, `httptest`
also provides a way to create a server listening on a random open port for use
in tests as well.

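For example, here's a sketch of the same `numberDumper` test done against a real
listening server (this one also assumes `io/ioutil` is imported):

```go
func TestNumberDumperServer(t *T) {
    // httptest.NewServer starts a real server on a random local port,
    // backed by whatever http.Handler we give it
    srv := httptest.NewServer(numberDumper(1))
    defer srv.Close()

    // srv.URL looks like "http://127.0.0.1:PORT"
    resp, err := http.Get(srv.URL + "/one")
    if err != nil {
        t.Fatal(err)
    }
    defer resp.Body.Close()

    body, _ := ioutil.ReadAll(resp.Body)
    if string(body) != "Here's your number: 1\n" {
        t.Fatalf("wrong body returned: %s", body)
    }
}
```
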
## Middleware

Serving endpoints is nice, but often there's functionality you need to run for
*every* request before the actual endpoint's handler is run. For example, access
logging. A middleware component is one which implements `http.Handler`, but will
actually pass the request off to another `http.Handler` after doing some set of
actions. The `http.ServeMux` we looked at earlier is actually an example of
middleware, since it passes the request off to another `http.Handler` for actual
processing. Here's the previous example again, this time with some logging
middleware:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

type numberDumper int

func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Here's your number: %d\n", n)
}

func logger(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        log.Printf("%s requested %s", r.RemoteAddr, r.URL)
        h.ServeHTTP(w, r)
    })
}

func main() {
    h := http.NewServeMux()

    h.Handle("/one", numberDumper(1))
    h.Handle("/two", numberDumper(2))
    h.Handle("/three", numberDumper(3))
    h.Handle("/four", numberDumper(4))
    h.Handle("/five", numberDumper(5))

    h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(404)
        fmt.Fprintln(w, "That's not a supported number!")
    })

    hl := logger(h)

    err := http.ListenAndServe(":9999", hl)
    log.Fatal(err)
}
```

`logger` is a function which takes in an `http.Handler` called `h`, and returns
a new `http.Handler` which, when called, will log the request it was called with
and then pass off its arguments to `h`. To use it we pass in our
`http.ServeMux`, so all incoming requests will first be handled by the logging
middleware before being passed to the `http.ServeMux`.

Here's an example log entry which is output when the `/five` endpoint is hit:

```
2015/06/30 20:15:41 [::1]:34688 requested /five
```

## Middleware chaining

Being able to chain middleware together is an incredibly useful ability which we
get almost for free, as long as we use the signature
`func(http.Handler) http.Handler`. A middleware component returns the same type
which is passed into it, so simply passing the output of one middleware
component into the other is sufficient.

However, more complex behavior with middleware can be tricky. For instance, what
if you want a piece of middleware which takes in a parameter upon creation?
Here's an example of just that, with a piece of middleware which will set a
header and its value for all requests:

```go
package main

import (
    "fmt"
    "log"
    "net/http"
)

type numberDumper int

func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Here's your number: %d\n", n)
}

func logger(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        log.Printf("%s requested %s", r.RemoteAddr, r.URL)
        h.ServeHTTP(w, r)
    })
}

type headerSetter struct {
    key, val string
    handler  http.Handler
}

func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    w.Header().Set(hs.key, hs.val)
    hs.handler.ServeHTTP(w, r)
}

func newHeaderSetter(key, val string) func(http.Handler) http.Handler {
    return func(h http.Handler) http.Handler {
        return headerSetter{key, val, h}
    }
}

func main() {
    h := http.NewServeMux()

    h.Handle("/one", numberDumper(1))
    h.Handle("/two", numberDumper(2))
    h.Handle("/three", numberDumper(3))
    h.Handle("/four", numberDumper(4))
    h.Handle("/five", numberDumper(5))

    h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(404)
        fmt.Fprintln(w, "That's not a supported number!")
    })

    hl := logger(h)
    hhs := newHeaderSetter("X-FOO", "BAR")(hl)

    err := http.ListenAndServe(":9999", hhs)
    log.Fatal(err)
}
```

And here's the curl output:

```
~ $ curl -i localhost:9999/three
HTTP/1.1 200 OK
X-Foo: BAR
Date: Wed, 01 Jul 2015 00:39:48 GMT
Content-Length: 22
Content-Type: text/plain; charset=utf-8

Here's your number: 3
```

`newHeaderSetter` returns a function which accepts and returns an
`http.Handler`. Calling that returned function with an `http.Handler` then gets
you an `http.Handler` which will set the header given to `newHeaderSetter`
before continuing on to the given `http.Handler`.

This may seem like a strange way of organizing things; for this example the
signature for `newHeaderSetter` could very well have looked like this:

```
func newHeaderSetter(key, val string, h http.Handler) http.Handler
```

And that implementation would have worked fine. But it would have been more
difficult to compose going forward. In the next section I'll show what I mean.

## Composing middleware with alice

[Alice](https://github.com/justinas/alice) is a very simple and convenient
helper for working with middleware using the function signature we've been using
thus far. Alice is used to create and use chains of middleware. Chains can even
be appended to each other, giving even further flexibility. Here's our previous
example with a couple more headers being set, but also using alice to manage the
added complexity.

```go
package main

import (
    "fmt"
    "log"
    "net/http"

    "github.com/justinas/alice"
)

type numberDumper int

func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Here's your number: %d\n", n)
}

func logger(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        log.Printf("%s requested %s", r.RemoteAddr, r.URL)
        h.ServeHTTP(w, r)
    })
}

type headerSetter struct {
    key, val string
    handler  http.Handler
}

func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    w.Header().Set(hs.key, hs.val)
    hs.handler.ServeHTTP(w, r)
}

func newHeaderSetter(key, val string) func(http.Handler) http.Handler {
    return func(h http.Handler) http.Handler {
        return headerSetter{key, val, h}
    }
}

func main() {
    h := http.NewServeMux()

    h.Handle("/one", numberDumper(1))
    h.Handle("/two", numberDumper(2))
    h.Handle("/three", numberDumper(3))
    h.Handle("/four", numberDumper(4))

    fiveHS := newHeaderSetter("X-FIVE", "the best number")
    h.Handle("/five", fiveHS(numberDumper(5)))

    h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(404)
        fmt.Fprintln(w, "That's not a supported number!")
    })

    chain := alice.New(
        newHeaderSetter("X-FOO", "BAR"),
        newHeaderSetter("X-BAZ", "BUZ"),
        logger,
    ).Then(h)

    err := http.ListenAndServe(":9999", chain)
    log.Fatal(err)
}
```

In this example all requests will have the headers `X-FOO` and `X-BAZ` set, but
the `/five` endpoint will *also* have the `X-FIVE` header set.

## Fin

Starting with a simple idea of an interface, the `http` package allows us to
create for ourselves an incredibly useful and flexible (yet still rather simple)
ecosystem for building web apps with re-usable components, all without breaking
our static checks.

_posts/2015-11-21-happy-trees.md
@@ -0,0 +1,237 @@
---
layout: post
title: Happy Trees
---

Source code related to this post is available [here](https://github.com/mediocregopher/happy-tree).

This project was inspired by [this video](https://www.youtube.com/watch?v=_DpzAvb3Vk4),
which you should watch first in order to really understand what's going on.

My inspiration came from his noting that happification could be done on numbers
in bases other than 10. I immediately thought of hexadecimal, base-16, since I'm
a programmer and that's what I think of. I also was trying to think of how one
would graphically represent a large happification tree, when I realized that
hexadecimal numbers are colors, and colors represent things graphically quite
nicely!

## Colors

Colors are represented to computers using 3 bytes, encompassing red, green, and
blue. Each byte is represented by two hexadecimal digits, and they are appended
together. For example, `FF0000` represents maximum red (`FF`) added to no green
and no blue. `FF5500` represents maximum red (`FF`), some green (`55`), and no
blue (`00`), which when added together results in kind of an orange color.

## Happifying colors

In base 10, happifying a number is done by splitting its digits, squaring each
one individually, and adding the resulting numbers. The principle works the same
for hexadecimal numbers:

```
A4F
A*A + 4*4 + F*F
64 + 10 + E1
155 // 341 in decimal
```

So if all colors are 6-digit hexadecimal numbers, they can be happified easily!

```
FF5500
F*F + F*F + 5*5 + 5*5 + 0*0 + 0*0
E1 + E1 + 19 + 19 + 0 + 0
0001F4
```

So `FF5500` (an orangish color) happifies to `0001F4` (a darker blue). Since the
order of digits doesn't matter, `5F50F0` also happifies to `0001F4`. From this
fact, we can make a tree (hence the happification tree). I can do this process
on every color from `000000` (black) to `FFFFFF` (white), so I will!

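In code, one happification step is tiny. Here's a rough sketch in Go (not
necessarily how the linked repo structures it):

```go
package main

import "fmt"

// happify performs one happification step on a color: split it into hex
// digits, square each digit, and sum the squares.
func happify(color uint32) uint32 {
    var sum uint32
    for ; color > 0; color >>= 4 {
        d := color & 0xF // lowest hex digit
        sum += d * d
    }
    return sum
}

func main() {
    fmt.Printf("%06X\n", happify(0xFF5500)) // prints 0001F4
}
```
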
## Representing the tree

So I know I can represent the tree using color, but there's more to decide on
than that. The easy way to represent a tree would be to simply draw a literal
tree graph, with a circle for each color and lines pointing to its parent and
children. But this is boring, and also if I wanted to represent *all* colors the
resulting image would be enormous and/or unreadable.

I decided on using a hollow, multi-level pie chart. Using the example
of `000002`, it would look something like this:

![An example of a partial multi-level pie chart](/img/happy-tree/partial.png)

The inner arc represents the color `000002`. The second arc represents the 15
different colors which happify into `000002`; each of them may also have their
own outer arc of numbers which happify to them, and so on.

This representation is nice because a) it looks cool and b) it allows the
melancoils of the hexadecimals to be placed around the happification tree (the
numbers which do happify into `000001`), which is convenient. It's also somewhat
easier to code than a circle/branch based tree diagram.

An important feature I had to implement was proportional slice sizes. If I were
to give each child of a color an equal size on that arc's edge the image would
simply not work. Some branches of the tree are extremely deep, while others are
very shallow. If all were given the same space, those deep branches wouldn't
even be representable by a single pixel's width, and would simply fail to show
up. So I implemented proportional slice sizes, where the size of every slice is
proportional to how many total children (counted recursively) it has. You can
see this in the above example, where the second level arc is largely comprised
of one giant slice, with many smaller slices taking up the end.

## First attempt

My first attempt resulted in this image (click for 5000x5000 version):

[![Result of first attempt](/img/happy-tree/happy-tree-atmp1-small.png)](/img/happy-tree/happy-tree-atmp1.png)

The first thing you'll notice is that it looks pretty neat.

The second thing you'll notice is that there's actually only one melancoil in
the 6-digit hexadecimal number set. The innermost black circle is `000000` which
only happifies to itself, and nothing else will happify to it (sad `000000`).
The second circle represents `000001`, and all of its runty children. And
finally the melancoil, comprised of:

```
00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ...
```

The final thing you'll notice (or maybe it was the first, since it's really
obvious) is that it's very blue. Non-blue colors are really only represented as
leaves on their trees and don't ever really have any children of their own, so
the blue and black sections take up vastly more space.

This makes sense. The number which should generate the largest happification
result, `FFFFFF`, only results in `000546`, which is primarily blue. So in
effect all colors happify to some shade of blue.

This might have been it; technically this is the happification tree and the
melancoil of 6-digit hexadecimal numbers represented as colors. But it's also
boring, and I wanted to do better.

## Second attempt

The root of the problem is that the definition of "happification" I used gave
results which weren't diverse enough. I wanted something which would give me
numbers where any of the digits could be anything. Something more random.

I considered using a hash instead, like md5, but that has its own problems.
There's no guarantee that any number would actually reach `000001`, which isn't
required but is a nice feature that I wanted. It also would be unlikely that
there would be any melancoils that weren't absolutely gigantic.

I ended up redefining what it meant to happify a hexadecimal number. Instead of
adding up all the digits, I first split the red, green, and blue digits into
their own numbers, happified those numbers, and finally reassembled the results
back into a single number. For example:

```
FF5500
FF, 55, 00
F*F + F*F, 5*5 + 5*5, 0*0 + 0*0
1C2, 32, 00
C23200
```

I drop that 1 on the `1C2`, because it has no place in this system. Sorry 1.

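A sketch of the new rule, following the same shape as the earlier `happify`
sketch (again, not necessarily the repo's exact code):

```go
package main

import "fmt"

// happifyRGB happifies the red, green, and blue bytes separately, keeps
// only the low byte of each result (dropping any overflow digit), and
// reassembles them into one color.
func happifyRGB(color uint32) uint32 {
    var out uint32
    for shift := uint(0); shift < 24; shift += 8 {
        b := (color >> shift) & 0xFF
        lo, hi := b&0xF, b>>4
        sq := lo*lo + hi*hi
        out |= (sq & 0xFF) << shift
    }
    return out
}

func main() {
    fmt.Printf("%06X\n", happifyRGB(0xFF5500)) // prints C23200
}
```
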
Simply replacing that function resulted in this image (click for 5000x5000
version):

[![Result of second attempt](/img/happy-tree/happy-tree-atmp2-small.png)](/img/happy-tree/happy-tree-atmp2.png)

The first thing you notice is that it's so colorful! So that goal was achieved.

The second thing you notice is that there are *significantly* more melancoils.
Hundreds, even. Here are a couple of the melancoils (each on its own line):

```
00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ...
000D0D -> 00A9A9 -> 00B5B5 -> 009292 -> 005555 -> 003232 -> ...
0D0D0D -> A9A9A9 -> B5B5B5 -> 929292 -> 555555 -> 323232 -> ...
0D0D32 -> A9A90D -> B5B5A9 -> 9292B5 -> 555592 -> 323255 -> ...
...
```

And so on. You'll notice the first melancoil listed is the same as the one from
the first attempt. You'll also notice that the same numbers from that melancoil
are "re-used" in the rest of them as well. The second coil listed is the same as
the first, just with the numbers repeated in the 3rd and 4th digits. The third
coil has those numbers repeated once more in the 1st and 2nd digits. The final
coil is the same numbers, but with the 5th and 6th digits offset one place in
the rotation.

The rest of the melancoils in this attempt work out to just be every conceivable
iteration of the above. This is simply a property of the algorithm chosen, and
there's not a whole lot we can do about it.

## Third attempt

After talking with [Mr. Marco](/members/#marcopolo) about the previous attempts
I got an idea that would lead me towards more attempts. The main issue I was
having in coming up with new happification algorithms was figuring out what to
do about getting a number greater than `FFFFFF`. Dropping the leading digits
just seemed... lame.

One solution I came up with was to simply happify again. And again, and again.
Until I got a number less than or equal to `FFFFFF`.

With this new plan, I could increase the power to which I'm raising each
individual digit, and drop the strategy from the second attempt of splitting the
number into three parts. In the first attempt I was doing happification to the
power of 2, but what if I wanted to happify to the power of 6? It would look
something like this (starting with the number `34BEEF`):

```
34BEEF
3^6 + 4^6 + B^6 + E^6 + E^6 + F^6
2D9 + 1000 + 1B0829 + 72E440 + 72E440 + ADCEA1
1AEB223

1AEB223 is greater than FFFFFF, so we happify again

1^6 + A^6 + E^6 + B^6 + 2^6 + 2^6 + 3^6
1 + F4240 + 72E440 + 1B0829 + 40 + 40 + 2D9
9D3203
```

So `34BEEF` happifies to `9D3203` when happifying to the power of 6.

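Here's a sketch of this power-N happification, including the happify-again loop
(again in Go, and again not necessarily the repo's exact code):

```go
package main

import "fmt"

// happifyPow raises each hex digit of color to the given power and sums
// the results, re-happifying until the value fits back under 0xFFFFFF.
// As noted further down, at powers of 7 and up some inputs feed into a
// melancoil that never drops back under 0xFFFFFF, so for those inputs
// this loop would spin forever.
func happifyPow(color uint64, pow uint) uint64 {
    for {
        var sum uint64
        for c := color; c > 0; c >>= 4 {
            d := c & 0xF
            term := uint64(1)
            for i := uint(0); i < pow; i++ {
                term *= d
            }
            sum += term
        }
        if sum <= 0xFFFFFF {
            return sum
        }
        color = sum
    }
}

func main() {
    fmt.Printf("%06X\n", happifyPow(0x34BEEF, 6)) // prints 9D3203
}
```
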
As mentioned before, the first attempt above was the 2nd-power tree; here are
the trees for the 3rd, 4th, 5th, and 6th powers (each image is a link to a
larger version):

3rd power:
[![Third attempt, 3rd power](/img/happy-tree/happy-tree-atmp3-pow3-small.png)](/img/happy-tree/happy-tree-atmp3-pow3.png)

4th power:
[![Third attempt, 4th power](/img/happy-tree/happy-tree-atmp3-pow4-small.png)](/img/happy-tree/happy-tree-atmp3-pow4.png)

5th power:
[![Third attempt, 5th power](/img/happy-tree/happy-tree-atmp3-pow5-small.png)](/img/happy-tree/happy-tree-atmp3-pow5.png)

6th power:
[![Third attempt, 6th power](/img/happy-tree/happy-tree-atmp3-pow6-small.png)](/img/happy-tree/happy-tree-atmp3-pow6.png)

A couple things to note:

* 3-5 are still very blue. It's not until the 6th power that the distribution
  becomes random enough to get very colorful.

* Some powers have more coils than others. Power of 3 has a lot, and actually a
  lot of them aren't coils, but single narcissistic numbers. Narcissistic
  numbers are those which happify to themselves. `000000` and `000001` are
  narcissistic numbers in all powers; power of 3 has quite a few more.

* 4 looks super cool.

Using unsigned 64-bit integers I could theoretically go up to the power of 15.
But I hit a roadblock at the power of 7, in that there's actually a melancoil
whose members are all greater than `FFFFFF`. This means that my strategy of
repeatedly happifying until I get under `FFFFFF` doesn't work for any numbers
which lead into that coil.

All images linked to in this post are licensed under the [Do what the fuck you
want to public license](http://www.wtfpl.net/txt/copying/).

_posts/2015-3-11-rabbit-hole.md
@@ -0,0 +1,164 @@
---
layout: post
title: Rabbit Hole
---

We've begun rolling out [SkyDNS][skydns] at my job, which has been pretty neat.
We're basing a couple future projects around being able to use it, and it's made
dynamic configuration and service discovery nice and easy.

This post chronicles catching a bug because of our switch to SkyDNS, and how we
discovered its root cause. I like to call these kinds of bugs "rabbit holes";
they look shallow at first, but every time you make a little progress forward a
little more is required, until you discover the ending somewhere totally
unrelated to the start.

## The Bug

We are seeing *tons* of these in the SkyDNS log:

```
[skydns] Feb 20 17:21:15.168 INFO | no nameservers defined or name too short, can not forward
```

I fire up tcpdump to see if I can see anything interesting, and sure enough run
across a bunch of these:

```
# tcpdump -vvv -s 0 -l -n port 53
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
...
$fen_ip.50257 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25)
$fen_ip.27372 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25)
$fen_ip.35634 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25)
$fen_ip.64363 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25)
```

It appears that some of our front end nodes (FENs) are making tons of DNS
requests trying to find the A record of `unknown`. Something on our FENs is
doing something insane and is breaking.

## The FENs

Hopping over to my favorite FEN we're able to see the packets in question
leaving in a tcpdump there as well, but that's not helpful for finding the root
cause. We have lots of processes running on the FENs and any number of them
could be doing something crazy.

We fire up sysdig, which is similar to systemtap and strace in that it allows
you to hook into the kernel and view various kernel activities in real time, but
it's easier to use than both. The following command dumps all UDP packets being
sent and what process is sending them:

```
# sysdig fd.l4proto=udp
...
2528950 22:17:35.260606188 0 php-fpm (21477) < connect res=0 tuple=$fen_ip:61173->$skydns_ip:53
2528961 22:17:35.260611327 0 php-fpm (21477) > sendto fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=25 tuple=NULL
2528991 22:17:35.260631917 0 php-fpm (21477) < sendto res=25 data=.r...........unknown.....
2530470 22:17:35.261879032 0 php-fpm (21477) > ioctl fd=102(<4u>$fen_ip:61173->$skydns_ip:53) request=541B argument=7FFF82DC8728
2530472 22:17:35.261880574 0 php-fpm (21477) < ioctl res=0
2530474 22:17:35.261881226 0 php-fpm (21477) > recvfrom fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=1024
2530476 22:17:35.261883424 0 php-fpm (21477) < recvfrom res=25 data=.r...........unknown..... tuple=$skydns_ip:53->$fen_ip:61173
2530485 22:17:35.261888997 0 php-fpm (21477) > close fd=102(<4u>$fen_ip:61173->$skydns_ip:53)
2530488 22:17:35.261892626 0 php-fpm (21477) < close res=0
```

Aha! We can see php-fpm is requesting something over udp with the string
`unknown` in it. We've now narrowed down the guilty process, so the rest should
be easy, right?

## Which PHP?

Unfortunately we're a PHP shop; knowing that php-fpm is doing something on a FEN
barely narrows down the guilty codebase. Taking the FEN out of our load-balancer
stops the requests for `unknown`, so we *can* say that it's some user-facing
code that is the culprit. Our setup on the FENs involves users hitting nginx
for static content and nginx proxying PHP requests back to php-fpm. Since all
our virtual domains are defined in nginx, we are able to do something horrible.

On the particular FEN we're on we make a guess about which virtual domain the
problem is likely coming from (our main app), and proxy all traffic from all
other domains to a different FEN. We still see requests for `unknown` leaving
the box, so we've narrowed the problem down a little more.

## The Despair

Nothing in our code is doing any direct DNS calls as far as we can find, and we
don't see any places PHP might be doing it for us. We have lots of PHP
extensions in place, all written in C and all black boxes; any of them could be
the culprit. Grepping through the likely candidates' source code for the string
`unknown` proves fruitless.

We try xdebug at this point. xdebug is a profiler for php which will create
cachegrind files for the running code. With cachegrind you can see every
function which was ever called, how long was spent within each function, a full
call-graph, and lots more. Unfortunately xdebug outputs cachegrind files on a
per-php-fpm-process basis, and overwrites the previous file on each new request.
So xdebug is pretty much useless, since what is in the cachegrind file isn't
necessarily what spawned the DNS request.

## Gotcha (sorta)

We turn back to the tried and true method of dumping all the traffic using
tcpdump and perusing through it manually.

What we find is that nearly every time there is a DNS request for `unknown`, if
we scroll up a bit there is (usually) a particular request to memcache. The
requested key is always in the style of `function-name:someid:otherstuff`. When
looking in the code around that function name we find this ominous-looking call:

```php
$ipAddress = getIPAddress();
$geoipInfo = getCountryInfoFromIP($ipAddress);
```

This points us in the right direction. On a hunch we add some debug logging to
print out the `$ipAddress` variable, and sure enough it comes back as
`unknown`. AHA!

So what we surmise is happening is that our geoip extension, which we use to get
the location data of an IP address and which `getCountryInfoFromIP` calls, is
seeing something which is *not* an IP address and trying to resolve it as a
hostname instead.

## Gotcha (for real)

So the question becomes: why are we getting the string `unknown` as an IP
address?

Adding some debug logging around the area we found before showed that
`$_SERVER['REMOTE_ADDR']`, which is the variable populated with the IP address
of the client, is sometimes `unknown`. We guess that this has something to do
with some magic we are doing on nginx's side to populate `REMOTE_ADDR` with the
real IP address of the client in the case of them going through a proxy.

Many proxies send along the header `X-Forwarded-For` to indicate the real IP of
the client they're proxying for; otherwise the server would only see the proxy's
IP. In our setup I decided that in those cases we should set the `REMOTE_ADDR`
to the real client IP so our application logic doesn't even have to worry about
it. There are a couple problems with this which make it a bad decision, one
being that if some misbehaving proxy were to, say, start sending
`X-Forwarded-For: unknown`, then an application relying on that value might
mistake it to mean the client's IP is `unknown`.

## The Fix

The fix here was two-fold:

1) We now always set `$_SERVER['REMOTE_ADDR']` to be the remote address of the
request, regardless of whether it came through a proxy, and we also send the
application the `X-Forwarded-For` header to do with as it pleases.

2) Inside our app we look at all the headers sent and do some processing to
decide what the actual client IP is. PHP can handle much more complex logic
than nginx can, so we can do things like check to make sure the IP is actually
an IP, that it's not some NAT'd internal IP, and so forth.

And that's it. From some weird log messages on our DNS servers to an nginx
mis-configuration on an almost unrelated set of servers, this was one of those
strange bugs that never has a nice, obvious cause and can go unsolved for a long
time. Spending the time to dive down the rabbit hole and find the answer is
often tedious, but also often very rewarding.

[skydns]: https://github.com/skynetservices/skydns

BIN img/happy-tree/happy-tree-atmp1-small.png (381 KiB)
BIN img/happy-tree/happy-tree-atmp1.png (4.0 MiB)
BIN img/happy-tree/happy-tree-atmp2-small.png (2.3 MiB)
BIN img/happy-tree/happy-tree-atmp2.png (37 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow3-small.png (1.5 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow3.png (14 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow4-small.png (747 KiB)
BIN img/happy-tree/happy-tree-atmp3-pow4.png (6.0 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow5-small.png (1.2 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow5.png (20 MiB)
BIN img/happy-tree/happy-tree-atmp3-pow6-small.png (811 KiB)
BIN img/happy-tree/happy-tree-atmp3-pow6.png (9.7 MiB)
BIN img/happy-tree/partial.png (154 KiB)