gq and reference-style-link everything

Brian Picciano 2013-10-08 22:26:08 -04:00
parent 6d80ce514e
commit 35f1708c1f
2 changed files with 100 additions and 58 deletions

@@ -1,46 +1,62 @@
# Erlang, tcp sockets, and active true # Erlang, tcp sockets, and active true
If you don't know erlang then [you're missing out](http://learnyousomeerlang.com/content). If you don't know erlang then [you're missing out][0]. If you do know erlang,
If you do know erlang, you've probably at some point done something with tcp sockets. Erlang's you've probably at some point done something with tcp sockets. Erlang's highly
highly concurrent model of execution lends itself well to server programs where a high number concurrent model of execution lends itself well to server programs where a high
of active connections is desired. Each thread can autonomously handle its single client, number of active connections is desired. Each thread can autonomously handle its
greatly simplifying the logic of the whole application while still retaining single client, greatly simplifying the logic of the whole application while
[great performance characteristics](http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1). still retaining [great performance characteristics][1].
# Background # Background
For an erlang thread which owns a single socket there are three different ways to receive data For an erlang thread which owns a single socket there are three different ways
off of that socket. These all revolve around the `active` [setopts](http://www.erlang.org/doc/man/inet.html#setopts-2) to receive data off of that socket. These all revolve around the `active`
flag. A socket can be set to one of: [setopts][2] flag. A socket can be set to one of:
* `{active,false}` - All data must be obtained through [recv/2](http://www.erlang.org/doc/man/gen_tcp.html#recv-2) * `{active,false}` - All data must be obtained through [recv/2][3] calls. This
calls. This amounts to synchronous socket reading. amounts to synchronous socket reading.
* `{active,true}` - All data on the socket gets sent to the controlling thread as a normal erlang
message. It is the thread's responsibility to keep up with the buffered data * `{active,true}` - All data on the socket gets sent to the controlling thread
in the message queue. This amounts to asynchronous socket reading. as a normal erlang message. It is the thread's
* `{active,once}` - When set the socket is placed in `{active,true}` for a single packet. That responsibility to keep up with the buffered data in the
is, once set the thread can expect a single message to be sent to it when data message queue. This amounts to asynchronous socket reading.
comes in. To receive any more data off of the socket the socket must either
be read from using [recv/2](http://www.erlang.org/doc/man/gen_tcp.html#recv-2) * `{active,once}` - When set the socket is placed in `{active,true}` for a
or be put in `{active,once}` or `{active,true}`. single packet. That is, once set the thread can expect a
single message to be sent to it when data comes in. To receive
any more data off of the socket the socket must either be
read from using [recv/2][3] or be put in `{active,once}` or
`{active,true}`.
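
To make the three modes concrete, here is a minimal sketch that reads one chunk in each mode over a loopback connection. The module name `active_modes`, the `demo/0` function and the payloads are made up for illustration; only the `gen_tcp` and `inet` calls are real. It also assumes each small write arrives as a single chunk, which raw TCP doesn't strictly guarantee.

```erlang
-module(active_modes).
-export([demo/0]).

%% Hypothetical demo: listens on an OS-assigned port, connects a client
%% to it, and reads one chunk from the accepted socket in each mode.
demo() ->
    Opts = [binary, {packet, raw}, {active, false}],
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    %% The process calling accept/1 becomes the socket's controlling
    %% process, so active-mode messages are delivered to self().
    {ok, Server} = gen_tcp:accept(Listen),

    %% {active,false}: data is pulled explicitly with recv/2.
    ok = gen_tcp:send(Client, <<"one">>),
    {ok, Passive} = gen_tcp:recv(Server, 0),

    %% {active,once}: one message is delivered, then the socket drops
    %% back to {active,false} until it is re-armed.
    ok = gen_tcp:send(Client, <<"two">>),
    ok = inet:setopts(Server, [{active, once}]),
    Once = receive {tcp, Server, D1} -> D1 after 1000 -> timeout end,

    %% {active,true}: every chunk arrives as a message, no re-arming.
    ok = inet:setopts(Server, [{active, true}]),
    ok = gen_tcp:send(Client, <<"three">>),
    Active = receive {tcp, Server, D2} -> D2 after 1000 -> timeout end,

    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Server),
    ok = gen_tcp:close(Listen),
    [{passive, Passive}, {once, Once}, {active, Active}].
```

Calling `active_modes:demo()` from a shell returns the three chunks tagged with the mode that delivered them.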
# Which to use? # Which to use?
Many (most?) tutorials advocate using `{active,once}` in your application [0][1][2]. This has to do with usability and Many (most?) tutorials advocate using `{active,once}` in your application
security. When in `{active,true}` it's possible for a client to flood the connection faster than the receiving process \[0]\[1]\[2]. This has to do with usability and security. When in `{active,true}`
will process those messages, potentially eating up a lot of memory in the VM. However, if you want to be able to receive it's possible for a client to flood the connection faster than the receiving
both tcp data messages as well as other messages from other erlang processes at the same time you can't use `{active,false}`. process will process those messages, potentially eating up a lot of memory in
So `{active,once}` is generally preferred because it deals with both of these problems quite well. the VM. However, if you want to be able to receive both tcp data messages as
well as other messages from other erlang processes at the same time you can't
use `{active,false}`. So `{active,once}` is generally preferred because it
deals with both of these problems quite well.
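
As a rough sketch of that trade-off (this is not the listing from this post), an `{active,once}` loop re-arms the socket before every receive, so at most one packet waits in the mailbox at a time while messages from other processes can still be handled. The module name `once_loop`, the `Handle` callback and the `{note, Msg}` message shape are hypothetical.

```erlang
-module(once_loop).
-export([loop/2]).

%% Hypothetical receive loop. Handle is a fun(Event) supplied by the
%% caller; Event is {data, Binary} or {note, Term}.
loop(Socket, Handle) ->
    %% Re-arm the socket: ask for exactly one more tcp message.
    %% This is one inet:setopts/2 call per pass through the loop.
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Data} ->
            Handle({data, Data}),
            loop(Socket, Handle);
        {tcp_closed, Socket} ->
            closed;
        {tcp_error, Socket, Reason} ->
            {error, Reason};
        %% A message from some other erlang process, interleaved
        %% with socket data in the same mailbox.
        {note, Msg} ->
            Handle({note, Msg}),
            loop(Socket, Handle)
    end.
```

The process calling `once_loop:loop/2` has to be the socket's controlling process, otherwise the `{tcp, ...}` messages go elsewhere.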
# Why not to use `{active,once}` # Why not to use `{active,once}`
Here's what your classic `{active,once}` enabled tcp socket implementation will probably look like: Here's what your classic `{active,once}` enabled tcp socket implementation will
probably look like:
```erlang ```erlang
-module(tcp_test). -module(tcp_test).
-compile(export_all). -compile(export_all).
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay,true}, {active, false}, {reuseaddr, true}, {keepalive,true}, {backlog,500}]). -define(TCP_OPTS, [
binary,
{packet, raw},
{nodelay,true},
{active, false},
{reuseaddr, true},
{keepalive,true},
{backlog,500}
]).
%Start listening %Start listening
listen(Port) -> listen(Port) ->
@@ -66,15 +82,16 @@ read_loop(Socket) ->
end. end.
``` ```
This code isn't actually usable for a production system; it doesn't even spawn a new process for the new socket. But that's not This code isn't actually usable for a production system; it doesn't even spawn a
the point I'm making. If I run it with `tcp_test:listen(8000)`, and in another window do: new process for the new socket. But that's not the point I'm making. If I run it
with `tcp_test:listen(8000)`, and in another window do:
```bash ```bash
while [ 1 ]; do echo "aloha"; done | nc localhost 8000 while [ 1 ]; do echo "aloha"; done | nc localhost 8000
``` ```
We'll be flooding the server with data pretty well. Using [eprof](http://www.erlang.org/doc/man/eprof.html) we can get an idea We'll be flooding the server with data pretty well. Using [eprof][4] we can
of how our code performs, and where the hang-ups are: get an idea of how our code performs, and where the hang-ups are:
```erlang ```erlang
1> eprof:start(). 1> eprof:start().
@@ -111,18 +128,30 @@ inet:setopts/2 12303598 5.72 4533863 [ 0.37]
erlang:port_control/3 12303600 77.13 61085040 [ 4.96] erlang:port_control/3 12303600 77.13 61085040 [ 4.96]
``` ```
eprof shows us where our process is spending the majority of its time. The `%` column indicates percentage of time the process spent eprof shows us where our process is spending the majority of its time. The `%`
during profiling inside any function. We can pretty clearly see that the vast majority of time was spent inside `erlang:port_control/3`, column indicates percentage of time the process spent during profiling inside
the BIF that `inet:setopts/2` uses to switch the socket to `{active,once}` mode. Amongst the calls which were called on every loop, any function. We can pretty clearly see that the vast majority of time was spent
it takes up by far the most amount of time. In addition all of those other calls are also related to `inet:setopts/2`. inside `erlang:port_control/3`, the BIF that `inet:setopts/2` uses to switch the
socket to `{active,once}` mode. Amongst the calls which were called on every
loop, it takes up by far the most amount of time. In addition all of those other
calls are also related to `inet:setopts/2`.
I'm gonna rewrite our little listen server to use `{active,true}`, and we'll do it all again: I'm gonna rewrite our little listen server to use `{active,true}`, and we'll do
it all again:
```erlang ```erlang
-module(tcp_test). -module(tcp_test).
-compile(export_all). -compile(export_all).
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay,true}, {active, false}, {reuseaddr, true}, {keepalive,true}, {backlog,500}]). -define(TCP_OPTS, [
binary,
{packet, raw},
{nodelay,true},
{active, false},
{reuseaddr, true},
{keepalive,true},
{backlog,500}
]).
%Start listening %Start listening
listen(Port) -> listen(Port) ->
@@ -194,20 +223,30 @@ erlang:port_control/3 3 0.00 59 [ 19.67]
tcp_test:read_loop/1 20716370 100.00 12187488 [ 0.59] tcp_test:read_loop/1 20716370 100.00 12187488 [ 0.59]
``` ```
This time our process spent almost no time at all (according to eprof, 0%) fiddling with the socket opts. This time our process spent almost no time at all (according to eprof, 0%)
Instead it spent all of its time in the read_loop doing the work we actually want to be doing. fiddling with the socket opts. Instead it spent all of its time in the
read_loop doing the work we actually want to be doing.
# So what does this mean? # So what does this mean?
I'm by no means advocating never using `{active,once}`. The security concern is still a completely valid concern and one I'm by no means advocating never using `{active,once}`. The security concern is
that `{active,once}` mitigates quite well. I'm simply pointing out that this mitigation has some fairly serious performance still a completely valid concern and one that `{active,once}` mitigates quite
implications which have the potential to bite you if you're not careful, especially in cases where a socket is going to be well. I'm simply pointing out that this mitigation has some fairly serious
receiving a large amount of traffic. performance implications which have the potential to bite you if you're not
careful, especially in cases where a socket is going to be receiving a large
amount of traffic.
# Meta # Meta
These tests were done using R15B03, but I've done similar ones in R14 and found similar results. I have not tested R16. These tests were done using R15B03, but I've done similar ones in R14 and found
similar results. I have not tested R16.
* [0] http://learnyousomeerlang.com/buckets-of-sockets * \[0] http://learnyousomeerlang.com/buckets-of-sockets
* [1] http://www.erlang.org/doc/man/gen_tcp.html#examples * \[1] http://www.erlang.org/doc/man/gen_tcp.html#examples
* [2] http://erlycoder.com/25/erlang-tcp-server-tcp-client-sockets-with-gen_tcp * \[2] http://erlycoder.com/25/erlang-tcp-server-tcp-client-sockets-with-gen_tcp
[0]: http://learnyousomeerlang.com/content
[1]: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1
[2]: http://www.erlang.org/doc/man/inet.html#setopts-2
[3]: http://www.erlang.org/doc/man/gen_tcp.html#recv-2
[4]: http://www.erlang.org/doc/man/eprof.html

@@ -1,16 +1,19 @@
# Go and project root # Go and project root
Compared to other languages go has some strange behavior regarding its project root settings. If you Compared to other languages go has some strange behavior regarding its project
import a library called `somelib`, go will look for a `src/somelib` folder in all of the folders in root settings. If you import a library called `somelib`, go will look for a
the `$GOPATH` environment variable. This works nicely for globally installed packages, but it makes `src/somelib` folder in all of the folders in the `$GOPATH` environment
encapsulating a project with a specific version, or modified version, rather tedious. Whenever you go variable. This works nicely for globally installed packages, but it makes
to work on this project you'll have to add its path to your `$GOPATH`, or add the path permanently, encapsulating a project with a specific version, or modified version, rather
which could break other projects which may use a different version of `somelib`. tedious. Whenever you go to work on this project you'll have to add its path to
your `$GOPATH`, or add the path permanently, which could break other projects
which may use a different version of `somelib`.
My solution is in the form of a simple script I'm calling go+. go+ will search in the current directory My solution is in the form of a simple script I'm calling go+. go+ will search
and all of its parents for a file called `GOPROJROOT`. If it finds that file in a directory, it in the current directory and all of its parents for a file called `GOPROJROOT`. If
prepends that directory's absolute path to your `$GOPATH` and stops the search. Regardless of whether it finds that file in a directory, it prepends that directory's absolute path to
or not `GOPROJROOT` was found go+ will passthrough all arguments to the actual go call. The your `$GOPATH` and stops the search. Regardless of whether or not `GOPROJROOT`
was found go+ will passthrough all arguments to the actual go call. The
modification to `$GOPATH` will only last the duration of the call. modification to `$GOPATH` will only last the duration of the call.
As an example, consider the following: As an example, consider the following:
@@ -23,8 +26,8 @@ As an example, consider the following:
/hello.go /hello.go
``` ```
If `hello.go` depends on `somelib`, as long as you run go+ from `/tmp/hello` or one of its children If `hello.go` depends on `somelib`, as long as you run go+ from `/tmp/hello` or
your project will still compile one of its children your project will still compile
Here is the source code for go+: Here is the source code for go+: