gq and reference-style-link everything
parent
6d80ce514e
commit
35f1708c1f
@ -1,46 +1,62 @@
|
|||||||
# Erlang, tcp sockets, and active true
|
# Erlang, tcp sockets, and active true
|
||||||
|
|
||||||
If you don't know erlang then [you're missing out](http://learnyousomeerlang.com/content).
|
If you don't know erlang then [you're missing out][0]. If you do know erlang,
|
||||||
If you do know erlang, you've probably at some point done something with tcp sockets. Erlang's
|
you've probably at some point done something with tcp sockets. Erlang's highly
|
||||||
highly concurrent model of execution lends itself well to server programs where a high number
|
concurrent model of execution lends itself well to server programs where a high
|
||||||
of active connections is desired. Each thread can autonomously handle its single client,
|
number of active connections is desired. Each thread can autonomously handle its
|
||||||
greatly simplifying the logic of the whole application while still retaining
|
single client, greatly simplifying the logic of the whole application while
|
||||||
[great performance characteristics](http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1).
|
still retaining [great performance characteristics][1].
|
||||||
|
|
||||||
# Background
|
# Background
|
||||||
|
|
||||||
For an erlang thread which owns a single socket there are three different ways to receive data
|
For an erlang thread which owns a single socket there are three different ways
|
||||||
off of that socket. These all revolve around the `active` [setopts](http://www.erlang.org/doc/man/inet.html#setopts-2)
|
to receive data off of that socket. These all revolve around the `active`
|
||||||
flag. A socket can be set to one of:
|
[setopts][2] flag. A socket can be set to one of:
|
||||||
|
|
||||||
* `{active,false}` - All data must be obtained through [recv/2](http://www.erlang.org/doc/man/gen_tcp.html#recv-2)
|
* `{active,false}` - All data must be obtained through [recv/2][3] calls. This
|
||||||
calls. This amounts to synchronous socket reading.
|
amounts to synchronous socket reading.
|
||||||
* `{active,true}` - All data on the socket gets sent to the controlling thread as a normal erlang
|
|
||||||
message. It is the thread's responsibility to keep up with the buffered data
|
* `{active,true}` - All data on the socket gets sent to the controlling thread
|
||||||
in the message queue. This amounts to asynchronous socket reading.
|
as a normal erlang message. It is the thread's
|
||||||
* `{active,once}` - When set the socket is placed in `{active,true}` for a single packet. That
|
responsibility to keep up with the buffered data in the
|
||||||
is, once set the thread can expect a single message to be sent to it when data
|
message queue. This amounts to asynchronous socket reading.
|
||||||
comes in. To receive any more data off of the socket the socket must either
|
|
||||||
be read from using [recv/2](http://www.erlang.org/doc/man/gen_tcp.html#recv-2)
|
* `{active,once}` - When set the socket is placed in `{active,true}` for a
|
||||||
or be put in `{active,once}` or `{active,true}`.
|
single packet. That is, once set the thread can expect a
|
||||||
|
single message to be sent to it when data comes in. To receive
|
||||||
|
any more data off of the socket the socket must either be
|
||||||
|
read from using [recv/2][3] or be put in `{active,once}` or
|
||||||
|
`{active,true}`.
|
||||||
|
|
||||||
# Which to use?
|
# Which to use?
|
||||||
|
|
||||||
Many (most?) tutorials advocate using `{active,once}` in your application [0][1][2]. This has to do with usability and
|
Many (most?) tutorials advocate using `{active,once}` in your application
|
||||||
security. When in `{active,true}` it's possible for a client to flood the connection faster than the receiving process
|
\[0]\[1]\[2]. This has to do with usability and security. When in `{active,true}`
|
||||||
will process those messages, potentially eating up a lot of memory in the VM. However, if you want to be able to receive
|
it's possible for a client to flood the connection faster than the receiving
|
||||||
both tcp data messages as well as other messages from other erlang processes at the same time you can't use `{active,false}`.
|
process will process those messages, potentially eating up a lot of memory in
|
||||||
So `{active,once}` is generally preferred because it deals with both of these problems quite well.
|
the VM. However, if you want to be able to receive both tcp data messages as
|
||||||
|
well as other messages from other erlang processes at the same time you can't
|
||||||
|
use `{active,false}`. So `{active,once}` is generally preferred because it
|
||||||
|
deals with both of these problems quite well.
|
||||||
|
|
||||||
# Why not to use `{active,once}`
|
# Why not to use `{active,once}`
|
||||||
|
|
||||||
Here's what your classic `{active,once}` enabled tcp socket implementation will probably look like:
|
Here's what your classic `{active,once}` enabled tcp socket implementation will
|
||||||
|
probably look like:
|
||||||
|
|
||||||
```erlang
|
```erlang
|
||||||
-module(tcp_test).
|
-module(tcp_test).
|
||||||
-compile(export_all).
|
-compile(export_all).
|
||||||
|
|
||||||
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay,true}, {active, false}, {reuseaddr, true}, {keepalive,true}, {backlog,500}]).
|
-define(TCP_OPTS, [
|
||||||
|
binary,
|
||||||
|
{packet, raw},
|
||||||
|
{nodelay,true},
|
||||||
|
{active, false},
|
||||||
|
{reuseaddr, true},
|
||||||
|
{keepalive,true},
|
||||||
|
{backlog,500}
|
||||||
|
]).
|
||||||
|
|
||||||
%Start listening
|
%Start listening
|
||||||
listen(Port) ->
|
listen(Port) ->
|
||||||
@ -66,15 +82,16 @@ read_loop(Socket) ->
|
|||||||
end.
|
end.
|
||||||
```
|
```
|
||||||
|
|
||||||
This code isn't actually usable for a production system; it doesn't even spawn a new process for the new socket. But that's not
|
This code isn't actually usable for a production system; it doesn't even spawn a
|
||||||
the point I'm making. If I run it with `tcp_test:listen(8000)`, and in another window do:
|
new process for the new socket. But that's not the point I'm making. If I run it
|
||||||
|
with `tcp_test:listen(8000)`, and in another window do:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
while [ 1 ]; do echo "aloha"; done | nc localhost 8000
|
while [ 1 ]; do echo "aloha"; done | nc localhost 8000
|
||||||
```
|
```
|
||||||
|
|
||||||
We'll be flooding the server with data pretty well. Using [eprof](http://www.erlang.org/doc/man/eprof.html) we can get an idea
|
We'll be flooding the server with data pretty well. Using [eprof][4] we can
|
||||||
of how our code performs, and where the hang-ups are:
|
get an idea of how our code performs, and where the hang-ups are:
|
||||||
|
|
||||||
```erlang
|
```erlang
|
||||||
1> eprof:start().
|
1> eprof:start().
|
||||||
@ -111,18 +128,30 @@ inet:setopts/2 12303598 5.72 4533863 [ 0.37]
|
|||||||
erlang:port_control/3 12303600 77.13 61085040 [ 4.96]
|
erlang:port_control/3 12303600 77.13 61085040 [ 4.96]
|
||||||
```
|
```
|
||||||
|
|
||||||
eprof shows us where our process is spending the majority of its time. The `%` column indicates percentage of time the process spent
|
eprof shows us where our process is spending the majority of its time. The `%`
|
||||||
during profiling inside any function. We can pretty clearly see that the vast majority of time was spent inside `erlang:port_control/3`,
|
column indicates percentage of time the process spent during profiling inside
|
||||||
the BIF that `inet:setopts/2` uses to switch the socket to `{active,once}` mode. Amongst the calls which were called on every loop,
|
any function. We can pretty clearly see that the vast majority of time was spent
|
||||||
it takes up by far the most amount of time. In addition all of those other calls are also related to `inet:setopts/2`.
|
inside `erlang:port_control/3`, the BIF that `inet:setopts/2` uses to switch the
|
||||||
|
socket to `{active,once}` mode. Amongst the calls which were called on every
|
||||||
|
loop, it takes up by far the most amount of time. In addition all of those other
|
||||||
|
calls are also related to `inet:setopts/2`.
|
||||||
|
|
||||||
I'm gonna rewrite our little listen server to use `{active,true}`, and we'll do it all again:
|
I'm gonna rewrite our little listen server to use `{active,true}`, and we'll do
|
||||||
|
it all again:
|
||||||
|
|
||||||
```erlang
|
```erlang
|
||||||
-module(tcp_test).
|
-module(tcp_test).
|
||||||
-compile(export_all).
|
-compile(export_all).
|
||||||
|
|
||||||
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay,true}, {active, false}, {reuseaddr, true}, {keepalive,true}, {backlog,500}]).
|
-define(TCP_OPTS, [
|
||||||
|
binary,
|
||||||
|
{packet, raw},
|
||||||
|
{nodelay,true},
|
||||||
|
{active, false},
|
||||||
|
{reuseaddr, true},
|
||||||
|
{keepalive,true},
|
||||||
|
{backlog,500}
|
||||||
|
]).
|
||||||
|
|
||||||
%Start listening
|
%Start listening
|
||||||
listen(Port) ->
|
listen(Port) ->
|
||||||
@ -194,20 +223,30 @@ erlang:port_control/3 3 0.00 59 [ 19.67]
|
|||||||
tcp_test:read_loop/1 20716370 100.00 12187488 [ 0.59]
|
tcp_test:read_loop/1 20716370 100.00 12187488 [ 0.59]
|
||||||
```
|
```
|
||||||
|
|
||||||
This time our process spent almost no time at all (according to eprof, 0%) fiddling with the socket opts.
|
This time our process spent almost no time at all (according to eprof, 0%)
|
||||||
Instead it spent all of its time in the read_loop doing the work we actually want to be doing.
|
fiddling with the socket opts. Instead it spent all of its time in the
|
||||||
|
read_loop doing the work we actually want to be doing.
|
||||||
|
|
||||||
# So what does this mean?
|
# So what does this mean?
|
||||||
|
|
||||||
I'm by no means advocating never using `{active,once}`. The security concern is still a completely valid concern and one
|
I'm by no means advocating never using `{active,once}`. The security concern is
|
||||||
that `{active,once}` mitigates quite well. I'm simply pointing out that this mitigation has some fairly serious performance
|
still a completely valid concern and one that `{active,once}` mitigates quite
|
||||||
implications which have the potential to bite you if you're not careful, especially in cases where a socket is going to be
|
well. I'm simply pointing out that this mitigation has some fairly serious
|
||||||
receiving a large amount of traffic.
|
performance implications which have the potential to bite you if you're not
|
||||||
|
careful, especially in cases where a socket is going to be receiving a large
|
||||||
|
amount of traffic.
|
||||||
|
|
||||||
# Meta
|
# Meta
|
||||||
|
|
||||||
These tests were done using R15B03, but I've done similar ones in R14 and found similar results. I have not tested R16.
|
These tests were done using R15B03, but I've done similar ones in R14 and found
|
||||||
|
similar results. I have not tested R16.
|
||||||
|
|
||||||
* [0] http://learnyousomeerlang.com/buckets-of-sockets
|
* \[0] http://learnyousomeerlang.com/buckets-of-sockets
|
||||||
* [1] http://www.erlang.org/doc/man/gen_tcp.html#examples
|
* \[1] http://www.erlang.org/doc/man/gen_tcp.html#examples
|
||||||
* [2] http://erlycoder.com/25/erlang-tcp-server-tcp-client-sockets-with-gen_tcp
|
* \[2] http://erlycoder.com/25/erlang-tcp-server-tcp-client-sockets-with-gen_tcp
|
||||||
|
|
||||||
|
[0]: http://learnyousomeerlang.com/content
|
||||||
|
[1]: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1
|
||||||
|
[2]: http://www.erlang.org/doc/man/inet.html#setopts-2
|
||||||
|
[3]: http://www.erlang.org/doc/man/gen_tcp.html#recv-2
|
||||||
|
[4]: http://www.erlang.org/doc/man/eprof.html
|
||||||
|
27
goplus.md
@ -1,16 +1,19 @@
|
|||||||
# Go and project root
|
# Go and project root
|
||||||
|
|
||||||
Compared to other languages go has some strange behavior regarding its project root settings. If you
|
Compared to other languages go has some strange behavior regarding its project
|
||||||
import a library called `somelib`, go will look for a `src/somelib` folder in all of the folders in
|
root settings. If you import a library called `somelib`, go will look for a
|
||||||
the `$GOPATH` environment variable. This works nicely for globally installed packages, but it makes
|
`src/somelib` folder in all of the folders in the `$GOPATH` environment
|
||||||
encapsulating a project with a specific version, or modified version, rather tedious. Whenever you go
|
variable. This works nicely for globally installed packages, but it makes
|
||||||
to work on this project you'll have to add its path to your `$GOPATH`, or add the path permanently,
|
encapsulating a project with a specific version, or modified version, rather
|
||||||
which could break other projects which may use a different version of `somelib`.
|
tedious. Whenever you go to work on this project you'll have to add its path to
|
||||||
|
your `$GOPATH`, or add the path permanently, which could break other projects
|
||||||
|
which may use a different version of `somelib`.
|
||||||
|
|
||||||
My solution is in the form of a simple script I'm calling go+. go+ will search in the current directory
|
My solution is in the form of a simple script I'm calling go+. go+ will search
|
||||||
and all of its parents for a file called `GOPROJROOT`. If it finds that file in a directory, it
|
in the current directory and all of its parents for a file called `GOPROJROOT`. If
|
||||||
prepends that directory's absolute path to your `$GOPATH` and stops the search. Regardless of whether
|
it finds that file in a directory, it prepends that directory's absolute path to
|
||||||
or not `GOPROJROOT` was found go+ will pass through all arguments to the actual go call. The
|
your `$GOPATH` and stops the search. Regardless of whether or not `GOPROJROOT`
|
||||||
|
was found go+ will pass through all arguments to the actual go call. The
|
||||||
modification to `$GOPATH` will only last the duration of the call.
|
modification to `$GOPATH` will only last the duration of the call.
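
The search described above could be sketched roughly like this. This is a minimal illustration, not the actual go+ source: the function names `find_goprojroot`, `goplus_path`, and `goplus` are made up for the example, and it assumes a POSIX shell with `dirname` available.

```shell
#!/bin/sh
# Hypothetical sketch of the GOPROJROOT search; names are illustrative only.

# find_goprojroot START_DIR
# Prints the absolute path of the nearest directory at or above START_DIR
# that contains a file named GOPROJROOT. Returns 1 if none is found.
find_goprojroot() {
    dir=$(cd "$1" && pwd -P) || return 1
    while :; do
        if [ -f "$dir/GOPROJROOT" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop once the filesystem root has been checked.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# goplus_path START_DIR
# Prints the GOPATH value to use: the project root prepended to the existing
# GOPATH if a GOPROJROOT file was found, otherwise GOPATH unchanged.
goplus_path() {
    if root=$(find_goprojroot "$1"); then
        if [ -n "${GOPATH:-}" ]; then
            printf '%s:%s\n' "$root" "$GOPATH"
        else
            printf '%s\n' "$root"
        fi
    else
        printf '%s\n' "${GOPATH:-}"
    fi
}

# goplus ARGS...
# Passes all arguments through to go. The GOPATH assignment is a command
# prefix, so it only applies for the duration of this one call.
goplus() {
    GOPATH=$(goplus_path "$PWD") go "$@"
}
```

With these definitions sourced, running `goplus build` from anywhere inside a project that has a `GOPROJROOT` file would invoke `go build` with that project's root at the front of `$GOPATH`, leaving the calling shell's `$GOPATH` untouched.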
|
||||||
|
|
||||||
As an example, consider the following:
|
As an example, consider the following:
|
||||||
@ -23,8 +26,8 @@ As an example, consider the following:
|
|||||||
/hello.go
|
/hello.go
|
||||||
```
|
```
|
||||||
|
|
||||||
If `hello.go` depends on `somelib`, as long as you run go+ from `/tmp/hello` or one of its children
|
If `hello.go` depends on `somelib`, as long as you run go+ from `/tmp/hello` or
|
||||||
your project will still compile.
|
one of its children your project will still compile.
|
||||||
|
|
||||||
Here is the source code for go+:
|
Here is the source code for go+:
|
||||||
|
|
||||||
|