Documentation fixes (typo and small reorganization)

Alex Auvolat 2021-05-31 23:55:51 +02:00
parent 14fd3df654
commit 42f692b1e0
5 changed files with 97 additions and 55 deletions

View File

@ -12,16 +12,19 @@ parameters:
- An **API access key** and its associated **secret key**. These usually look something
  like this: `GK3515373e4c851ebaad366558` (access key),
  `7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34` (secret key).
  These keys are created and managed using the `garage` CLI, as explained in the
  [quick start](../quick_start/index.md) guide.

Most S3 clients can be configured easily with these parameters,
provided that you follow these guidelines:

- **Force path style:** Garage does not support DNS-style buckets, which are now the default
  on Amazon S3. Instead, Garage uses the legacy path-style bucket addressing.
  Remember to configure your client to acknowledge this fact.

- **Configuring the S3 region:** Garage requires your client to talk to the correct "S3 region",
  which is set in the configuration file. This is often set just to `garage`.
  If this is not configured explicitly, clients usually try to talk to region `us-east-1`.
  Garage should normally redirect your client to the correct region,
  but in case your client does not support this you might have to configure it manually.
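For example, with the `awscli` command-line client, these guidelines translate to something like the following sketch (the endpoint URL and port `3900` are assumptions based on a default quick-start setup; the keys are the sample values shown above):

```
# Sketch for the awscli client; adapt the keys, region and endpoint to your deployment.
aws configure set aws_access_key_id GK3515373e4c851ebaad366558
aws configure set aws_secret_access_key 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34
aws configure set region garage
# Force path-style bucket addressing, as required by Garage.
aws configure set default.s3.addressing_style path
# List buckets through Garage (port 3900 is an assumption, check the S3 API bind address in your configuration).
aws s3 ls --endpoint-url http://localhost:3900
```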

View File

@ -5,7 +5,7 @@ Similarly, Garage's cookbook contains a collection of recipes that are known to
This chapter could also be referred to as "Tutorials" or "Best practices".

- **[Deploying Garage](real_world.md):** This page will walk you through all of the necessary
  steps to deploy Garage in a real-world setting.

- **[Configuring S3 clients](clients.md):** This page will explain how to configure
  popular S3 clients to interact with a Garage server.

View File

@ -2,11 +2,46 @@
To run Garage in cluster mode, we recommend having at least 3 nodes.
This will allow you to set up Garage for three-way replication of your data,
the safest and most available mode proposed by Garage.

We recommend first following the [quick start guide](../quick_start/index.md) in order
to get familiar with Garage's command line and usage patterns.
## Prerequisites
To run a real-world deployment, make sure the following conditions are met:
- You have at least three machines with sufficient storage space available.
- Each machine has a public IP address which is reachable by other machines.
Running behind a NAT is possible, but having several Garage nodes behind a single NAT
is slightly more involved, as each node will need a different RPC port number
(the local port number of a node must be the same as the port number exposed publicly
by the NAT).
- Ideally, each machine should have an SSD available in addition to the HDD you are dedicating
to Garage. This will allow for faster access to metadata and has the potential
to drastically reduce Garage's response times.
- This guide will assume you are using Docker containers to deploy Garage on each node.
Garage can also be run independently, for instance as a [Systemd service](systemd.md).
You can also use an orchestrator such as Nomad or Kubernetes to automatically manage
Docker containers on a fleet of nodes.
Before deploying Garage on your infrastructure, you must inventory your machines.
For our example, we will suppose the following infrastructure with IPv6 connectivity:
| Location | Name | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris    | Mercury | fc00:1::1  | 1 TB       |
| Paris    | Venus   | fc00:1::2  | 2 TB       |
| London   | Earth   | fc00:B::1  | 2 TB       |
| Brussels | Mars    | fc00:F::1  | 1.5 TB     |
## Get a Docker image

Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated).
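If needed, the image can be pulled ahead of time on each node (the `v0.3.0` tag is the one used in the `docker run` command later in this guide):

```
docker pull lxpz/garage_amd64:v0.3.0
```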
@ -35,43 +70,26 @@ chmod +x genkeys.sh
```

It will create a folder named `pki/` containing the keys that you will use for the cluster.
These files will have to be copied to all of your cluster nodes, as explained below.
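As a sketch, assuming SSH access from the machine where the keys were generated, and using the host names of the example inventory above as placeholders:

```
# Copy the three required files to each node (may require root on the target machines).
for host in mercury venus earth mars; do
  ssh "$host" mkdir -p /etc/garage/pki
  scp pki/garage-ca.crt pki/garage.crt pki/garage.key "$host":/etc/garage/pki/
done
```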
## Deploying and configuring Garage
On each machine, we will have a similar setup;
in particular you must consider the following folders/files:

- `/etc/garage/garage.toml`: Garage daemon's configuration (see below)
- `/etc/garage/pki/`: Folder containing Garage certificates,
  must be generated on your computer and copied to the servers.
  Only the files `garage-ca.crt`, `garage.crt` and `garage.key` are necessary.
- `/var/lib/garage/meta/`: Folder containing Garage's metadata,
  put this folder on an SSD if possible
- `/var/lib/garage/data/`: Folder containing Garage's data,
  this folder will be your main data storage and must be on a large volume (e.g. a large HDD)
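As a minimal sketch, this layout can be prepared on each node before starting the daemon (run as root; the paths are exactly those listed above):

```
mkdir -p /etc/garage/pki /var/lib/garage/meta /var/lib/garage/data
```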
A valid `/etc/garage/garage.toml` for our cluster would be:
@ -128,14 +146,14 @@ docker run \
lxpz/garage_amd64:v0.3.0
```

It should be restarted automatically at each reboot.
Please note that we use host networking as otherwise Docker containers
cannot communicate over IPv6.

Upgrading between Garage versions should be supported transparently,
but please check the release notes before doing so!
To upgrade, simply stop and remove this container and
run the command again with a new version of Garage.
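As a sketch, assuming the container was started with `--name garaged` (the container name and the `v0.4.0` tag below are placeholders for illustration only):

```
docker stop garaged
docker rm garaged
# Then re-run the `docker run` command shown above, replacing the image tag,
# e.g. lxpz/garage_amd64:v0.4.0 instead of v0.3.0.
```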
## Controlling the daemon
@ -166,7 +184,7 @@ You will now have a shell where the Garage binary is available as `/garage/garag
-h, --rpc-host <rpc-host>
```

The first three are certificates and keys needed by TLS; the last one is simply the address of Garage's RPC endpoint.

If you are invoking `garage` from a server node directly, you do not need to set `--rpc-host`
as the default value `127.0.0.1:3901` will allow it to contact Garage correctly.
@ -196,18 +214,19 @@ You should get something like that as result:
```
Healthy nodes:
8781c50c410a41b3… Mercury [fc00:1::1]:3901 UNCONFIGURED/REMOVED
2a638ed6c775b69a… Venus [fc00:1::2]:3901 UNCONFIGURED/REMOVED
68143d720f20c89d… Earth [fc00:B::1]:3901 UNCONFIGURED/REMOVED
212f7572f0c89da9… Mars [fc00:F::1]:3901 UNCONFIGURED/REMOVED
```
## Configuring a cluster

We will now inform Garage of the disk space available on each node of the cluster
as well as the zone (e.g. datacenter) in which each machine is located.

For our example, we will suppose we have the following infrastructure
(Capacity, Identifier and Zone are values specific to Garage, described in the following):

| Location | Name    | Disk Space | `Capacity` | `Identifier` | `Zone` |
|----------|---------|------------|------------|--------------|--------------|
@ -218,7 +237,7 @@ For our example, we will suppose we have the following infrastructure (Capacity,
#### Node identifiers

After its first launch, Garage generates a random and unique identifier for each node, such as:

```
8781c50c410a41b363167e9d49cc468b6b9e4449b6577b64f15a249a149bdcbc
@ -233,12 +252,13 @@ The most simple way to match an identifier to a node is to run:
garagectl status
```

It will display the IP address associated with each node;
from the IP address you will be able to recognize the node.
#### Zones

Zones are simply user-chosen identifiers that identify groups of servers that are grouped together logically.
It is up to the system administrator deploying Garage to decide what "grouped together" means.
In most cases, a zone will correspond to a geographical location (i.e. a datacenter).

Behind the scenes, Garage will use the zone definitions to try to store copies of the same data in different zones,
@ -246,13 +266,15 @@ in order to provide high availability despite failure of a zone.
#### Capacity

Garage reasons on an abstract metric about disk storage that is named the *capacity* of a node.
The capacity configured in Garage must be proportional to the disk space dedicated to the node.

Due to the way the Garage allocation algorithm works, capacity values must
be **integers**, and must be **as small as possible**, for instance with
1 representing the size of your smallest server.

Here we chose that 1 unit of capacity = 0.5 TB, so that we can express servers of size
1 TB and 2 TB, as well as the intermediate size 1.5 TB, with the integer values 2, 4 and
3 respectively (see table above).

Note that the amount of data stored by Garage on each server may not be strictly proportional to
its capacity value, as Garage will prioritize having 3 copies of data in different zones,
@ -270,3 +292,14 @@ garagectl node configure -z par1 -c 4 -t venus 2a638e
garagectl node configure -z lon1 -c 4 -t earth 68143d
garagectl node configure -z bru1 -c 3 -t mars 212f75
```
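Once the four nodes have been configured, the resulting layout can be checked with the same command used earlier:

```
garagectl status
```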
## Using your Garage cluster
Creating buckets and managing keys is done using the `garagectl` CLI,
and is covered in the [quick start guide](../quick_start/index.md).
Remember also that the CLI is self-documented thanks to the `--help` flag and
the `help` subcommand (e.g. `garage help`, `garage key --help`).
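For illustration only, a first session could look like the sketch below; the bucket and key names are placeholders, and the exact subcommands and flags should be double-checked against `garage help` for your version:

```
garage key new --name example-app-key
garage bucket create example-bucket
garage bucket allow --read --write example-bucket --key example-app-key
```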
Configuring an S3 client to interact with Garage is covered
[in the next section](clients.md).

View File

@ -6,14 +6,18 @@ Fear not! For Garage is fully equipped to handle drive failures, in most common
## A note on availability of Garage

With nodes dispersed in 3 zones or more, here are the guarantees Garage provides with the 3-way replication strategy (3 copies of all data, which is the recommended replication mode):

- The cluster remains fully functional as long as the machines that fail are in only one zone. This includes a whole zone going down due to power/Internet outage.
- No data is lost as long as the machines that fail are in at most two zones.

Of course this only works if your Garage nodes are correctly configured to be aware of the zone in which they are located.
Make sure this is the case using `garage status` to check on the state of your cluster's configuration.
In case of temporarily disconnected nodes, Garage should automatically re-synchronize
when the nodes come back up. This guide will deal with recovering from disk failures
that caused the loss of the data of a node.
## First option: removing a node

View File

@ -40,7 +40,9 @@ replication_mode = "none"
rpc_bind_addr = "[::]:3901"
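
# Peers to contact at startup, given as "ip:port" strings;
# in this single-node example the node simply points at its own local RPC port.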
bootstrap_peers = [
  "127.0.0.1:3901",
]

[s3_api]
s3_region = "garage"