+++
title = "Recovering from failures"
weight = 50
+++

Garage is meant to work on old, second-hand hardware.
In particular, this makes it likely that some of your drives will fail, and some manual intervention will be needed.
Fear not! For Garage is fully equipped to handle drive failures, in most common cases.

## A note on availability of Garage

With nodes dispersed in 3 zones or more, here are the guarantees Garage provides with the 3-way replication strategy (3 copies of all data, which is the recommended replication mode):

- The cluster remains fully functional as long as the machines that fail are in only one zone. This includes a whole zone going down due to power/Internet outage.
- No data is lost as long as the machines that fail are in at most two zones.

Of course this only works if your Garage nodes are correctly configured to be aware of the zone in which they are located.
Make sure this is the case using `garage status` to check on the state of your cluster's configuration.
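The two guarantees listed above can be restated as a small lookup. The sketch below is only an illustration: `zone_failure_guarantee` is a hypothetical helper, not a Garage command, and it assumes 3-way replication with nodes spread over at least 3 zones.

```bash
# Hypothetical helper (not part of Garage): maps the number of zones that
# contain failed machines to the guarantee described in this section.
# Assumes 3-way replication and nodes dispersed in 3 zones or more.
zone_failure_guarantee() {
    local failed_zones="$1"
    case "$failed_zones" in
        0|1) echo "fully functional" ;;
        2)   echo "no data lost (full functionality not guaranteed)" ;;
        *)   echo "no guarantee" ;;
    esac
}
```

For example, `zone_failure_guarantee 1` covers the case of a whole zone going down due to a power outage.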

If nodes are only temporarily disconnected, Garage should automatically re-synchronize
when they come back up. This guide covers recovery from disk failures
that cause a node to lose its data.


## First option: removing a node

If you don't have spare parts (HDD, SSD) to replace the failed component, and if there are enough remaining nodes in your cluster
(at least 3), you can simply remove the failed node from Garage's configuration.
Note that if you **do** intend to replace the failed parts with new ones, using this method followed by adding back the node is **not recommended** (although it should work),
and you should instead use one of the methods detailed in the next sections.

Removing a node is done with the following commands:

```bash
garage layout remove <node_id>
garage layout show    # review the changes you are making
garage layout apply   # once satisfied, apply the changes
```

You can get the `node_id` of the failed node by running `garage status`.

This will repartition the data and ensure that 3 copies of everything are present on the nodes that remain available.


## Replacement scenario 1: only data is lost, metadata is fine

The recommended deployment for Garage uses an SSD to store metadata, and an HDD to store blocks of data.
In the case where only a single HDD crashes, the blocks of data are lost but the metadata is still fine.

Recovering from this is easy: set up a new HDD to replace the failed one.
The node does not need to be fully replaced and the configuration doesn't need to change.
We just need to tell Garage to get back all the data blocks and store them on the new HDD.

First, set up a new HDD to store Garage's data directory on the failed node, and restart Garage using
the existing configuration.  Then, run:

```bash
garage repair -a --yes blocks
```

This will re-synchronize blocks of data that are missing to the new HDD, reading them from copies located on other nodes.

You can check on the progress of this process by running the following command:

```bash
garage stats -a
```

Look out for the following output:

```
Block manager stats:
  resync queue length: 26541
```

This indicates that one of the Garage nodes is in the process of retrieving missing data from other nodes.
This number decreases to zero when the node is fully synchronized.
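If you would rather not re-run the command by hand, the check can be scripted. The following is a minimal sketch, assuming the `resync queue length: N` line format shown above. `resync_queue_total` and `wait_for_resync` are hypothetical helper names, and the 60-second default polling interval is an arbitrary choice.

```bash
# Sum every "resync queue length" value found in the output of
# `garage stats -a`, read from stdin. Prints 0 if no such line is found.
resync_queue_total() {
    awk '/resync queue length:/ { total += $NF } END { print total + 0 }'
}

# Hypothetical wrapper: poll `garage stats -a` until all resync queues
# are empty. Optional argument: polling interval in seconds (default 60).
wait_for_resync() {
    local interval="${1:-60}"
    local stats remaining
    while true; do
        # Stop with an error if the stats cannot be fetched, so that an
        # empty output is not mistaken for an empty queue.
        if ! stats="$(garage stats -a)"; then
            echo "could not run 'garage stats -a'" >&2
            return 1
        fi
        remaining="$(printf '%s\n' "$stats" | resync_queue_total)"
        echo "blocks still queued for resync: ${remaining}"
        [ "$remaining" -eq 0 ] && break
        sleep "$interval"
    done
}
```

Calling `wait_for_resync 30` would then print the remaining queue length every 30 seconds and return once it reaches zero.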


## Replacement scenario 2: metadata (and possibly data) is lost

This scenario covers the case where a full node fails, i.e. both the metadata directory and
the data directory are lost, as well as the case where only the metadata directory is lost.

To replace the lost node, we will start from an empty metadata directory, which means
Garage will generate a new node ID for the replacement node.
We will thus need to remove the previous node ID from Garage's configuration and replace it with the ID of the new node.

If your data directory is stored on a separate drive and is still fine, you can keep it, but it is not necessary to do so.
In all cases, the data will be rebalanced and the replacement node will not store the same pieces of data
as were originally stored on the one that failed. So if you keep the data files, the rebalancing
might be faster but most of the pieces will be deleted anyway from the disk and replaced by other ones.

First, set up a new drive to store the metadata directory for the replacement node (an SSD is recommended),
and for the data directory if necessary. You can then start Garage on the new node.
The restarted node should generate a new node ID, and it should be shown with `NO ROLE ASSIGNED` in `garage status`.
The ID of the lost node should be shown in `garage status` in the section for disconnected/unavailable nodes.
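On a cluster with many nodes, it can help to filter the status output for the new node. This is a small sketch that relies only on the `NO ROLE ASSIGNED` marker mentioned above; `nodes_without_role` is a hypothetical helper name, and no assumption is made about the rest of the `garage status` output format.

```bash
# Print the lines of `garage status` output (read from stdin) that
# mention a node without an assigned role. Prints nothing if there are none.
nodes_without_role() {
    grep -F 'NO ROLE ASSIGNED' || true
}
```

It would typically be used as `garage status | nodes_without_role`. The node ID to pass to `garage layout assign` can then be read from the matching line.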

Then, replace the broken node with the new one, using:

```bash
garage layout assign <new_node_id> --replace <old_node_id> \
    -c <capacity> -z <zone> -t <node_tag>
garage layout show    # review the changes you are making
garage layout apply   # once satisfied, apply the changes
```

Garage will then start synchronizing all required data on the new node.
This process can be monitored using the `garage stats -a` command.