139 Commits

Author SHA1 Message Date
Alex Auvolat
605a630333
Use streaming in block manager 2022-07-29 12:25:02 +02:00
Alex Auvolat
a35d4da721
update netapp to 0.5 2022-07-29 12:25:02 +02:00
Alex Auvolat
8e7e680afe
First adaptation to WIP netapp with streaming body 2022-07-29 12:25:02 +02:00
Alex
4f38cadf6e Background task manager (#332)
- [x] New background worker trait
- [x] Adapt all current workers to use new API
- [x] Command to list currently running workers, and whether they are active, idle, or dead
- [x] Error reporting
- Optimizations
  - [x] Merkle updater: several items per iteration
  - [ ] Use `tokio::task::spawn_blocking` where appropriate so that CPU-intensive tasks don't block other things going on
- scrub:
  - [x] have only one worker with a channel to start/pause/cancel
  - [x] automatic scrub
  - [x] ability to view and change tranquility from CLI
  - [x] persistence of some information
- [ ] Testing

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/332
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-07-08 13:30:26 +02:00
Alex
382e74c798 First version of admin API (#298)
**Spec:**

- [x] Start writing
- [x] Specify all layout endpoints
- [x] Specify all endpoints for operations on keys
- [x] Specify all endpoints for operations on key/bucket permissions
- [x] Specify all endpoints for operations on buckets
- [x] Specify all endpoints for operations on bucket aliases

View rendered spec at <https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/admin-api/doc/drafts/admin-api.md>

**Code:**

- [x] Refactor code for admin api to use common api code that was created for K2V

**General endpoints:**

- [x] Metrics
- [x] GetClusterStatus
- [x] ConnectClusterNodes
- [x] GetClusterLayout
- [x] UpdateClusterLayout
- [x] ApplyClusterLayout
- [x] RevertClusterLayout

**Key-related endpoints:**

- [x] ListKeys
- [x] CreateKey
- [x] ImportKey
- [x] GetKeyInfo
- [x] UpdateKey
- [x] DeleteKey

**Bucket-related endpoints:**

- [x] ListBuckets
- [x] CreateBucket
- [x] GetBucketInfo
- [x] DeleteBucket
- [x] PutBucketWebsite
- [x] DeleteBucketWebsite

**Operations on key/bucket permissions:**

- [x] BucketAllowKey
- [x] BucketDenyKey

**Operations on bucket aliases:**

- [x] GlobalAliasBucket
- [x] GlobalUnaliasBucket
- [x] LocalAliasBucket
- [x] LocalUnaliasBucket

**And also:**

- [x] Separate error type for the admin API (this PR includes a quite big refactoring of error handling)
- [x] Add management of website access
- [ ] Check that nothing is missing wrt what can be done using the CLI
- [ ] Improve formatting of the spec
- [x] Make sure everyone is cool with the API design

Fix #231
Fix #295

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/298
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-24 12:16:39 +02:00
Alex
5768bf3622 First implementation of K2V (#293)
**Specification:**

View spec at [this URL](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)

- [x] Specify the structure of K2V triples
- [x] Specify the DVVS format used for causality detection
- [x] Specify the K2V index (just a counter of number of values per partition key)
- [x] Specify single-item endpoints: ReadItem, InsertItem, DeleteItem
- [x] Specify index endpoint: ReadIndex
- [x] Specify multi-item endpoints: InsertBatch, ReadBatch, DeleteBatch
- [x] Move to JSON objects instead of tuples
- [x] Specify endpoints for polling for updates on single values (PollItem)

**Implementation:**

- [x] Table for K2V items, causal contexts
- [x] Indexing mechanism and table for K2V index
- [x] Make API handlers a bit more generic
- [x] K2V API endpoint
- [x] K2V API router
- [x] ReadItem
- [x] InsertItem
- [x] DeleteItem
- [x] PollItem
- [x] ReadIndex
- [x] InsertBatch
- [x] ReadBatch
- [x] DeleteBatch

**Testing:**

- [x] Just a simple Python script that does some requests to check visually that things are going right (does not contain parsing of results or assertions on returned values)
- [x] Actual tests:
  - [x] Adapt testing framework
  - [x] Simple test with InsertItem + ReadItem
  - [x] Test with several Insert/Read/DeleteItem + ReadIndex
  - [x] Test all combinations of return formats for ReadItem
  - [x] Test with ReadBatch, InsertBatch, DeleteBatch
  - [x] Test with PollItem
  - [x] Test error codes
- [ ] Fix most broken stuff
  - [x] test PollItem broken randomly
  - [x] when invalid causality tokens are given, errors should be 4xx not 5xx

**Improvements:**

- [x] Descending range queries
  - [x] Specify
  - [x] Implement
  - [x] Add test
- [x] Batch updates to index counter
- [x] Put K2V behind `k2v` feature flag

Co-authored-by: Alex Auvolat <alex@adnab.me>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/293
Co-authored-by: Alex <alex@adnab.me>
Co-committed-by: Alex <alex@adnab.me>
2022-05-10 13:16:57 +02:00
Alex Auvolat
def78c5e6f
Update netapp to 0.4.4, fix #300 2022-05-09 12:08:47 +02:00
Alex Auvolat
94f1e48fff Update to netapp 0.4.2 (a tiny fix) 2022-04-07 11:50:03 +02:00
Alex Auvolat
9d0ed78887 Add feature flag for Kubernetes discovery 2022-03-24 16:57:43 +01:00
Alex Auvolat
509d256c58
Make layout optimization work in relative terms 2022-03-24 15:27:14 +01:00
Alex Auvolat
7e0e2ffda2
Slight change and add comment to layout assignation algo 2022-03-24 15:27:13 +01:00
Alex Auvolat
413ab0eaed
Small change to partition assignation algorithm
This change helps ensure that nodes for each partition are spread
over all datacenters, a property that wasn't ensured previously
when going from a 2 DC deployment to a 3 DC deployment
2022-03-24 15:27:10 +01:00
Alex Auvolat
db46cdef79
Update netapp to v0.4.1 2022-03-15 17:09:57 +01:00
Alex Auvolat
ba6b56ae68
Fix some new clippy lints 2022-03-14 12:27:49 +01:00
Alex Auvolat
2377a92f6b
Add wrapper over sled tree to count items (used for big queues) 2022-03-14 10:54:25 +01:00
Alex Auvolat
203e8d2c34
Bump version to 0.7 because of incompatible Netapp 2022-03-14 10:54:24 +01:00
Alex Auvolat
f869ca625d
Add spans to table calls, change span names in RPC 2022-03-14 10:54:12 +01:00
Alex Auvolat
0cc31ee169
add missing netapp telemetry feature 2022-03-14 10:54:11 +01:00
Alex Auvolat
dc8d0496cc
Refactoring: rename config files, make modifications less invasive 2022-03-14 10:53:51 +01:00
Alex Auvolat
2a5609b292
Add metrics to API endpoint 2022-03-14 10:53:36 +01:00
Alex Auvolat
818daa5c78
Refactor how durations are measured 2022-03-14 10:53:35 +01:00
Alex Auvolat
bb04d94fa9
Update to Netapp 0.4 which supports distributed tracing 2022-03-14 10:52:30 +01:00
Alex Auvolat
8c2fb0c066
Add tracing integration with opentelemetry 2022-03-14 10:52:13 +01:00
Alex Auvolat
2cab84b1fe
Add many metrics in table/ and rpc/ 2022-03-14 10:51:50 +01:00
Max Audron
9d44127245
add support for kubernetes service discovery
This commit adds support to discover garage instances running in
kubernetes.

Once enabled by setting `kubernetes_namespace` and
`kubernetes_service_name`, garage will create Custom Resources
`garagenodes.deuxfleurs.fr` with the node's public key as the resource
name and IP and port information as spec, in the namespace configured by
`kubernetes_namespace`.

For discovering nodes the resources are filtered with the optionally set
`kubernetes_service_name` which sets a label
`garage.deuxfleurs.fr/service` on the resources.

This allows separating multiple garage deployments in a single
namespace.

The `kubernetes_skip_crd` variable disables the creation of the
CRD by garage itself. The user must then deploy it manually.
2022-03-12 13:05:52 +01:00
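
The label-based filtering described in this commit message can be sketched as follows. This is only an illustration on plain dictionaries, not Garage's code: the resource layout and the `address` and `port` spec field names are assumptions, and only the label key `garage.deuxfleurs.fr/service` comes from the message itself.

```python
# Label key taken from the commit message above.
SERVICE_LABEL = "garage.deuxfleurs.fr/service"


def discover_nodes(resources, service_name=None):
    """Return (public_key, address, port) for each matching resource.

    `resources` is a list of dicts shaped like Kubernetes objects
    (hypothetical layout). If `service_name` is None, no filter is applied.
    """
    peers = []
    for res in resources:
        labels = res["metadata"].get("labels", {})
        if service_name is not None and labels.get(SERVICE_LABEL) != service_name:
            continue  # belongs to another deployment in the same namespace
        spec = res["spec"]
        # The resource name is the node's public key, as stated in the message.
        peers.append((res["metadata"]["name"], spec["address"], spec["port"]))
    return peers


# Two deployments sharing one namespace (made-up example values).
resources = [
    {"metadata": {"name": "pubkey1", "labels": {SERVICE_LABEL: "garage-a"}},
     "spec": {"address": "10.0.0.1", "port": 3901}},
    {"metadata": {"name": "pubkey2", "labels": {SERVICE_LABEL: "garage-b"}},
     "spec": {"address": "10.0.0.2", "port": 3901}},
]

peers_a = discover_nodes(resources, "garage-a")
```

With the sample data, `peers_a` contains only the node labelled `garage-a`, which is how two deployments could coexist in one namespace.
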
Alex Auvolat
beeef4758e
Some movement of helper code and refactoring of error handling 2022-01-04 12:52:46 +01:00
Alex Auvolat
5b1117e582
New model for buckets 2022-01-04 12:45:46 +01:00
Alex Auvolat
c94406f428
Improve how node roles are assigned in Garage
- change the terminology: the network configuration becomes the role
  table, the configuration of a node becomes a node's role
- the modification of the role table takes place in two steps: first,
  changes are staged in a CRDT data structure. Then, once the user is
  happy with the changes, they can commit them all at once (or revert
  them).
- update documentation
- fix tests
- implement smarter partition assignation algorithm

This patch breaks the format of the network configuration: when
migrating, the cluster will be in a state where no roles are assigned.
All roles must be re-assigned and committed at once. This migration
should not pose an issue.
2021-11-16 16:05:53 +01:00
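
The two-step workflow from this commit message (stage changes, then commit or revert them all at once) can be illustrated with a minimal sketch. It uses a plain dictionary rather than the CRDT data structure the message mentions, and the `RoleTable` class, its method names, and the role fields are all hypothetical.

```python
class RoleTable:
    """Toy role table with staged changes (not a CRDT, illustration only)."""

    def __init__(self):
        self.roles = {}   # committed roles: node id -> role
        self.staged = {}  # pending changes: node id -> role, None means removal

    def stage(self, node_id, role):
        """Record a change without applying it yet."""
        self.staged[node_id] = role

    def commit(self):
        """Apply all staged changes at once."""
        for node_id, role in self.staged.items():
            if role is None:
                self.roles.pop(node_id, None)
            else:
                self.roles[node_id] = role
        self.staged.clear()

    def revert(self):
        """Discard all staged changes."""
        self.staged.clear()


table = RoleTable()
table.stage("node1", {"zone": "dc1", "capacity": 10})
table.stage("node2", {"zone": "dc2", "capacity": 10})
table.commit()           # both roles become active together
table.stage("node1", None)
table.revert()           # the staged removal is dropped, node1 keeps its role
```
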
Alex Auvolat
e8811f7c9d
Request strategy: don't launch all 3 requests if not needed 2021-11-04 16:19:27 +01:00
Alex Auvolat
6f13d083ab
Add semaphore to limit RAM used by buffered outgoing requests 2021-11-03 18:02:57 +01:00
Alex Auvolat
8c4f418fe8
Fix peer list persistence: do not forget previous peers 2021-11-03 17:34:44 +01:00
Alex Auvolat
43e13a501d
Use published netapp crate instead of git repo 2021-10-26 10:36:57 +02:00
Alex Auvolat
ada7899b24
Fix clippy lints (fix #121) 2021-10-26 10:20:05 +02:00
Alex Auvolat
de4276202a
Improve CLI, adapt tests, update documentation 2021-10-25 14:21:48 +02:00
Alex Auvolat
1b450c4b49
Improvements to CLI and various fixes for netapp version
Discovery via consul, persist peer list to file
2021-10-22 16:55:24 +02:00
Alex Auvolat
4067797d01
First port of Garage to Netapp 2021-10-22 15:55:18 +02:00
Alex Auvolat
fa394dcd27
Support pkcs8 private keys (allowing for ed25519 to be used for rpc) 2021-07-06 11:16:01 +02:00
trinity-1686a
30a7dee920 exit when inconsistent level of replication is detected (#92)
fix #88

Authored-by: Trinity Pointard <trinity.pointard@gmail.com>
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/92
Co-authored-by: trinity-1686a <trinity.pointard@gmail.com>
Co-committed-by: trinity-1686a <trinity.pointard@gmail.com>
2021-06-02 13:30:39 +02:00
Trinity Pointard
289521886b make most changes suggested during install-party 2021-05-29 21:37:49 +02:00
Alex Auvolat
b9127dd6f8
Prepare for v0.3.0 and add migration path from v0.2.1.x 2021-05-28 15:29:58 +02:00
Alex Auvolat
ddb2b29bfd
Rename datacenters into zones (doc not yet updated) 2021-05-28 14:07:36 +02:00
Alex Auvolat
b490ebc7f6
Many improvements on ring/replication and its configuration:
- Explicit "replication_mode" configuration parameter that takes
  either "none", "2" or "3" as values, instead of letting user configure
  replication factor themselves. These are presets whose corresponding
  replication/quorum values can be found in replication/mode.rs

- Explicit support for single-node and two-node deployments
  (number of nodes must be at least "replication_mode", with "none"
  we can have only one node)

- Ring is now stored much more compactly with 256*8 + n*32 bytes,
  instead of 256*32 bytes

- Support for gateway-only nodes that do not store data
  (these nodes still need a metadata_directory to store the list
  of buckets and keys since those are stored on all nodes; they also
  technically need a data_directory to start, but it will stay
  empty unless we have bugs)
2021-05-28 14:07:36 +02:00
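
The figures in this commit message can be checked with a few lines of arithmetic. In the sketch below, `n` is assumed to be the number of nodes (the message does not say so explicitly), and the function names are illustrative only.

```python
PARTITIONS = 256


def ring_size_old():
    """Previous layout according to the message: 256*32 bytes."""
    return PARTITIONS * 32


def ring_size_new(n):
    """Compact layout according to the message: 256*8 + n*32 bytes."""
    return PARTITIONS * 8 + n * 32


def min_nodes(replication_mode):
    """Minimum node count: one for "none", otherwise the mode's number."""
    return 1 if replication_mode == "none" else int(replication_mode)


old = ring_size_old()        # 8192 bytes
new_3 = ring_size_new(3)     # 2048 + 96 = 2144 bytes
```

Under that reading of `n`, the compact form is smaller whenever `n` is below 192, since `256*8 + 192*32` equals `256*32`.
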
Trinity Pointard
e4b9e4e24d
rename types to CamelCase 2021-05-03 22:15:09 +02:00
Trinity Pointard
f05bb111c2
fix clippy warnings on util and rpc 2021-05-03 22:11:41 +02:00
Trinity Pointard
2812a027ea
change some more comments and revert changes on TableSchema 2021-04-27 16:49:07 +02:00
Trinity Pointard
74373aebcf
make most requested changes 2021-04-27 16:47:08 +02:00
Alex Auvolat
16300bbd89
remove useless comment 2021-04-27 16:44:01 +02:00
Trinity Pointard
f871689571
run cargo fmt on util and make missing doc warning 2021-04-27 16:37:10 +02:00
Trinity Pointard
8e0524ae15
document rpc crate 2021-04-27 16:37:10 +02:00
Alex Auvolat
6b2b400292
small simplify 2021-04-27 16:37:09 +02:00
Alex Auvolat
8c33d565d6
Merge discovery loop with consul 2021-04-27 16:37:09 +02:00
Alex Auvolat
948e44a3f6
cargo fmt 2021-04-27 16:37:09 +02:00
Alex Auvolat
3e2e38c830
Print stats 2021-04-27 16:37:09 +02:00
Alex Auvolat
2e53e31cdd
Cargo fmt 2021-04-27 16:37:09 +02:00
Alex Auvolat
64b91c2645
Keep old data 2021-04-27 16:37:09 +02:00
Alex Auvolat
e16077f40a
Persist directly and not in background 2021-04-27 16:37:09 +02:00
Alex Auvolat
9ced9f78dc
Improve bootstrapping: do it regularly; persist peer list 2021-04-27 16:37:08 +02:00
Alex Auvolat
f859d15062 update to v0.2.1 2021-03-19 13:39:18 +01:00
Alex Auvolat
4c26a0b9c1 Update Cargo.toml files with AGPL license info 2021-03-18 21:59:17 +01:00
Alex Auvolat
dead945c8f Prepare for release 0.2 2021-03-18 19:33:15 +01:00
Alex Auvolat
f4346cc5f4 Update dependencies 2021-03-16 15:58:40 +01:00
Alex Auvolat
2a41b82384 Simpler Merkle & sync 2021-03-16 12:18:03 +01:00
Alex Auvolat
1d9961e411 Simplify replication logic 2021-03-16 11:14:27 +01:00
Alex Auvolat
6a8439fd13 Some improvements in background worker but we terminate late 2021-03-15 23:14:12 +01:00
Alex Auvolat
0cd5b2ae19 WIP migrate to tokio 1 2021-03-15 22:36:41 +01:00
Alex Auvolat
4d4117f2b4 Refactor block resync loop; make workers infallible 2021-03-15 20:09:44 +01:00
Alex Auvolat
537f652fec Tiny things 2021-03-15 18:40:27 +01:00
Alex Auvolat
3bf2df622a Time and metadata improvements 2021-03-15 16:21:41 +01:00
Alex Auvolat
c475471e7a Implement table gc, currently for block_ref and version only 2021-03-12 19:57:37 +01:00
Alex Auvolat
046b649bcc (not well tested) use merkle tree for sync 2021-03-11 18:28:27 +01:00
Alex Auvolat
8d63738cb0 Checkpoint: add merkle tree in data table 2021-03-11 13:47:21 +01:00
Alex Auvolat
3214dd52dd Very minor changes 2021-03-10 21:50:09 +01:00
Alex Auvolat
6a3dcf3974 Rename n_tokens into capacity 2021-03-10 14:52:03 +01:00
Alex Auvolat
7cda917b6b update condition 2021-03-05 17:08:03 +01:00
Alex Auvolat
d7e005251d Not fully tested: new multi-dc MagLev 2021-03-05 16:22:29 +01:00
Alex Auvolat
20e6e9fa20 Update sled & try to debug deadlock (but it's in sled...) 2021-02-23 21:27:28 +01:00
Alex Auvolat
40763fd749 Cargo fmt 2021-02-23 18:46:25 +01:00
Alex Auvolat
6e6f7e8555 Replace some checksums where it makes sense 2021-02-23 18:14:37 +01:00
Alex Auvolat
b1b640ae8b rename hash() to sha256sum(), we might want to change it at some places 2021-02-21 15:24:30 +01:00
Alex Auvolat
80892df8cc Some refactoring 2021-02-21 13:11:10 +01:00
Alex Auvolat
1d1d497e2b Bump everything to 0.1.1 2021-01-15 17:54:48 +01:00
Alex Auvolat
8956db2a81 Make fewer things public 2020-12-12 17:58:19 +01:00
Alex Auvolat
a50fa70d45 Refactor error management in API part 2020-11-08 15:05:28 +01:00
Alex Auvolat
3b0b11085e Add versions to dependencies 2020-07-07 14:18:47 +02:00
Alex Auvolat
cc65cdc0fe Add license, description and repository to .toml files 2020-07-07 14:14:58 +02:00
Alex Auvolat
fbe8fe81f2 Add automatic peer discovery from Consul 2020-06-30 18:33:14 +02:00
Alex Auvolat
16fbb32fd3 Rate limit requests a bit more seriously
dropping the slot later (after reading the request response)
means that we aren't freeing our quota slot,
so the maximum number of simultaneous requests now also counts the
response reading phase

TODO next: quotas per rpc destination node, or maybe per datacenter (?)
2020-05-01 19:18:54 +00:00
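
The idea in this commit message, holding the quota slot until the response has been read, can be sketched with a semaphore. This is an illustration using `asyncio`, not Garage's Rust code; the `rpc_call` and `run` helpers, the `Stats` class, and the limit of 2 are assumptions chosen for the example.

```python
import asyncio

MAX_IN_FLIGHT = 2  # hypothetical quota of simultaneous requests


class Stats:
    """Tracks how many requests hold a slot at the same time."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0


async def rpc_call(sem, stats):
    # The slot is held for both phases and released only on leaving the block.
    async with sem:
        stats.in_flight += 1
        stats.peak = max(stats.peak, stats.in_flight)
        await asyncio.sleep(0.01)  # stand-in for sending the request
        await asyncio.sleep(0.01)  # stand-in for reading the response
        stats.in_flight -= 1


async def run(n_requests):
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = Stats()
    await asyncio.gather(*(rpc_call(sem, stats) for _ in range(n_requests)))
    return stats.peak


peak = asyncio.run(run(5))
```

Because the response-reading phase sits inside the `async with` block, `peak` never exceeds `MAX_IN_FLIGHT`, even though five requests are launched.
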
Alex Auvolat
d8f5e643bc Split code for modular compilation 2020-04-24 10:10:01 +00:00
Alex Auvolat
c9c6b0dbd4 Reorganize code 2020-04-23 17:05:46 +00:00