Commit 412ea32

Small improvements to repo analysis docs
* Reflows the text so it's all sentence-per-line.
* Improves the one-line summary at the top of the article.
* Explains why you must run increasingly large analyses rather than just accepting the first run.
* Expands the docs about why failures are bad even if snapshots appear to work.
* Mentions verifying the reference implementation before reporting ES bugs.
1 parent 3e527a8 commit 412ea32

1 file changed

+17
-7
lines changed

specification/snapshot/repository_analyze/SnapshotAnalyzeRepositoryRequest.ts

Lines changed: 17 additions & 7 deletions
@@ -24,23 +24,30 @@ import { Duration } from '@_types/Time'
2424

2525
/**
2626
* Analyze a snapshot repository.
27-
* Analyze the performance characteristics and any incorrect behaviour found in a repository.
2827
*
29-
* The response exposes implementation details of the analysis which may change from version to version.
30-
* The response body format is therefore not considered stable and may be different in newer versions.
28+
* Performs operations on a snapshot repository in order to check for incorrect behaviour.
3129
*
3230
* There are a large number of third-party storage systems available, not all of which are suitable for use as a snapshot repository by Elasticsearch.
33-
* Some storage systems behave incorrectly, or perform poorly, especially when accessed concurrently by multiple clients as the nodes of an Elasticsearch cluster do. This API performs a collection of read and write operations on your repository which are designed to detect incorrect behaviour and to measure the performance characteristics of your storage system.
31+
* Some storage systems behave incorrectly, or perform poorly, especially when accessed concurrently by multiple clients as the nodes of an Elasticsearch cluster do.
32+
* This API performs a collection of read and write operations on your repository which are designed to detect incorrect behaviour and to measure the performance characteristics of your storage system.
3433
*
3534
* The default values for the parameters are deliberately low to reduce the impact of running an analysis inadvertently and to provide a sensible starting point for your investigations.
3635
* Run your first analysis with the default parameter values to check for simple problems.
37-
* If successful, run a sequence of increasingly large analyses until you encounter a failure or you reach a `blob_count` of at least `2000`, a `max_blob_size` of at least `2gb`, a `max_total_data_size` of at least `1tb`, and a `register_operation_count` of at least `100`.
36+
* Some repositories may behave correctly when lightly loaded but incorrectly under production-like workloads.
37+
* If the first analysis is successful, run a sequence of increasingly large analyses until you encounter a failure or you reach a `blob_count` of at least `2000`, a `max_blob_size` of at least `2gb`, a `max_total_data_size` of at least `1tb`, and a `register_operation_count` of at least `100`.
3838
* Always specify a generous timeout, possibly `1h` or longer, to allow time for each analysis to run to completion.
39+
* Some repositories may behave correctly when accessed by a small number of Elasticsearch nodes but incorrectly when accessed concurrently by a production-scale cluster.
3940
* Perform the analyses using a multi-node cluster of a similar size to your production cluster so that it can detect any problems that only arise when the repository is accessed by many nodes at once.
4041
*
4142
* If the analysis fails, Elasticsearch detected that your repository behaved unexpectedly.
4243
* This usually means you are using a third-party storage system with an incorrect or incompatible implementation of the API it claims to support.
4344
* If so, this storage system is not suitable for use as a snapshot repository.
45+
* Repository analysis triggers conditions that occur only rarely when taking snapshots in a production system.
46+
* Snapshotting to unsuitable storage may sometimes appear to work correctly most of the time despite repository analysis failures.
47+
* However, your snapshot data is at risk if you store it in a snapshot repository that does not reliably pass repository analysis.
48+
* You can demonstrate that the analysis failure is due to an incompatible storage implementation by verifying that Elasticsearch does not detect the same problem when analysing the reference implementation of the storage protocol you are using.
49+
* For instance, if your storage supplier claims to offer an API that is compatible with AWS S3, demonstrate that the storage API is not truly S3-compatible by verifying that repositories in AWS S3 do not fail repository analysis.
50+
* Please do not report Elasticsearch issues involving third-party storage systems unless you can demonstrate that the same issue exists when analysing a repository that uses the reference implementation of the same storage protocol.
4451
* You will need to work with the supplier of your storage system to address the incompatibilities that Elasticsearch detects.
4552
*
4653
* If the analysis is successful, the API returns details of the testing process, optionally including how long each operation took.
@@ -72,7 +79,9 @@ import { Duration } from '@_types/Time'
7279
* You must ensure this load does not affect other users of these systems.
7380
* Analyses respect the repository settings `max_snapshot_bytes_per_sec` and `max_restore_bytes_per_sec` if available and the cluster setting `indices.recovery.max_bytes_per_sec` which you can use to limit the bandwidth they consume.
7481
*
75-
* NOTE: This API is intended for exploratory use by humans. You should expect the request parameters and the response format to vary in future versions.
82+
* NOTE: This API is intended for exploratory use by humans.
83+
* You should expect the request parameters and the response format to vary in future versions.
84+
* The response exposes implementation details of the analysis which may change from version to version.
7685
*
7786
* NOTE: Different versions of Elasticsearch may perform different checks for repository compatibility, with newer versions typically being stricter than older ones.
7887
* A storage system that passes repository analysis with one version of Elasticsearch may fail with a different version.
@@ -83,7 +92,8 @@ import { Duration } from '@_types/Time'
8392
*
8493
* *Implementation details*
8594
*
86-
* NOTE: This section of documentation describes how the repository analysis API works in this version of Elasticsearch, but you should expect the implementation to vary between versions. The request parameters and response format depend on details of the implementation so may also be different in newer versions.
95+
* NOTE: This section of documentation describes how the repository analysis API works in this version of Elasticsearch, but you should expect the implementation to vary between versions.
96+
* The request parameters and response format depend on details of the implementation so may also be different in newer versions.
8797
*
8898
* The analysis comprises a number of blob-level tasks, as set by the `blob_count` parameter and a number of compare-and-exchange operations on linearizable registers, as set by the `register_operation_count` parameter.
8999
* These tasks are distributed over the data and master-eligible nodes in the cluster for execution.
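
The updated docs recommend starting with the default parameters and then running a sequence of increasingly large analyses until one fails or the documented minimums are reached. A minimal sketch of how such a sequence could be planned is shown below. The `AnalysisSize` type, the `escalationPlan` and `analyzePath` helpers, the growth factor of 4, and the starting values are all assumptions made for illustration; only the parameter names, the target values, the `1h` timeout suggestion, and the `_analyze` endpoint come from the documentation.

```typescript
// Illustrative sketch only: the type, the helper names, and the growth factor
// are assumptions, not part of the Elasticsearch API specification.
interface AnalysisSize {
  blobCount: number
  maxBlobSizeMb: number
  maxTotalDataSizeGb: number
  registerOperationCount: number
}

// Minimum sizes the docs say the final analysis should reach:
// blob_count 2000, max_blob_size 2gb, max_total_data_size 1tb,
// register_operation_count 100 (2gb = 2048mb, 1tb = 1024gb).
const DOCUMENTED_TARGET: AnalysisSize = {
  blobCount: 2000,
  maxBlobSizeMb: 2048,
  maxTotalDataSizeGb: 1024,
  registerOperationCount: 100
}

// Returns the list of analysis sizes to try, in order, from `start` up to `target`.
// Each step multiplies every parameter by `factor`, capped at its target value.
function escalationPlan(start: AnalysisSize, target: AnalysisSize, factor = 4): AnalysisSize[] {
  if (!(factor > 1)) throw new Error('factor must be greater than 1')
  const smallest = Math.min(
    start.blobCount, start.maxBlobSizeMb, start.maxTotalDataSizeGb, start.registerOperationCount
  )
  if (!(smallest >= 1)) throw new Error('all starting values must be at least 1')

  // Never shrink a value that already exceeds its target.
  const grow = (current: number, limit: number): number =>
    Math.max(current, Math.min(limit, Math.ceil(current * factor)))
  const reached = (s: AnalysisSize): boolean =>
    s.blobCount >= target.blobCount &&
    s.maxBlobSizeMb >= target.maxBlobSizeMb &&
    s.maxTotalDataSizeGb >= target.maxTotalDataSizeGb &&
    s.registerOperationCount >= target.registerOperationCount

  const plan: AnalysisSize[] = [start]
  let current = start
  while (!reached(current)) {
    current = {
      blobCount: grow(current.blobCount, target.blobCount),
      maxBlobSizeMb: grow(current.maxBlobSizeMb, target.maxBlobSizeMb),
      maxTotalDataSizeGb: grow(current.maxTotalDataSizeGb, target.maxTotalDataSizeGb),
      registerOperationCount: grow(current.registerOperationCount, target.registerOperationCount)
    }
    plan.push(current)
  }
  return plan
}

// Builds the request path for one analysis, with a generous timeout as the docs advise.
function analyzePath(repository: string, size: AnalysisSize, timeout = '1h'): string {
  const query = [
    `blob_count=${size.blobCount}`,
    `max_blob_size=${size.maxBlobSizeMb}mb`,
    `max_total_data_size=${size.maxTotalDataSizeGb}gb`,
    `register_operation_count=${size.registerOperationCount}`,
    `timeout=${timeout}`
  ].join('&')
  return `/_snapshot/${encodeURIComponent(repository)}/_analyze?${query}`
}

// Assumed starting point for illustration; check the defaults of your own version.
const assumedStart: AnalysisSize = {
  blobCount: 100,
  maxBlobSizeMb: 10,
  maxTotalDataSizeGb: 1,
  registerOperationCount: 10
}
const plan = escalationPlan(assumedStart, DOCUMENTED_TARGET)
for (const size of plan) {
  console.log('POST ' + analyzePath('my_repository', size))
}
```

Each printed request would be run in turn, stopping at the first failure. Because the docs note that request parameters may vary between versions, treat the generated paths as a starting point for manual exploration rather than something to automate against.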
