Commit 830b569

Commit scope comparisons - first draft
Signed-off-by: Dj Walker-Morgan <dj.walker-morgan@enterprisedb.com>
1 parent adf9f8c commit 830b569

4 files changed

+124
-36
lines changed

product_docs/docs/pgd/5/choosing_server.mdx

Lines changed: 28 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -6,31 +6,32 @@ EDB Postgres Distributed can be deployed with three different Postgres distribut
66

77
The following table lists features of EDB Postgres Distributed that are dependent on the Postgres distribution and version.
88

9-
| Feature | PostgreSQL | EDB Postgres Extended | EDB Postgres Advanced |
10-
|-------------------------------------------------|------------|-----------------------|-----------------------|
11-
| [Rolling application and database upgrades](upgrades/) | Y | Y | Y |
12-
| [Row-level last-update wins conflict resolution](consistency/conflicts/) | Y | Y | Y |
13-
| [DDL replication](ddl/) | Y | Y | Y |
14-
| [Granular DDL Locking](ddl/ddl-locking/) | Y | Y | Y |
15-
| [Streaming of large transactions](transaction-streaming/) | v14+ | v13+ | v14+ |
16-
| [Distributed sequences](sequences/#pgd-global-sequences) | Y | Y | Y |
17-
| [Subscribe-only nodes](node_management/subscriber_only/) | Y | Y | Y |
18-
| [Monitoring](monitoring/) | Y | Y | Y |
19-
| [OpenTelemetry support](monitoring/otel/) | Y | Y | Y |
20-
| [Parallel apply](parallelapply) | Y | Y | Y |
21-
| [Conflict-free replicated data types (CRDTs)](consistency/crdt/) | Y | Y | Y |
22-
| [Column-level conflict resolution](consistency/column-level-conflicts/) | Y | Y | Y |
23-
| [Transform triggers](striggers/#transform-triggers) | Y | Y | Y |
24-
| [Conflict triggers](striggers/#conflict-triggers) | Y | Y | Y |
25-
| [Asynchronous replication](durability/) | Y | Y | Y |
26-
| [Legacy synchronous replication](durability/legacy-sync) | Y | Y | Y |
27-
| [Group Commit](durability/group-commit/) | N | Y | 14+ |
28-
| [Commit At Most Once (CAMO)](durability/camo/) | N | Y | 14+ |
29-
| [Eager Conflict Resolution](consistency/eager/) | N | Y | 14+ |
30-
| [Lag Control](durability/lag-control/) | N | Y | 14+ |
31-
| [Decoding Worker](node_management/decoding_worker) | N | 13+ | 14+ |
32-
| [Lag tracker](monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |
33-
| [Missing partition conflict](consistency/conflicts/#target_table_note) | N | Y | 14+ |
9+
| Feature | PostgreSQL | EDB Postgres Extended | EDB Postgres Advanced |
10+
|-------------------------------------------------------------------------------------------------|------------|-----------------------|-----------------------|
11+
| [Rolling application and database upgrades](upgrades/) | Y | Y | Y |
12+
| [Row-level last-update wins conflict resolution](consistency/conflicts/) | Y | Y | Y |
13+
| [DDL replication](ddl/) | Y | Y | Y |
14+
| [Granular DDL Locking](ddl/ddl-locking/) | Y | Y | Y |
15+
| [Streaming of large transactions](transaction-streaming/) | 14+ | 13+ | 14+ |
16+
| [Distributed sequences](sequences/#pgd-global-sequences) | Y | Y | Y |
17+
| [Subscribe-only nodes](node_management/subscriber_only/) | Y | Y | Y |
18+
| [Monitoring](monitoring/) | Y | Y | Y |
19+
| [OpenTelemetry support](monitoring/otel/) | Y | Y | Y |
20+
| [Parallel apply](parallelapply) | Y | Y | Y |
21+
| [Conflict-free replicated data types (CRDTs)](consistency/crdt/) | Y | Y | Y |
22+
| [Column-level conflict resolution](consistency/column-level-conflicts/) | Y | Y | Y |
23+
| [Transform triggers](striggers/#transform-triggers) | Y | Y | Y |
24+
| [Conflict triggers](striggers/#conflict-triggers) | Y | Y | Y |
25+
| [Asynchronous replication](durability/) | Y | Y | Y |
26+
| [Legacy synchronous replication](durability/legacy-sync) | Y | Y | Y |
27+
| [Group Commit](durability/group-commit/) | N | Y | 14+ |
28+
| [Commit At Most Once (CAMO)](durability/camo/) | N | Y | 14+ |
29+
| [Eager Conflict Resolution](consistency/eager/) | N | Y | 14+ |
30+
| [Lag Control](durability/lag-control/) | N | Y | 14+ |
31+
| [Decoding Worker](node_management/decoding_worker) | N | 13+ | 14+ |
32+
| [Lag tracker](monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |
33+
| [Missing partition conflict](consistency/conflicts/#target_table_note) | N | Y | 14+ |
3434
| [No need for UPDATE Trigger on tables with TOAST](consistency/conflicts/#toast-support-details) | N | Y | 14+ |
35-
| [Automatically hold back FREEZE](consistency/conflicts/#origin-conflict-detection) | N | Y | 14+ |
36-
| [Transparent Data Encryption](/tde/latest/) | N | 15+ | 15+ |
35+
| [Automatically hold back FREEZE](consistency/conflicts/#origin-conflict-detection) | N | Y | 14+ |
36+
| [Transparent Data Encryption](/tde/latest/) | N | 15+ | 15+ |
37+
| [Synchronous Commit](durability/synchronous_commit) | N | 14+ | 14+ |
Lines changed: 82 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,82 @@
1+
---
2+
title: Comparing Commit Scope Kinds
3+
navTitle: Comparing Commit Scope Kinds
4+
deepToC: true
5+
---
6+
7+
## Comparing Group Commit and Synchronous Commit
8+
9+
### How Group Commit works
10+
11+
A Group Commit first writes a prepared transaction to local disk, then waits until it is replicated to enough nodes and acknowledged by them before committing it locally.
12+
13+
This also gives resiliency to failure of N nodes, where N is the number of nodes required to acknowledge the transaction.
14+
Commit scope syntax allows you to specify N as part of the rule.
15+
16+
### How Synchronous Commit works
17+
18+
A Synchronous Commit first commits to the local disk and then waits until the transaction is replicated to N other nodes, as per the rules of the commit scope.
19+
Once those nodes confirm that the transaction has been received, applied, or made durable (again depending on the commit scope settings), the client gets a completion acknowledgement for the transaction.
20+
21+
Synchronous Commit increases resiliency: it ensures that the transaction is received by N other nodes before it is acknowledged to the caller.
22+
This makes it resilient to failure of N nodes.
23+
With Synchronous Commit, any conflicts that occur are always resolved by PGD's default async conflict resolution.
24+
25+
### Network partitions and node crashes
26+
27+
Consider a network partition where the originating node is on one side of the partition and the replicas on the other.
28+
We will look at the state of each commit scope kind before and after connectivity is restored.
29+
30+
#### Group Commit and network partitions
31+
32+
With Group Commit and before restoration, the origin node will have prepared a transaction and will be waiting for confirmations which won't arrive.
33+
It could time out and abort the transaction, or just keep waiting if the timeout setting is long enough.
34+
Writes that are received by the replicas will be held back if the nodes have prepared transactions from the now-disconnected origin. If there are no prepared transactions, they can continue to receive writes.
35+
36+
When connectivity is restored, if the origin node aborted the transaction, the replicas that hold the prepared transaction will eventually abort it too.
37+
If there are prepared transactions on all nodes, including the origin, the reconciliation process will commit them. Until that happens, each prepared transaction will hold locks that prevent other nodes that have resumed writing from overwriting the transaction's writes.
38+
39+
#### Synchronous Commit and network partitions
40+
41+
With Synchronous Commit and before restoration, the origin node will have committed the transaction and be waiting for confirmations which won't arrive.
42+
The client may cancel it or it could time out, but the transaction stays committed on this node.
43+
On the other side of the partition, it's possible that one of the replicas becomes write leader and that segment of the cluster starts writing to the other nodes.
44+
45+
When connectivity is restored, if the commit hasn't reached the other nodes, it is possible that the writes on the non-origin side could conflict with the writes from the origin node. The default async conflict resolution may choose one over the other, resulting in errors or potential inconsistencies.
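
The two partition cases above can be summarised in a small toy model. This is an illustrative sketch only, not PGD code: the names `Kind`, `OriginOutcome`, `origin_outcome` and `holds_unconfirmed_commit` are hypothetical, and the model assumes the origin's wait ends in a timeout rather than waiting indefinitely.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Commit scope kinds compared in this section (hypothetical names)."""
    GROUP_COMMIT = "group commit"
    SYNCHRONOUS_COMMIT = "synchronous commit"


@dataclass(frozen=True)
class OriginOutcome:
    while_waiting: str   # state of the transaction on the origin during the partition
    after_timeout: str   # state on the origin once the wait has timed out


def origin_outcome(kind: Kind) -> OriginOutcome:
    """Return the origin node's view of a transaction whose confirmations never arrive."""
    if kind is Kind.GROUP_COMMIT:
        # Prepared first, committed only after acknowledgement; a timeout aborts it.
        return OriginOutcome(while_waiting="prepared", after_timeout="aborted")
    # Committed locally first; a timeout or cancel does not undo the commit.
    return OriginOutcome(while_waiting="committed", after_timeout="committed")


def holds_unconfirmed_commit(kind: Kind) -> bool:
    """True if the origin can end up with a commit that no replica confirmed."""
    return origin_outcome(kind).after_timeout == "committed"


if __name__ == "__main__":
    for kind in Kind:
        print(kind.value, origin_outcome(kind), holds_unconfirmed_commit(kind))
```

In this model only Synchronous Commit leaves an unconfirmed commit on the origin, which is the case that later has to go through default async conflict resolution.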
46+
47+
#### Group Commit and node crashes
48+
49+
Node crashes lead to a situation similar to a network partition under Group Commit. The restored origin node may have a number of locally prepared transactions, which it will start replicating as it returns. If conflict resolution is set to eager and there are conflicts, the transaction can be aborted.
50+
51+
#### Synchronous Commit and node crashes
52+
53+
As with network partitioning under Synchronous Commit, when a restored origin node returns, it may have a number of locally committed transactions. It will resume replicating, including those transactions, and there may be conflicts which will be resolved through the default async conflict resolution. Again, this may choose one over the other and result in errors or potential inconsistencies.
54+
55+
### Performance
56+
57+
TBD
58+
59+
60+
### Overall result
61+
62+
To compare Group Commit and Synchronous Commit at the highest level, we can look at the effect each has on RPO and RTO.
63+
64+
#### RPO
65+
66+
RPO - Recovery Point Objective - is the term for how much data it is acceptable to lose after a disaster.
67+
With PGD's default async replication, the amount of data that could be lost is based on replication lag.
68+
A cluster where the other nodes lag 100MB behind the write leader could lose 100MB of data in a disaster.
69+
If that is acceptable, the RPO is therefore 100MB.
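
The arithmetic behind that example can be sketched as follows. This is a hedged illustration, not a PGD API: `data_at_risk` and `within_rpo` are hypothetical helpers, and the sketch assumes that progress is measured in bytes written and that, after losing the write leader, you fail over to the most caught-up surviving node.

```python
MB = 1024 * 1024


def data_at_risk(leader_bytes: int, replica_bytes: list[int]) -> int:
    """Bytes written on the leader that the most caught-up replica has not yet received."""
    if not replica_bytes:
        # With no replicas, everything on the leader is at risk.
        return leader_bytes
    return max(0, leader_bytes - max(replica_bytes))


def within_rpo(at_risk_bytes: int, rpo_bytes: int) -> bool:
    """True if the potential loss does not exceed the acceptable loss (the RPO)."""
    return at_risk_bytes <= rpo_bytes


if __name__ == "__main__":
    # Leader has written 500MB; both replicas are 100MB behind.
    at_risk = data_at_risk(500 * MB, [400 * MB, 400 * MB])
    print(at_risk // MB, "MB at risk; within a 100MB RPO:", within_rpo(at_risk, 100 * MB))
```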
70+
71+
With Group Commit configured for MAJORITY or ALL nodes, there is going to be an RPO of 0 in the case of one node failing. That's because the origin node will not have any data which has not been confirmed as replicated on another node.
72+
73+
With Synchronous Commit, even when MAJORITY or ALL nodes are configured, the originating node may have committed transactions that the other nodes do not have. Since these transactions have not been acknowledged to the client, it can be thought of as an RPO of 0. When the node recovers, though, these transactions will show up as committed and the node will have to deal with the conflicts as it recovers.
74+
75+
#### RTO
76+
77+
RTO - Recovery Time Objective - is the term for how much time it is acceptable to wait to regain access to data after a disaster. With PGD's default async replication, RTO is 0 but at the cost of data loss.
78+
79+
With Group Commit, getting access to the data can be immediate, but once the data is accessible again, it will take time to reconcile the prepared transactions. This only stops access to data that is modified by the prepared transactions and may only last a few seconds. We can say the RTO is close to 0.
80+
81+
With Synchronous Commit, the RTO can be 0 but will require dealing with conflicts later.
82+

product_docs/docs/pgd/5/durability/index.mdx

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ navigation:
77
- commit-scopes
88
- commit-scope-rules
99
- comparing
10+
- comparing-commit-scopes
1011
- '# Commit Scope kinds'
1112
- group-commit
1213
- camo
@@ -41,7 +42,8 @@ of commit scopes and how to define them for your needs.
4142
* [Commit scope rules](commit-scope-rules) looks at the syntax of and how to formulate
4243
a commit scope rule.
4344

44-
* [Comparing](comparing) compares how each option behaves.
45+
46+
* [Comparing](comparing) compares how each confirmation option behaves.
4547

4648
## Commit scope kinds
4749

@@ -59,6 +61,8 @@ out of sync nodes may go when a database node goes out of service.
5961
* [Synchronous Commit](synchronous_commit) examines a commit scope mechanism which works
6062
in a similar fashion to legacy synchronous replication, but from within the commit scope framework.
6163

64+
* [Comparing commit scopes](comparing-commit-scopes) compares how commit scope kinds operate.
65+
6266
## Working with commit scopes
6367

6468
* [Administering](administering) addresses how a PGD cluster with Group Commit

product_docs/docs/pgd/5/durability/synchronous_commit.mdx

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@ PGD's `SYNCHRONOUS_COMMIT` is a commit scope kind that works in a way that is mo
1111

1212
Unlike other commit scope kinds such as GROUP COMMIT and CAMO, the transactions in a SYNCHRONOUS_COMMIT operation will not be transformed into a two phase commit (2PC) transaction, but work more like a Postgres Synchronous commit.
1313

14+
`SYNCHRONOUS_COMMIT` is supported only by EDB Postgres Extended and EDB Postgres Advanced Server versions 14 or later.
15+
1416
## Example
1517

1618
```
@@ -30,14 +32,13 @@ There are no parameters for `SYNCHRONOUS_COMMIT` and therefore no configuration.
3032

3133
## Confirmation
3234

33-
Confirmation&nbsp;Level | PGD Synchronous Commit Handling
34-
-------------------------|-------------------------------
35-
`received` | A remote PGD node confirms the transaction once it's been fully received and is in in-memory write queue.
36-
`replicated` | Same behavior as `received`.
37-
`durable` | Confirms the transaction after all of its changes are flushed to disk. Analogous to `synchronous_commit = on` in legacy synchronous replication.
38-
`visible` (default) | Confirms the transaction after all of its changes are flushed to disk and it's visible to concurrent transactions. Analogous to `synchronous_commit = remote_apply` in legacy synchronous replication.
35+
| Confirmation&nbsp;Level | PGD Synchronous Commit Handling |
36+
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
37+
| `received` | A remote PGD node confirms the transaction once it's been fully received and is in in-memory write queue. |
38+
| `replicated` | Same behavior as `received`. |
39+
| `durable` | Confirms the transaction after all of its changes are flushed to disk. Analogous to `synchronous_commit = on` in legacy synchronous replication. |
40+
| `visible` (default) | Confirms the transaction after all of its changes are flushed to disk and it's visible to concurrent transactions. Analogous to `synchronous_commit = remote_apply` in legacy synchronous replication. |
3941

4042
## Details
4143

42-
Currently `SYNCHRONOUS_COMMIT` does not use the confirmation levels of the commit scope rule syntax.
43-
44+
SYNCHRONOUS_COMMIT works with EDB Postgres Advanced Server and EDB Postgres Extended versions 14 or later.
