DOCS-144 - Commit scope comparisons - first draft #5349

Draft: wants to merge 2 commits into base: develop
55 changes: 28 additions & 27 deletions product_docs/docs/pgd/5/choosing_server.mdx
@@ -6,31 +6,32 @@ EDB Postgres Distributed can be deployed with three different Postgres distribut

The following table lists features of EDB Postgres Distributed that are dependent on the Postgres distribution and version.

| Feature | PostgreSQL | EDB Postgres Extended | EDB Postgres Advanced |
|-------------------------------------------------|------------|-----------------------|-----------------------|
| [Rolling application and database upgrades](upgrades/) | Y | Y | Y |
| [Row-level last-update wins conflict resolution](consistency/conflicts/) | Y | Y | Y |
| [DDL replication](ddl/) | Y | Y | Y |
| [Granular DDL Locking](ddl/ddl-locking/) | Y | Y | Y |
| [Streaming of large transactions](transaction-streaming/) | v14+ | v13+ | v14+ |
| [Distributed sequences](sequences/#pgd-global-sequences) | Y | Y | Y |
| [Subscribe-only nodes](node_management/subscriber_only/) | Y | Y | Y |
| [Monitoring](monitoring/) | Y | Y | Y |
| [OpenTelemetry support](monitoring/otel/) | Y | Y | Y |
| [Parallel apply](parallelapply) | Y | Y | Y |
| [Conflict-free replicated data types (CRDTs)](consistency/crdt/) | Y | Y | Y |
| [Column-level conflict resolution](consistency/column-level-conflicts/) | Y | Y | Y |
| [Transform triggers](striggers/#transform-triggers) | Y | Y | Y |
| [Conflict triggers](striggers/#conflict-triggers) | Y | Y | Y |
| [Asynchronous replication](durability/) | Y | Y | Y |
| [Legacy synchronous replication](durability/legacy-sync) | Y | Y | Y |
| [Group Commit](durability/group-commit/) | N | Y | 14+ |
| [Commit At Most Once (CAMO)](durability/camo/) | N | Y | 14+ |
| [Eager Conflict Resolution](consistency/eager/) | N | Y | 14+ |
| [Lag Control](durability/lag-control/) | N | Y | 14+ |
| [Decoding Worker](node_management/decoding_worker) | N | 13+ | 14+ |
| [Lag tracker](monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |
| [Missing partition conflict](consistency/conflicts/#target_table_note) | N | Y | 14+ |
| Feature | PostgreSQL | EDB Postgres Extended | EDB Postgres Advanced |
|-------------------------------------------------------------------------------------------------|------------|-----------------------|-----------------------|
| [Rolling application and database upgrades](upgrades/) | Y | Y | Y |
| [Row-level last-update wins conflict resolution](consistency/conflicts/) | Y | Y | Y |
| [DDL replication](ddl/) | Y | Y | Y |
| [Granular DDL Locking](ddl/ddl-locking/) | Y | Y | Y |
| [Streaming of large transactions](transaction-streaming/) | 14+ | 13+ | 14+ |
| [Distributed sequences](sequences/#pgd-global-sequences) | Y | Y | Y |
| [Subscribe-only nodes](node_management/subscriber_only/) | Y | Y | Y |
| [Monitoring](monitoring/) | Y | Y | Y |
| [OpenTelemetry support](monitoring/otel/) | Y | Y | Y |
| [Parallel apply](parallelapply) | Y | Y | Y |
| [Conflict-free replicated data types (CRDTs)](consistency/crdt/) | Y | Y | Y |
| [Column-level conflict resolution](consistency/column-level-conflicts/) | Y | Y | Y |
| [Transform triggers](striggers/#transform-triggers) | Y | Y | Y |
| [Conflict triggers](striggers/#conflict-triggers) | Y | Y | Y |
| [Asynchronous replication](durability/) | Y | Y | Y |
| [Legacy synchronous replication](durability/legacy-sync) | Y | Y | Y |
| [Group Commit](durability/group-commit/) | N | Y | 14+ |
| [Commit At Most Once (CAMO)](durability/camo/) | N | Y | 14+ |
| [Eager Conflict Resolution](consistency/eager/) | N | Y | 14+ |
| [Lag Control](durability/lag-control/) | N | Y | 14+ |
| [Decoding Worker](node_management/decoding_worker) | N | 13+ | 14+ |
| [Lag tracker](monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |
| [Missing partition conflict](consistency/conflicts/#target_table_note) | N | Y | 14+ |
| [No need for UPDATE Trigger on tables with TOAST](consistency/conflicts/#toast-support-details) | N | Y | 14+ |
| [Automatically hold back FREEZE](consistency/conflicts/#origin-conflict-detection) | N | Y | 14+ |
| [Transparent Data Encryption](/tde/latest/) | N | 15+ | 15+ |
| [Automatically hold back FREEZE](consistency/conflicts/#origin-conflict-detection) | N | Y | 14+ |
| [Transparent Data Encryption](/tde/latest/) | N | 15+ | 15+ |
| [Synchronous Commit](durability/synchronous_commit) | N | 14+ | 14+ |
82 changes: 82 additions & 0 deletions product_docs/docs/pgd/5/durability/comparing-commit-scopes.mdx
@@ -0,0 +1,82 @@
---
title: Comparing Commit Scope Kinds
navTitle: Comparing Commit Scope Kinds
deepToC: true
---

## Comparing Group Commit and Synchronous Commit

### How Group Commit works

A Group Commit first writes a prepared transaction to local disk, then waits until it is replicated to enough nodes and acknowledged by them before committing it locally.

This gives resilience to the failure of N nodes, where N is the number of nodes required to acknowledge the transaction.
Commit scope syntax allows you to specify N as part of the rule.
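
The ordering described above can be sketched as a toy model. This is an illustration only, not PGD's implementation: the `Origin` class, its `log`, and the `acks` and `required` parameters are hypothetical names, and the sketch assumes the transaction is aborted when too few acknowledgements arrive (for example, after a timeout).

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """Toy origin node; hypothetical, not PGD's implementation."""
    log: list = field(default_factory=list)

    def group_commit(self, txn: str, acks: int, required: int) -> str:
        # Step 1: write the prepared transaction to local disk.
        self.log.append(("PREPARE", txn))
        # Step 2: commit locally only once enough nodes have acknowledged.
        if acks >= required:
            self.log.append(("COMMIT", txn))
            return "committed"
        # Too few acknowledgements (assumed here to mean a timeout): abort.
        self.log.append(("ABORT", txn))
        return "aborted"


origin = Origin()
print(origin.group_commit("t1", acks=2, required=2))  # committed
print(origin.group_commit("t2", acks=1, required=2))  # aborted
```

The point of the sketch is that the local `COMMIT` record is never written before the required acknowledgements have arrived.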

### How Synchronous Commit works

A Synchronous Commit first commits to the local disk and then waits until the transaction is replicated to N other nodes, according to the rules of the commit scope.
Once those nodes confirm that the transaction has been received, applied, or made durable, again based on the commit scope settings, the client gets a completion acknowledgement for the transaction.

Synchronous Commit increases resiliency: it ensures that the transaction is received by more than N nodes before it is acknowledged to the caller.
This makes it resilient to the failure of N nodes.
When conflicts occur under Synchronous Commit, they are always resolved by PGD's default async conflict resolution.
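
As a contrast with Group Commit, the following toy function sketches the Synchronous Commit ordering. It is an illustration under stated assumptions, not PGD's implementation: the function name and the `confirmations` and `required` parameters are hypothetical.

```python
def synchronous_commit(confirmations: int, required: int) -> tuple[str, bool]:
    """Toy model: returns (local_state, acknowledged_to_client)."""
    # The local commit happens first and is not undone afterwards.
    local_state = "committed"
    # The client is only acknowledged once enough nodes have confirmed.
    acknowledged = confirmations >= required
    return local_state, acknowledged


print(synchronous_commit(confirmations=2, required=2))  # ('committed', True)
print(synchronous_commit(confirmations=0, required=2))  # ('committed', False)
```

In the second call the transaction is committed on the origin even though the client never receives an acknowledgement, which is the case the partition and crash scenarios turn on.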

### Network partitions and node crashes

Consider a network partition where the originating node is on one side of the partition and the replicas on the other.
We will look at the state of each kind, before and after connectivity is restored.

#### Group Commit and network partitions

With Group Commit and before restoration, the origin node will have prepared a transaction and be waiting for confirmations that will not arrive.
It could time out and abort the transaction, or just keep waiting if the timeout setting is long enough.
Writes that are received by the replicas will be held back if the nodes have prepared transactions from the now-disconnected origin and those transactions hold conflicting locks. If there are no conflicts, the replicas can continue to receive writes.

When connectivity is restored, if the origin node aborted the transaction, the prepared transaction on some or all of the replicas will eventually be aborted too.
If there are prepared transactions on all nodes, including the origin, the reconciliation process will commit them. Until that happens, each prepared transaction holds locks that prevent nodes that have resumed writing from overwriting the transaction's writes with writes from other nodes.
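
The two outcomes described above can be written as a small decision rule. This is a hedged sketch of the rule as stated in this section, not PGD's reconciliation code: the `reconcile` function and the state strings are hypothetical, and any case the text doesn't cover is reported as `"unresolved"` rather than guessed.

```python
def reconcile(origin_state: str, replica_states: list[str]) -> str:
    """Toy decision rule for one prepared transaction after reconnection."""
    # If the origin aborted, the replicas follow and abort as well.
    if origin_state == "aborted":
        return "abort"
    # If every node, including the origin, still holds it prepared, commit.
    if origin_state == "prepared" and all(s == "prepared" for s in replica_states):
        return "commit"
    # Anything else is outside what this section describes.
    return "unresolved"


print(reconcile("aborted", ["prepared", "prepared"]))   # abort
print(reconcile("prepared", ["prepared", "prepared"]))  # commit
```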

#### Synchronous Commit and network partitions

With Synchronous Commit and before restoration, the origin node will have committed the transaction and be waiting for confirmations that won't arrive.
The client may cancel it or it could time out, but the transaction stays committed on this node.
On the other side of the partition, it's possible that one of the replicas becomes write leader and that segment of the cluster starts writing to the other nodes.

As the network is restored and the commit from the separated origin arrives, it could conflict with the writes from the other nodes. The default async conflict resolution may choose one over the other, which could give errors or potential inconsistencies.
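
To make "choose one over the other" concrete, here is a minimal sketch that assumes the default resolution is row-level last-update-wins based on commit timestamp. That assumption, the `last_update_wins` function, and the tuple layout are all illustrative; tie-breaking and other conflict types are left out.

```python
def last_update_wins(local: tuple[int, str], incoming: tuple[int, str]) -> tuple[int, str]:
    """Each row version is (commit_timestamp, value); keep the newer one."""
    return incoming if incoming[0] > local[0] else local


# Committed on the isolated origin during the partition.
origin_write = (100, "from-origin")
# Written later on the other side of the partition.
other_write = (105, "from-new-write-leader")

winner = last_update_wins(origin_write, other_write)
print(winner[1])  # from-new-write-leader
```

In this example the origin's write was committed locally but is discarded once the newer write arrives, which is the kind of inconsistency the paragraph above warns about.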

#### Group Commit and node crashes

A similar situation to Group Commit network partitioning occurs with node crashes. The restored origin node may have a number of locally prepared transactions which it will start replicating as it returns. If conflict resolution is set to eager, and there are conflicts, the transaction can be aborted.

#### Synchronous Commit and node crashes

As with Synchronous Commit network partitioning, when a restored origin node returns, it may have a number of locally committed transactions. It will resume replicating, including those transactions, and there may be conflicts, which will be resolved through the default async conflict resolution. Again, this may choose one over the other and result in errors or potential inconsistencies.

### Performance

TBD


### Overall result

To compare Group Commit and Synchronous Commit at the highest level, we can look at the effect each has on RPO and RTO.

#### RPO

RPO - Recovery Point Objective - is the term for how much data it is acceptable to lose after a disaster.
With PGD's default async replication, the amount of data that could be lost is based on replication lag.
A cluster where the other nodes lag by 100MB behind the write-leader could lose 100MB of data in a disaster.
If that is acceptable, the RPO is therefore 100MB.
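
The arithmetic behind the 100MB figure can be sketched as follows. The function name and the byte positions are hypothetical, and the sketch assumes the data at risk is whatever the write leader holds beyond the most caught-up replica.

```python
MB = 1024 * 1024


def worst_case_loss_bytes(leader_position: int, replica_positions: list[int]) -> int:
    """Bytes the write leader has that the most caught-up replica lacks."""
    best_replica = max(replica_positions)
    return max(0, leader_position - best_replica)


# The leader has written 500MB; both replicas have received 400MB.
loss = worst_case_loss_bytes(500 * MB, [400 * MB, 400 * MB])
print(loss // MB)  # 100
```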

With Group Commit configured for MAJORITY or ALL nodes, there is going to be an RPO of 0 in the case of one node failing. That's because the origin node will not have any data which has not been confirmed as replicated on another node.

With Synchronous Commit, even when MAJORITY or ALL nodes are configured, the originating node may have committed transactions that the other nodes do not have. Since these transactions have not been acknowledged to the client, this can be thought of as an RPO of 0. When the node recovers, though, these transactions will show up as committed and the node will have to deal with the conflicts as it recovers.

#### RTO

RTO - Recovery Time Objective - is the term for how much time it is acceptable to wait to regain access to data after a disaster. With PGD's default async replication, RTO is 0 but at the cost of data loss.

With Group Commit, access to the data can be immediate, but once the node is accessible again it takes time to reconcile the prepared transactions. This only blocks access to data that is modified by the prepared transactions and may only take a few seconds. We can say the RTO is close to 0.

With Synchronous Commit, the RTO can be 0 but will require dealing with conflicts later.

6 changes: 5 additions & 1 deletion product_docs/docs/pgd/5/durability/index.mdx
@@ -7,6 +7,7 @@ navigation:
- commit-scopes
- commit-scope-rules
- comparing
- comparing-commit-scopes
- '# Commit Scope kinds'
- group-commit
- camo
@@ -41,7 +42,8 @@ of commit scopes and how to define them for your needs.
* [Commit scope rules](commit-scope-rules) looks at the syntax of and how to formulate
a commit scope rule.

* [Comparing](comparing) compares how each option behaves.

* [Comparing](comparing) compares how each confirmation option behaves.

## Commit scope kinds

@@ -59,6 +61,8 @@ out of sync nodes may go when a database node goes out of service.
* [Synchronous Commit](synchronous_commit) examines a commit scope mechanism which works
in a similar fashion to legacy synchronous replication, but from within the commit scope framework.

* [Comparing commit scopes](comparing-commit-scopes) compares how commit scope kinds operate.

## Working with commit scopes

* [Administering](administering) addresses how a PGD cluster with Group Commit
17 changes: 9 additions & 8 deletions product_docs/docs/pgd/5/durability/synchronous_commit.mdx
@@ -11,6 +11,8 @@ PGD's `SYNCHRONOUS_COMMIT` is a commit scope kind that works in a way that is mo

Unlike other commit scope kinds such as GROUP COMMIT and CAMO, the transactions in a SYNCHRONOUS_COMMIT operation will not be transformed into a two phase commit (2PC) transaction, but work more like a Postgres Synchronous commit.

`SYNCHRONOUS_COMMIT` is supported only by EDB Postgres Extended and EDB Postgres Advanced Server versions 14 or later.

## Example

```
@@ -30,14 +32,13 @@ There are no parameters for `SYNCHRONOUS_COMMIT` and therefore no configuration.

## Confirmation

Confirmation Level | PGD Synchronous Commit Handling
-------------------------|-------------------------------
`received` | A remote PGD node confirms the transaction once it's been fully received and is in in-memory write queue.
`replicated` | Same behavior as `received`.
`durable` | Confirms the transaction after all of its changes are flushed to disk. Analogous to `synchronous_commit = on` in legacy synchronous replication.
`visible` (default) | Confirms the transaction after all of its changes are flushed to disk and it's visible to concurrent transactions. Analogous to `synchronous_commit = remote_apply` in legacy synchronous replication.
| Confirmation Level | PGD Synchronous Commit Handling |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `received` | A remote PGD node confirms the transaction once it's been fully received and is in in-memory write queue. |
| `replicated` | Same behavior as `received`. |
| `durable` | Confirms the transaction after all of its changes are flushed to disk. Analogous to `synchronous_commit = on` in legacy synchronous replication. |
| `visible` (default) | Confirms the transaction after all of its changes are flushed to disk and it's visible to concurrent transactions. Analogous to `synchronous_commit = remote_apply` in legacy synchronous replication. |

## Details

Currently `SYNCHRONOUS_COMMIT` does not use the confirmation levels of the commit scope rule syntax.

SYNCHRONOUS_COMMIT works with EDB Postgres Advanced Server and EDB Postgres Extended versions 14 or later.