Merged
Changes from 30 commits
Commits
38 commits
d56ea72
Add temp.md
qiancai Mar 20, 2025
fd08ad2
Delete temp.md
qiancai Mar 20, 2025
1942eda
add translation
qiancai Mar 20, 2025
b2d4f78
Update ticdc-new-arch.md
qiancai Mar 20, 2025
ad09682
Update ticdc/ticdc-new-arch.md
lidezhu Mar 21, 2025
20d8dd0
Apply suggestions from code review
qiancai Mar 21, 2025
f7406b1
Update ticdc/ticdc-new-arch.md
qiancai Mar 25, 2025
f4f0c1f
implement comments from https://github.com/pingcap/docs-cn/pull/19765…
qiancai Mar 26, 2025
1c26282
Update ticdc/ticdc-new-arch.md
qiancai Mar 26, 2025
986b84c
minor wording updates
qiancai Apr 17, 2025
1573114
sync from zh changes
qiancai Apr 17, 2025
d68ba4c
Apply suggestions from code review
qiancai Apr 23, 2025
368ce01
sync from zh changes
qiancai Apr 23, 2025
3146dc4
Refactor TiCDC architecture docs and update TOC
qiancai Oct 11, 2025
011c3c5
Update ticdc/ticdc-architecture.md
qiancai Oct 11, 2025
2d26993
Update ticdc/ticdc-server-config.md
qiancai Oct 11, 2025
a7168ac
Merge branch 'upstream-master' into ldz/add-ticdc-new-arch-19765
qiancai Oct 11, 2025
403b689
sync zh changes for ticdc-architecture.md
qiancai Oct 17, 2025
7d1f1f4
update the link of the ticdc-classic-architecture doc
qiancai Oct 17, 2025
b105c9a
Update monitor-ticdc.md
qiancai Oct 17, 2025
cfd0dc5
add monitoring related images
qiancai Oct 17, 2025
20c3672
update ticdc/ticdc-overview.md
qiancai Oct 17, 2025
2e4b787
Update ticdc-classic-architecture.md
qiancai Oct 17, 2025
9c76e17
Update ticdc-architecture.md
qiancai Oct 17, 2025
0873b6c
Update ticdc-architecture.md
qiancai Oct 17, 2025
84927a8
rename images for metrics
qiancai Oct 17, 2025
bed4373
Update monitor-ticdc.md
qiancai Oct 17, 2025
de832c2
Fix punctuation in dashboard section list
qiancai Oct 17, 2025
f2869c8
minor changes (adding an intro para for tab)
qiancai Oct 17, 2025
be7cdc8
Apply suggestions from code review
qiancai Oct 17, 2025
3e6163c
sync from zh
qiancai Oct 17, 2025
28124fd
Update ticdc/ticdc-architecture.md
lidezhu Oct 17, 2025
d81ebb8
Apply suggestions from code review
qiancai Oct 18, 2025
4e8664f
Update monitor-ticdc.md
qiancai Oct 18, 2025
7422640
Apply suggestions from code review
qiancai Oct 18, 2025
afce672
Include a direct link to v8.5.4-release.1
qiancai Oct 18, 2025
cb0050d
sync from zh
qiancai Oct 18, 2025
04f327d
Apply suggestions from code review
qiancai Oct 20, 2025
4 changes: 3 additions & 1 deletion TOC.md
@@ -192,7 +192,9 @@
- [Integrate with Confluent and Snowflake](/ticdc/integrate-confluent-using-ticdc.md)
- [Integrate with Apache Kafka and Apache Flink](/replicate-data-to-kafka.md)
- Reference
- [TiCDC Architecture](/ticdc/ticdc-architecture.md)
- TiCDC Architecture
- [TiCDC New Architecture](/ticdc/ticdc-architecture.md)
- [TiCDC Classic Architecture](/ticdc/ticdc-classic-architecture.md)
- [TiCDC Data Replication Capabilities](/ticdc/ticdc-data-replication-capabilities.md)
- [TiCDC Server Configurations](/ticdc/ticdc-server-config.md)
- [TiCDC Changefeed Configurations](/ticdc/ticdc-changefeed-config.md)
2 changes: 1 addition & 1 deletion br/backup-and-restore-overview.md
@@ -117,7 +117,7 @@ Backup and restore might go wrong when some TiDB features are enabled or disable
| New collation | [#352](https://github.com/pingcap/br/issues/352) | Make sure that the value of the `new_collation_enabled` variable in the `mysql.tidb` table during restore is consistent with that during backup. Otherwise, inconsistent data index might occur and checksum might fail to pass. For more information, see [FAQ - Why does BR report `new_collations_enabled_on_first_bootstrap` mismatch?](/faq/backup-and-restore-faq.md#why-is-new_collation_enabled-mismatch-reported-during-restore). |
| Global temporary tables | | Make sure that you are using v5.3.0 or a later version of BR to back up and restore data. Otherwise, an error occurs in the definition of the backed global temporary tables. |
| TiDB Lightning Physical Import| | If the upstream database uses the physical import mode of TiDB Lightning, data cannot be backed up in log backup. It is recommended to perform a full backup after the data import. For more information, see [When the upstream database imports data using TiDB Lightning in the physical import mode, the log backup feature becomes unavailable. Why?](/faq/backup-and-restore-faq.md#when-the-upstream-database-imports-data-using-tidb-lightning-in-the-physical-import-mode-the-log-backup-feature-becomes-unavailable-why).|
| TiCDC | | BR v8.2.0 and later: if the target cluster to be restored has a changefeed and the changefeed [CheckpointTS](/ticdc/ticdc-architecture.md#checkpointts) is earlier than the BackupTS, BR does not perform the restoration. BR versions before v8.2.0: if the target cluster to be restored has any active TiCDC changefeeds, BR does not perform the restoration. |
| TiCDC | | BR v8.2.0 and later: if the target cluster to be restored has a changefeed and the changefeed [CheckpointTS](/ticdc/ticdc-classic-architecture.md#checkpointts) is earlier than the BackupTS, BR does not perform the restoration. BR versions before v8.2.0: if the target cluster to be restored has any active TiCDC changefeeds, BR does not perform the restoration. |
| Vector search | | Make sure that you are using v8.4.0 or a later version of BR to back up and restore data. Restoring tables with [vector data types](/vector-search/vector-search-data-types.md) to TiDB clusters earlier than v8.4.0 is not supported. |

### Version compatibility
Binary file added media/ticdc/ticdc-new-arch-1.png
Binary file added media/ticdc/ticdc-new-arch-2.png
Binary file added media/ticdc/ticdc-new-arch-import-grafana.png
Binary file added media/ticdc/ticdc-new-arch-metric-log-puller.png
Binary file added media/ticdc/ticdc-new-arch-metric-server.png
Binary file added media/ticdc/ticdc-new-arch-metric-sink.png
Binary file added media/ticdc/ticdc-new-arch-metric-summary.png
4 changes: 2 additions & 2 deletions releases/release-8.2.0.md
@@ -165,7 +165,7 @@

* When using [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) to import a CSV file, if you specify the `SPLIT_FILE` parameter to split a large CSV file into multiple small CSV files to improve concurrency and import performance, you need to explicitly specify the line terminator `LINES_TERMINATED_BY`. The values can be `\r`, `\n` or `\r\n`. Failure to specify a line terminator might result in an exception when parsing the CSV file data. [#37338](https://github.com/pingcap/tidb/issues/37338) @[lance6716](https://github.com/lance6716)

* Before BR v8.2.0, performing [BR data restore](/br/backup-and-restore-overview.md) on a cluster with TiCDC replication tasks is not supported. Starting from v8.2.0, BR relaxes the restrictions on data restoration for TiCDC: if the BackupTS (the backup time) of the data to be restored is earlier than the changefeed [`CheckpointTS`](/ticdc/ticdc-architecture.md#checkpointts) (the timestamp that indicates the current replication progress), BR can proceed with the data restore normally. Considering that `BackupTS` is usually much earlier, it can be assumed that in most scenarios, BR supports restoring data for a cluster with TiCDC replication tasks. [#53131](https://github.com/pingcap/tidb/issues/53131) @[YuJuncen](https://github.com/YuJuncen)
* Before BR v8.2.0, performing [BR data restore](/br/backup-and-restore-overview.md) on a cluster with TiCDC replication tasks is not supported. Starting from v8.2.0, BR relaxes the restrictions on data restoration for TiCDC: if the BackupTS (the backup time) of the data to be restored is earlier than the changefeed [`CheckpointTS`](/ticdc/ticdc-classic-architecture.md#checkpointts) (the timestamp that indicates the current replication progress), BR can proceed with the data restore normally. Considering that `BackupTS` is usually much earlier, it can be assumed that in most scenarios, BR supports restoring data for a cluster with TiCDC replication tasks. [#53131](https://github.com/pingcap/tidb/issues/53131) @[YuJuncen](https://github.com/YuJuncen)

[vale] reported by reviewdog on releases/release-8.2.0.md, line 168, column 538 (severity: INFO): [PingCAP.Ambiguous] Consider using a clearer word than 'much' because it may cause confusion.

### MySQL compatibility

@@ -263,7 +263,7 @@
+ Backup & Restore (BR)

- Optimize the backup feature, improving backup performance and stability during node restarts, cluster scaling-out, and network jitter when backing up large numbers of tables [#52534](https://github.com/pingcap/tidb/issues/52534) @[3pointer](https://github.com/3pointer)
- Implement fine-grained checks of TiCDC changefeed during data restore. If the changefeed [`CheckpointTS`](/ticdc/ticdc-architecture.md#checkpointts) is later than the data backup time, the restore operations are not affected, thereby reducing unnecessary wait times and improving user experience [#53131](https://github.com/pingcap/tidb/issues/53131) @[YuJuncen](https://github.com/YuJuncen)
- Implement fine-grained checks of TiCDC changefeed during data restore. If the changefeed [`CheckpointTS`](/ticdc/ticdc-classic-architecture.md#checkpointts) is later than the data backup time, the restore operations are not affected, thereby reducing unnecessary wait times and improving user experience [#53131](https://github.com/pingcap/tidb/issues/53131) @[YuJuncen](https://github.com/YuJuncen)
- Add several commonly used parameters to the [`BACKUP`](/sql-statements/sql-statement-backup.md) statement and the [`RESTORE`](/sql-statements/sql-statement-restore.md) statement, such as `CHECKSUM_CONCURRENCY` [#53040](https://github.com/pingcap/tidb/issues/53040) @[RidRisR](https://github.com/RidRisR)
- Except for the `br log restore` subcommand, all other `br log` subcommands support skipping the loading of the TiDB `domain` data structure to reduce memory consumption [#52088](https://github.com/pingcap/tidb/issues/52088) @[Leavrth](https://github.com/Leavrth)
- Support encryption of temporary files generated during log backup [#15083](https://github.com/tikv/tikv/issues/15083) @[YuJuncen](https://github.com/YuJuncen)
122 changes: 116 additions & 6 deletions ticdc/monitor-ticdc.md
@@ -5,17 +5,127 @@ summary: Learn some key metrics displayed on the Grafana TiCDC dashboard.

# TiCDC Monitoring Metrics Details

If you use TiUP to deploy the TiDB cluster, you can see a sub-dashboard for TiCDC in the monitoring system which is deployed at the same time. You can get an overview of TiCDC's current status from the TiCDC dashboard, where the key metrics are displayed. This document provides a detailed description of these key metrics.
You can get an overview of TiCDC's current status from the TiCDC dashboard, where the key metrics are displayed. This document provides a detailed description of these key metrics.

The metric description in this document is based on the following replication task example, which replicates data to MySQL using the default configuration.

```shell
cdc cli changefeed create --server=http://10.0.10.25:8300 --sink-uri="mysql://root:123456@127.0.0.1:3306/" --changefeed-id="simple-replication-task"
```

The TiCDC dashboard contains four monitoring panels. See the following screenshot:
## Metrics for TiCDC in the new architecture

![TiCDC Dashboard - Overview](/media/ticdc/ticdc-dashboard-overview.png)
The monitoring dashboard **TiCDC-New-Arch** for [TiCDC New Architecture](/ticdc/ticdc-architecture.md) is not yet integrated into TiUP. To view the relevant monitoring information on Grafana, you need to manually import the TiCDC monitoring metrics file:

1. Download the monitoring metrics file for TiCDC in the new architecture:

```shell
wget https://raw.githubusercontent.com/pingcap/ticdc/refs/heads/release-8.5/metrics/grafana/ticdc_new_arch.json
```

2. Import the downloaded metrics file on the Grafana page.

![Import Metrics File](/media/ticdc/ticdc-new-arch-import-grafana.png)
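As an alternative to importing through the Grafana page, the downloaded file could be pushed through Grafana's HTTP API. The following is a minimal sketch, not a verified procedure for this dashboard: it assumes the `POST /api/dashboards/db` endpoint accepts the file as-is, and the variables `GRAFANA_URL`, `GRAFANA_USER`, and `GRAFANA_PASS` as well as both function names are hypothetical. If the file carries a non-null `id` or datasource placeholders, Grafana might reject it, and the page import shown above remains the safer route.

```shell
#!/usr/bin/env bash
# Sketch only: wrap a dashboard JSON file in the envelope that
# Grafana's POST /api/dashboards/db endpoint expects.
# Function and variable names here are illustrative assumptions.

# build_import_payload <dashboard.json>
# Prints {"dashboard": <file content>, "overwrite": true}.
build_import_payload() {
  printf '{"dashboard": %s, "overwrite": true}\n' "$(cat "$1")"
}

# import_dashboard <dashboard.json>
# Sends the payload to Grafana using basic authentication.
# Expects GRAFANA_URL, GRAFANA_USER, and GRAFANA_PASS to be set by the caller.
import_dashboard() {
  build_import_payload "$1" |
    curl -sS -X POST \
      -H "Content-Type: application/json" \
      -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
      --data-binary @- \
      "${GRAFANA_URL}/api/dashboards/db"
}

# Usage (not executed here):
#   GRAFANA_URL=http://127.0.0.1:3000 GRAFANA_USER=... GRAFANA_PASS=... \
#     import_dashboard ticdc_new_arch.json
```

The payload builder is kept separate from the network call so that the JSON envelope can be inspected before anything is sent to Grafana.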

The monitoring dashboard for the TiCDC new architecture includes the following sections:

- [**Summary**](#summary): The summary information of the TiCDC cluster
- [**Server**](#server): The summary information of TiKV nodes and TiCDC nodes in the TiDB cluster
- [**Log Puller**](#log-puller): The detailed information of the TiCDC Log Puller module
- [**Event Store**](#event-store): The detailed information of the TiCDC Event Store module
- [**Sink**](#sink): The detailed information of the TiCDC Sink module

### Summary

The following is an example of the **Summary** panel:

![Summary](/media/ticdc/ticdc-new-arch-metric-summary.png)

The description of each metric in the **Summary** panel is as follows:

- Changefeed Checkpoint Lag: The lag of a replication task between downstream and upstream
- Changefeed ResolvedTs Lag: The lag between the internal processing progress of TiCDC nodes and the upstream database
- Upstream Write Bytes/s: The write throughput of the upstream database
- TiCDC Input Bytes/s: The amount of data that TiCDC receives from the upstream per second
- Sink Event Row Count/s: The number of rows that TiCDC writes to the downstream per second
- Sink Write Bytes/s: The amount of data that TiCDC writes to the downstream per second
- The Status of Changefeeds: The status of each changefeed
- Table Dispatcher Count: The number of dispatchers corresponding to each changefeed
- Memory Quota: The memory quota and usage of the Event Collector; excessive usage might cause throttling
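The two lag metrics above are differences between timestamps. As a rough illustration of what a checkpoint lag means, the following sketch converts a TSO into wall-clock milliseconds and subtracts it from the current time. It assumes the usual TiDB TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits); the function names and the sample TSO value are made up for the example, and the dashboard computes the real metric for you.

```shell
#!/usr/bin/env bash
# Sketch only: estimate a checkpoint lag from a TSO value.
# Assumption: TSO = (physical milliseconds << 18) | logical counter.

# tso_physical_ms <tso>
# Drops the 18 logical bits and prints the physical time in milliseconds.
tso_physical_ms() {
  echo $(( $1 >> 18 ))
}

# checkpoint_lag_ms <checkpoint_tso> <now_ms>
# Prints how far the checkpoint is behind the given current time, in milliseconds.
checkpoint_lag_ms() {
  local physical_ms
  physical_ms=$(tso_physical_ms "$1")
  echo $(( $2 - physical_ms ))
}

# Example with a made-up TSO and a made-up current time: prints 1000
checkpoint_lag_ms 400000000000000000 1525878907250
```

In practice, `now_ms` would come from the upstream clock rather than the local machine, so a locally computed value is only an approximation.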

### Server

The following is an example of the **Server** panel:

![Server](/media/ticdc/ticdc-new-arch-metric-server.png)

The description of each metric in the **Server** panel is as follows:

- Uptime: The time for which TiKV nodes and TiCDC nodes have been running
- Goroutine Count: The number of Goroutines in TiCDC nodes
- Open FD Count: The number of file handles opened by TiCDC nodes
- CPU Usage: The CPU usage of TiCDC nodes
- Memory Usage: The memory usage of TiCDC nodes
- Ownership History: The historical record of Owner nodes in the TiCDC cluster
- PD Leader History: The historical record of PD Leader nodes in the upstream TiDB cluster

### Log Puller

The following is an example of the **Log Puller** panel:

![Log Puller](/media/ticdc/ticdc-new-arch-metric-log-puller.png)

The description of each metric in the **Log Puller** panel is as follows:

- Input Events/s: The number of events that TiCDC receives per second
- Unresolved Region Request Count: The number of Region incremental scan requests that TiCDC has sent but not yet completed
- Region Request Finish Scan Duration: The time consumed by Region incremental scans
- Subscribed Region Count: The total number of subscribed Regions
- Memory Quota: The memory quota and usage of Log Puller; excessive usage might cause throttling
- Resolved Ts Batch Size (Regions): The number of Regions included in a single Resolved Ts event

### Event Store

The following is an example of the **Event Store** panel:

![Event Store](/media/ticdc/ticdc-new-arch-metric-event-store.png)

The description of each metric in the **Event Store** panel is as follows:

- Resolved Ts Lag: The lag between Event Store processing progress and the upstream database
- Register Dispatcher StartTs Lag: The lag between dispatcher registration StartTs and the current time
- Subscriptions Resolved Ts Lag: The lag between subscription processing progress and the upstream database
- Subscriptions Data GC Lag: The lag between subscription data GC progress and the current time
- Input Event Count/s: The number of events that Event Store processes per second
- Input Bytes/s: The amount of data that Event Store processes per second
- Write Requests/s: The number of write requests that Event Store executes per second
- Write Worker Busy Ratio: The ratio of IO time to total runtime for Event Store write threads
- Compressed Rows/s: The number of rows compressed per second in Event Store (triggered only when row size exceeds the threshold)
- Write Duration: The time consumed by Event Store write operations
- Write Batch Size: The batch size of a single write operation
- Write Batch Event Count: The number of rows included in a single write batch
- Data Size On Disk: The total data size that Event Store occupies on disk
- Data Size In Memory: The total data size that Event Store occupies in memory
- Scan Requests/s: The number of scan requests that Event Store executes per second
- Scan Bytes/s: The amount of data that Event Store scans per second

### Sink

The following is an example of the **Sink** panel:

![Sink](/media/ticdc/ticdc-new-arch-metric-sink.png)

The description of each metric in the **Sink** panel is as follows:

- Output Row Batch Count: The average number of rows per DML batch written by the Sink module
- Output Row Count(per second): The number of DML rows written to downstream per second
- Output DDL Executing Duration: The time consumed by executing DDL events for the changefeed on the current node
- Sink Error Count / m: The number of error messages reported per minute by the Sink module
- Output DDL Count / Minutes: The number of DDLs executed per minute for the changefeed on the current node

## Metrics for TiCDC in the classic architecture

If you use TiUP to deploy the TiDB cluster, the monitoring system that is deployed along with it includes a sub-dashboard for TiCDC.

The description of each panel is as follows:

@@ -24,7 +134,7 @@ The description of each panel is as follows:
- [**Events**](#events): The detailed information about the data flow within the TiCDC cluster
- [**TiKV**](#tikv): TiKV information related to TiCDC

## Server
### Server

The following is an example of the **Server** panel:

@@ -40,7 +150,7 @@ The description of each metric in the **Server** panel is as follows:
- CPU usage: The CPU usage of TiCDC nodes
- Memory usage: The memory usage of TiCDC nodes

## Changefeed
### Changefeed

The following is an example of the **Changefeed** panel:

@@ -102,7 +212,7 @@ The description of each metric in the **Events** panel is as follows:
- KV client dispatch events/s: The number of events that the KV client module dispatches among the TiCDC nodes
- KV client batch resolved size: The batch size of resolved timestamp messages that TiKV sends to TiCDC

## TiKV
### TiKV

The following is an example of the **TiKV** panel:
