*: add column-level masking policy feature document#21454

Open
tiancaiamao wants to merge 6 commits into master from column-masking-policy

@tiancaiamao (Contributor) commented Mar 23, 2026

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions (in Chinese).

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot added the labels missing-translation-status (This PR does not have translation status info.) and size/XS (Denotes a PR that changes 0-9 lines, ignoring generated files.) on Mar 23, 2026
@ti-chi-bot added the label size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.) and removed the label size/XS (Denotes a PR that changes 0-9 lines, ignoring generated files.) on Mar 23, 2026
@hfxsd added the label translation/doing (This PR's assignee is translating this PR.) and removed the label missing-translation-status (This PR does not have translation status info.) on Mar 24, 2026
@qiancai (Collaborator) commented Mar 24, 2026

@tiancaiamao Please involve a tech reviewer for this PR. Thanks.

@qiancai added the label v9.0-beta.3 (This PR/issue applies to TiDB v9.0-beta.3.) on Mar 24, 2026
Comment thread column-level-masking-policy.md Outdated
@qiancai self-assigned this Apr 17, 2026
@tiancaiamao requested a review from bb7133 May 9, 2026 02:56
@bb7133 (Member) commented May 9, 2026

Please address the comments~

@ti-chi-bot commented May 9, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from qiancai. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

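Column-level masking policy is a security feature that lets you apply masking rules at the column level to protect sensitive data. When a masking policy is applied to a column, TiDB automatically masks the data returned to users according to the defined rules, while the original data in storage remains unchanged.

This feature is particularly useful for meeting compliance requirements such as PCI-DSS (Payment Card Industry Data Security Standard) and data privacy regulations such as GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act), which require strict control over who can view sensitive information such as credit card numbers, personal identifiers, and other confidential information.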
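Collaborator

Suggested change
This feature is particularly useful for meeting compliance requirements such as PCI-DSS (Payment Card Industry Data Security Standard) and data privacy regulations such as GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act), which require strict control over who can view sensitive information such as credit card numbers, personal identifiers, and other confidential information
Column-level masking policies suit scenarios that need to restrict the visibility of sensitive data, for example, controlling the scope of access to sensitive information such as credit card numbers, ID card numbers, phone numbers, email addresses, and dates of birth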

Contributor Author

From the perspective of describing usage scenarios, the original description is probably more accurate and easier to understand.
The description in the suggested change is more general, but it actually makes it harder for users to grasp the scenario directly.

Co-authored-by: Grace Cai <qqzczy@126.com>
@ti-chi-bot commented May 13, 2026

@tiancaiamao: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | 7d9bc9c | link | true | `/test pull-verify` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


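- `table_name`: The name of the table that contains the column to be masked
- `column_name`: The name of the column to which the masking policy applies
- `masking_expression`: The SQL expression that defines the masking logic
- `RESTRICT ON`: Optional. Specifies the operations to block for users who cannot access unmasked data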
Contributor Author

In typical usage, masking refers to queries.
For example, we know that select sensitive_col from t where ... is masked.
Less typical usage, however, may involve statements such as
insert into t1 select sensitive_col from t where ...
If a user reads the sensitive column and then writes the data into another table,
should reads from that other table return masked data?
RESTRICT ON decides this behavior by defining the scope that is restricted.
That is why the document later has a section on RESTRICT ON semantics that describes this further.
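
The scoping idea in this comment can be sketched in Python. This is an illustrative model only, not TiDB's implementation: `MaskingPolicy`, `check_statement`, and the `user_can_unmask` flag are hypothetical names, and the only identifiers taken from the document are the operation names such as `INSERT_INTO_SELECT`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    """Hypothetical model of a policy attached to one column."""
    column: str
    restrict_on: frozenset = frozenset()  # operation kinds to block


def check_statement(policy: MaskingPolicy, statement_kind: str,
                    user_can_unmask: bool) -> None:
    """Raise if the statement kind is restricted for this user.

    Assumption: users who may see unmasked data are never restricted.
    """
    if user_can_unmask:
        return
    if statement_kind in policy.restrict_on:
        raise PermissionError(
            f"{statement_kind} is restricted on masked column {policy.column}"
        )


policy = MaskingPolicy("sensitive_col", frozenset({"INSERT_INTO_SELECT"}))

# A plain query is not a restricted operation, so it passes (and would be masked).
check_statement(policy, "SELECT", user_can_unmask=False)

# Copying the column into another table is blocked for the same user.
try:
    check_statement(policy, "INSERT_INTO_SELECT", user_can_unmask=False)
    blocked = False
except PermissionError:
    blocked = True
```

In this model the restriction is evaluated per statement kind, which is one way to read "the scope that is restricted"; the actual semantics are whatever the `RESTRICT ON` section of the document specifies.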

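- `column_name`: The name of the column to which the masking policy applies
- `masking_expression`: The SQL expression that defines the masking logic
- `RESTRICT ON`: Optional. Specifies the operations to block for users who cannot access unmasked data
- `ENABLE | DISABLE`: Optional. Whether the policy is active. Defaults to `ENABLE`.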
Contributor Author

Right. These policies can be created and left unused for a while, so they can be enabled and disabled
instead of being dropped and recreated as a whole.

| `INSERT_INTO_SELECT` | Blocks inserting masked data into another table through `INSERT ... SELECT` |
| `UPDATE_SELECT` | Blocks updates that use masked data through `UPDATE ... SET = (SELECT ...)` |
| `DELETE_SELECT` | Blocks deletes based on masked data through `DELETE ... WHERE ... IN (SELECT ...)` |
| `CTAS` | Blocks Create Table As Select using masked data |
Contributor Author

@qiancai Should we add a special note here that CTAS is actually not implemented, or rather is reserved for future use?
The reason is not that column masking skipped the implementation.
Rather, our DDL itself does not support the create table as select usage.
It has to be done in two steps: create table like + insert into select.
Because create table as select is not implemented, column masking has no actual CTAS implementation either.

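1. **Storage is unchanged**: The original data is stored unmodified
2. **Query processing uses original values**: Operations such as `JOIN`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY` all use the original values
3. **Only output is masked**: Data returned to the client is masked according to the policy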

Contributor Author

Masking is done at the outermost stage.
Anything that is not the outermost stage still uses the original values.
In other words, the masking expression should always apply to the very outermost layer.
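
The three rules and the "outermost stage" point can be illustrated with a small Python model. It is a sketch under stated assumptions, not how TiDB evaluates queries: `Row`, `mask_card`, `query`, and the `can_see_unmasked` flag are all hypothetical, and the masking expression (keep the last four digits) is just an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    name: str
    card_no: str


def mask_card(card_no: str) -> str:
    """Hypothetical masking expression: keep only the last four digits."""
    return "*" * (len(card_no) - 4) + card_no[-4:]


# Rule 1: storage holds the original, unmodified values.
STORED = [Row("alice", "4111111111111111"), Row("bob", "5500000000000004")]


def query(rows, predicate, can_see_unmasked: bool):
    # Rule 2: filtering and ordering run on the original values.
    matched = sorted((r for r in rows if predicate(r)), key=lambda r: r.card_no)
    # Rule 3: masking is applied only when building the final output.
    if can_see_unmasked:
        return [(r.name, r.card_no) for r in matched]
    return [(r.name, mask_card(r.card_no)) for r in matched]


# The predicate matches on the raw card number even though the caller
# only ever sees the masked form in the result.
result = query(STORED, lambda r: r.card_no.startswith("4111"),
               can_see_unmasked=False)
```

Here `result` contains one row for alice with a masked card number, while `STORED` is untouched, which mirrors the claim that only the outermost output is masked.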

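- Masking policies on temporary tables
- Masking policies on system tables
- Modifying the column type or length while a masking policy is active (drop the policy first)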

Contributor Author

The behavior should be that masking happens at the final output given to the user,
rather than the view being applied on top of already-masked results.

Comment on lines +548 to +552
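When you replicate data using tools such as BR (Backup & Restore) or TiCDC:

1. Masking policy DDL statements are replicated
2. User and role definitions must be created separately on the target cluster
3. The target cluster must have the same users/roles for masking to work correctly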
Contributor Author

BR/TiCDC versions are also involved here.
That is, TiDB, BR, and TiCDC must all be versions that support this feature for it to work.

Labels

size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. translation/doing This PR’s assignee is translating this PR. v9.0-beta.3 This PR/issue applies to TiDB v9.0-beta.3.

5 participants